hanaml.Get.Relevant.Doc is an R wrapper for the SAP HANA PAL get relevant doc algorithm.

hanaml.Get.Relevant.Doc(
  pred.data,
  ref.data = NULL,
  top = NULL,
  threshold = NULL
)

Arguments

pred.data

DataFrame
The prediction data: the query text for which relevant documents are retrieved (a single CONTENT column in the example below).

ref.data

DataFrame, optional
The reference data: the documents to search (ID, CONTENT, and CATEGORY columns in the example below).

top

integer, optional
Show only the top N results. If 0, all results are shown.

threshold

double, optional
Only results whose score is greater than this value are included in the result table.
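
The effect of top and threshold can be illustrated locally, without a HANA connection. The following is a minimal sketch in base R, not the PAL implementation: filter_relevant is a hypothetical helper that does not exist in hana.ml.r, the scores are copied from the Examples section of this page, and both the treatment of NULL as "no filter" and the order in which the two filters are applied are assumptions.

```r
# Scores copied from the Examples section of this page.
scores <- data.frame(
  ID    = c("doc1", "doc2", "doc3", "doc4"),
  SCORE = c(0.774597, 0.516398, 0.258199, 0.258199),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of hana.ml.r): mimics the documented meaning
# of `top` and `threshold` on a local data.frame.
filter_relevant <- function(scores, top = NULL, threshold = NULL) {
  # Sort by descending score; ties keep their original order.
  out <- scores[order(-scores$SCORE), , drop = FALSE]
  # Assumption: NULL means no threshold is applied.
  if (!is.null(threshold)) {
    out <- out[out$SCORE > threshold, , drop = FALSE]
  }
  # top = 0 keeps everything, as described for the `top` argument.
  if (!is.null(top) && top > 0) {
    out <- head(out, top)
  }
  rownames(out) <- NULL
  out
}

top2     <- filter_relevant(scores, top = 2)          # two best matches
above    <- filter_relevant(scores, threshold = 0.5)  # scores strictly above 0.5
all_rows <- filter_relevant(scores, top = 0)          # no limit on row count
```

With these inputs, top = 2 and threshold = 0.5 both keep doc1 and doc2, while top = 0 keeps all four rows.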

Value

DataFrame
DataFrame containing the relevant documents and their scores (ID and SCORE columns in the example below).

Examples

Input DataFrame data:

> ref_df$Collect()
         ID                                                  CONTENT       CATEGORY
   0   doc1                      term1 term2 term2 term3 term3 term3     CATEGORY_1
   1   doc2                      term2 term3 term3 term4 term4 term4     CATEGORY_1
   2   doc3                      term3 term4 term4 term5 term5 term5     CATEGORY_2
   3   doc4    term3 term4 term4 term5 term5 term5 term5 term5 term5     CATEGORY_2
   4   doc5                                              term4 term6     CATEGORY_3
   5   doc6                                  term4 term6 term6 term6     CATEGORY_3
> pred_df$Collect()
      CONTENT
  0     term3

Call the function:

> result <- hanaml.Get.Relevant.Doc(pred_df, ref_df)

Output:

> result$Collect()
         ID       SCORE
  0    doc1    0.774597
  1    doc2    0.516398
  2    doc3    0.258199
  3    doc4    0.258199