hanaml.Text.Classification is an R wrapper for the SAP HANA PAL Text Classification algorithm.
hanaml.Text.Classification( pred.data, ref.data = NULL, k.nearest.neighbours = NULL, thread.ratio = NULL )
| pred.data | |
|---|---|
| ref.data | |
| k.nearest.neighbours | |
| thread.ratio | |
A list of DataFrames holding the text classification results:
DataFrame 1: Text classification result
DataFrame 2: Statistics table
Input DataFrame data:
> data$Collect()
  ID   CONTENT                                                 CATEGORY
0 doc1 term1 term2 term2 term3 term3 term3                     CATEGORY_1
1 doc2 term2 term3 term3 term4 term4 term4                     CATEGORY_1
2 doc3 term3 term4 term4 term5 term5 term5                     CATEGORY_2
3 doc4 term3 term4 term4 term5 term5 term5 term5 term5 term5   CATEGORY_2
4 doc5 term4 term6                                             CATEGORY_3
5 doc6 term4 term6 term6 term6                                 CATEGORY_3
Call the function:
> result <- hanaml.Text.Classification(data$Select(data$columns[[1]], data$columns[[2]]), data)
Output:
> result[[1]]$head(1)$Collect()
ID TARGET
0 doc1 CATEGORY_1
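
The k.nearest.neighbours argument suggests a nearest-neighbour vote over the reference documents. The sketch below illustrates that general idea in base R on the sample data, without a HANA connection. It is not PAL's implementation: the raw term counts, the cosine similarity measure, and the helper names term_counts, cosine and knn_classify are all assumptions made for illustration.

```r
# Reference documents copied from the input DataFrame shown above.
ref <- data.frame(
  ID = c("doc1", "doc2", "doc3", "doc4", "doc5", "doc6"),
  CONTENT = c(
    "term1 term2 term2 term3 term3 term3",
    "term2 term3 term3 term4 term4 term4",
    "term3 term4 term4 term5 term5 term5",
    "term3 term4 term4 term5 term5 term5 term5 term5 term5",
    "term4 term6",
    "term4 term6 term6 term6"
  ),
  CATEGORY = c("CATEGORY_1", "CATEGORY_1", "CATEGORY_2",
               "CATEGORY_2", "CATEGORY_3", "CATEGORY_3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count how often each vocabulary term occurs in a text.
# Terms outside the vocabulary are ignored.
term_counts <- function(text, vocab) {
  tokens <- strsplit(text, " ", fixed = TRUE)[[1]]
  as.numeric(table(factor(tokens, levels = vocab)))
}

# Hypothetical helper: cosine similarity of two count vectors (0 if either is empty).
cosine <- function(a, b) {
  denom <- sqrt(sum(a^2)) * sqrt(sum(b^2))
  if (denom == 0) 0 else sum(a * b) / denom
}

# Hypothetical helper: label a text by majority vote among its k most
# similar reference documents. Assumed similarity measure: cosine on raw counts.
knn_classify <- function(text, ref, k = 1) {
  vocab <- sort(unique(unlist(strsplit(ref$CONTENT, " ", fixed = TRUE))))
  query <- term_counts(text, vocab)
  sims <- vapply(ref$CONTENT,
                 function(d) cosine(query, term_counts(d, vocab)),
                 numeric(1))
  nearest <- order(sims, decreasing = TRUE)[seq_len(min(k, nrow(ref)))]
  votes <- table(ref$CATEGORY[nearest])
  names(votes)[which.max(votes)]
}

# Classify the content of doc1 against the reference set.
predicted <- knn_classify("term1 term2 term2 term3 term3 term3", ref, k = 1)
print(predicted)
```

With k = 1 the query text is identical to doc1, so the vote agrees with the TARGET value shown in the output above. Larger values of k trade sensitivity to a single close match for a vote over more neighbours.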