hanaml.Text.Collector is an R wrapper for the SAP HANA PAL text collector algorithm.

hanaml.Text.Collector(data)

Arguments

data

DataFrame
Data to be analyzed.

Value

A list of two DataFrames:

  • DataFrame 1: Inverse document frequency (IDF) value of each term in the documents.

  • DataFrame 2: Extended table.

Examples

Input DataFrame data:

> data$Collect()
       ID	                            CONTENT
  0	doc1	term1 term2 term2 term3 term3 term3
  1	doc2	term2 term3 term3 term4 term4 term4
  2	doc3	term3 term4 term4 term5 term5 term5
  3	doc5	term3 term4 term4 term5 term5 term5 term5 term5 term5
  4	doc4	term4 term6
  5	doc6	term4 term6 term6 term6

Call the function:

> result <- hanaml.Text.Collector(data)

Output:

> result[[1]]$Collect()
     TM_TERMS	TM_TERM_IDF_VALUE
  0	   term1	         1.791759
  1	   term2	         1.098612
  2	   term3	         0.405465
  3	   term4	         0.182322
  4	   term5	         1.098612
  5	   term6	         1.098612
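
The TM_TERM_IDF_VALUE figures above are consistent with the common definition IDF = ln(N / df), where N is the number of documents and df is the number of documents that contain the term. The following is a minimal sketch in base R that recomputes the values locally from the example input under that assumption. It does not call SAP HANA, the formula is inferred from the output rather than stated by this page, and the variable names (docs, terms_per_doc, doc_freq, idf) are illustrative only.

```r
# Example corpus copied from the input DataFrame shown above.
docs <- c(
  doc1 = "term1 term2 term2 term3 term3 term3",
  doc2 = "term2 term3 term3 term4 term4 term4",
  doc3 = "term3 term4 term4 term5 term5 term5",
  doc5 = "term3 term4 term4 term5 term5 term5 term5 term5 term5",
  doc4 = "term4 term6",
  doc6 = "term4 term6 term6 term6"
)

# Split each document on whitespace and keep each term once per document.
terms_per_doc <- lapply(strsplit(docs, "\\s+"), unique)

# Document frequency: the number of documents that contain each term.
doc_freq <- table(unlist(terms_per_doc))

# Assumed formula (inferred from the output above): IDF = ln(N / df).
idf <- setNames(as.numeric(log(length(docs) / doc_freq)), names(doc_freq))

print(round(idf, 6))
```

Under this assumption, term4 occurs in five of the six documents and therefore receives the lowest value, while term1 occurs in only one document and receives the highest.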