hanaml.Text.TFIDF.Rd
hanaml.Text.TFIDF is an R wrapper for the SAP Text Mining TF-IDF (term frequency-inverse document frequency) algorithm.
hanaml.Text.TFIDF(data, idf = NULL)
DataFrame
Data to be analyzed.
DataFrame, optional
Inverse document frequency of documents.
DataFrame
TF and TF-IDF values of each term in each document.
Input DataFrame data:
> data$Collect()
ID CONTENT
1 doc1 term1 term2 term2 term3 term3 term3
2 doc2 term2 term3 term3 term4 term4 term4
3 doc3 term3 term4 term4 term5 term5 term5
4 doc5 term3 term4 term4 term5 term5 term5 term5 term5 term5
5 doc4 term4 term6
6 doc6 term4 term6 term6 term6
Call the function:
> result <- hanaml.Text.Collector(data)
> tfidf <- hanaml.Text.TFIDF(data, result[[1]])
Output:
> tfidf$Head(3)$Collect()
    ID TERMS TF_VALUE TFIDF_VALUE
1 doc1 term1      1.0    1.791759
2 doc1 term2      2.0    2.197225
3 doc1 term3      3.0    1.216395
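The TFIDF_VALUE figures above can be checked by hand in base R. The sketch below is not the SAP implementation. It assumes that TF is the raw term count and that IDF is the natural logarithm of (number of documents / number of documents containing the term). That formula is inferred from the sample output rather than stated in this page. The variable names (docs, tokens, df, idf, tf, tfidf) are illustrative only.

```r
# Sample corpus copied from the input DataFrame shown above.
docs <- c(
  doc1 = "term1 term2 term2 term3 term3 term3",
  doc2 = "term2 term3 term3 term4 term4 term4",
  doc3 = "term3 term4 term4 term5 term5 term5",
  doc5 = "term3 term4 term4 term5 term5 term5 term5 term5 term5",
  doc4 = "term4 term6",
  doc6 = "term4 term6 term6 term6"
)

# Split each document into terms on single spaces.
tokens <- strsplit(docs, " ", fixed = TRUE)

# Document frequency: number of documents that contain each term.
df <- table(unlist(lapply(tokens, unique)))

# ASSUMPTION: IDF = ln(N / df), inferred from the sample output.
idf <- log(length(docs) / df)

# Raw term counts for doc1, multiplied by the matching IDF values.
tf <- table(tokens[["doc1"]])
tfidf <- as.numeric(tf) * as.numeric(idf[names(tf)])
names(tfidf) <- names(tf)

print(round(tfidf, 6))
```

Under these assumptions the doc1 values round to 1.791759, 2.197225 and 1.216395, which agrees with the first three output rows.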