| transform.LatentDirichletAllocation {hana.ml.r} | R Documentation |
Like other predict methods, this function computes predicted values (the topic distribution of each input document) from a fitted "LatentDirichletAllocation" object.
## S3 method for class 'LatentDirichletAllocation'
transform(model, data, key, document = NULL, burn.in = NULL,
          iteration = NULL, thin = NULL, seed = NULL, gibbs.init = NULL,
          delimiters = NULL, output.word.assignment = NULL)
model |
|
data |
|
key |
|
document |
|
burn.in |
|
iteration |
|
thin |
|
seed |
Indicates the seed used to initialize the random number generator. |
gibbs.init |
|
delimiters |
|
output.word.assignment |
|
Predicted values are returned as a list of DataFrames, structured as follows:
DataFrame 1:
Document-topic distribution table, structured as follows:
Document ID column: same name and type as the document ID column of data.
TOPIC_ID: type INTEGER, topic ID.
PROBABILITY: type DOUBLE, probability of topic given document.
DataFrame 2:
Word-topic assignment table, structured as follows:
Document ID column: same name and type as the document ID column of data.
WORD_ID: type INTEGER, word ID.
TOPIC_ID: type INTEGER, topic ID.
DataFrame 3:
Statistics table, structured as follows:
STAT_NAME: type NVARCHAR(256), statistic name.
STAT_VALUE: type NVARCHAR(1000), statistic value.
hanaml.LatentDirichletAllocation
## Not run:
Run the prediction on DataFrame data1 using the "LatentDirichletAllocation" object LDA:
> data1$Collect()
DOCUMENT_ID TEXT
1 10 toy toy spoon cpu
> result <- transform(LDA, data1, key = "DOCUMENT_ID",
document = "TEXT", burn.in = 2000,
iteration = 1000, thin = 100,
seed = 1, output.word.assignment = TRUE)
> result[[1]]$Collect()
DOCUMENT_ID TOPIC_ID PROBABILITY
1 10 0 0.23913043478260873
2 10 1 0.4565217391304348
3 10 2 0.02173913043478261
4 10 3 0.02173913043478261
5 10 4 0.23913043478260873
6 10 5 0.02173913043478261
## End(Not run)
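The document-topic table shown in the example can be post-processed locally once it has been collected. The following is a minimal sketch in base R, assuming the result of `result[[1]]$Collect()` is an ordinary data.frame with the columns listed above. The values are copied from the example output, and `doc_topic`, `prob_sums`, and `top_topic` are illustrative names, not part of hana.ml.r.

```r
# Local copy of the collected document-topic table from the example.
# In practice this would be the data.frame returned by result[[1]]$Collect().
doc_topic <- data.frame(
  DOCUMENT_ID = rep(10L, 6),
  TOPIC_ID    = 0:5,
  PROBABILITY = c(0.23913043478260873, 0.4565217391304348,
                  0.02173913043478261, 0.02173913043478261,
                  0.23913043478260873, 0.02173913043478261)
)

# Sanity check: the topic probabilities of each document should sum to 1
# (up to floating-point rounding).
prob_sums <- tapply(doc_topic$PROBABILITY, doc_topic$DOCUMENT_ID, sum)

# Keep, for each document, the row with the highest topic probability.
top_topic <- do.call(
  rbind,
  lapply(split(doc_topic, doc_topic$DOCUMENT_ID),
         function(d) d[which.max(d$PROBABILITY), ])
)

print(prob_sums)
print(top_topic)
```

For the single document in the example, the selected row is the one with TOPIC_ID 1. Note that `which.max` returns the first maximum, so ties between topics are resolved in favor of the lower row position.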