transform.LatentDirichletAllocation {hana.ml.r} | R Documentation
Like other predict methods, this function computes predicted values for new data from a fitted "LatentDirichletAllocation" object.
## S3 method for class 'LatentDirichletAllocation'
transform(model, data, key, document = NULL, burn.in = NULL,
          iteration = NULL, thin = NULL, seed = NULL, gibbs.init = NULL,
          delimiters = NULL, output.word.assignment = NULL)
model |
|
data |
|
key |
|
document |
|
burn.in |
|
iteration |
|
thin |
|
seed |
Indicates the seed used to initialize the random number generator. |
gibbs.init |
|
delimiters |
|
output.word.assignment |
|
Predicted values are returned as a list of DataFrames, structured as follows:
DataFrame 1:
Document-topic distribution table, structured as follows:
Document ID column
: same name and type as the document ID column of data.
TOPIC_ID
: type INTEGER, topic ID.
PROBABILITY
: type DOUBLE, probability of topic given document.
DataFrame 2:
Word-topic assignment table, structured as follows:
Document ID column
: same name and type as the document ID column of data.
WORD_ID
: type INTEGER, word ID.
TOPIC_ID
: type INTEGER, topic ID.
DataFrame 3:
Statistics table, structured as follows:
STAT_NAME
: type NVARCHAR(256), statistic name.
STAT_VALUE
: type NVARCHAR(1000), statistic value.
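The document-topic table is in long format, with one row per document and topic pair. The sketch below shows one way to pivot such a table into a documents-by-topics matrix once it has been collected into a local data.frame. It uses base R only; the data.frame `doc.topic` and its values are hypothetical and chosen purely for illustration, and no HANA connection is involved.

```r
# Hypothetical local copy of DataFrame 1 (values invented for illustration only).
doc.topic <- data.frame(
  DOCUMENT_ID = c(10, 10, 10, 20, 20, 20),
  TOPIC_ID    = c(0, 1, 2, 0, 1, 2),
  PROBABILITY = c(0.2, 0.7, 0.1, 0.5, 0.25, 0.25)
)

# Pivot the long table into a documents-by-topics matrix.
wide <- xtabs(PROBABILITY ~ DOCUMENT_ID + TOPIC_ID, data = doc.topic)

# Each row holds one document's topic distribution, so it should sum to 1.
row.totals <- rowSums(wide)

print(wide)
print(row.totals)
```

A wide layout like this can be convenient when comparing topic distributions across documents, while the long layout returned by the function is easier to filter or join by TOPIC_ID.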
hanaml.LatentDirichletAllocation
## Not run:
Perform prediction on DataFrame data1 using the "LatentDirichletAllocation" object LDA:

> data1$Collect()
  DOCUMENT_ID              TEXT
1          10 toy toy spoon cpu

> result <- transform(LDA, data1, key = "DOCUMENT_ID", document = "TEXT",
                      burn.in = 2000, iteration = 1000, thin = 100,
                      seed = 1, output.word.assignment = TRUE)

> result[[1]]$Collect()
  DOCUMENT_ID TOPIC_ID           PROBABILITY
1          10        0   0.23913043478260873
2          10        1    0.4565217391304348
3          10        2   0.02173913043478261
4          10        3   0.02173913043478261
5          10        4   0.23913043478260873
6          10        5   0.02173913043478261
## End(Not run)
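A common follow-up step is to pick the most probable topic for a document. The sketch below does this in base R on a local data.frame that reproduces the collected output shown in the example above. The name `doc.topic` is an assumption made for this illustration; in practice the table would come from `result[[1]]$Collect()`.

```r
# Local copy of the collected document-topic table from the example output.
doc.topic <- data.frame(
  DOCUMENT_ID = rep(10L, 6),
  TOPIC_ID    = 0:5,
  PROBABILITY = c(0.23913043478260873, 0.4565217391304348,
                  0.02173913043478261, 0.02173913043478261,
                  0.23913043478260873, 0.02173913043478261)
)

# Select the row with the highest probability: the most likely topic.
top <- doc.topic[which.max(doc.topic$PROBABILITY), ]

print(top$TOPIC_ID)  # prints [1] 1
```

Note that `which.max` returns only the first maximum, so ties between topics are resolved in favor of the lowest row index. With several documents in the table, the same selection would need to be applied per DOCUMENT_ID.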