hanaml.NaiveBayes {hana.ml.r} | R Documentation |
hanaml.NaiveBayes is an R wrapper for PAL Naive Bayes.
hanaml.NaiveBayes(conn.context, data = NULL, key = NULL, features = NULL, formula = NULL, label = NULL, alpha = NULL, discretization = NULL, model.format = NULL, categorical.variable = NULL, thread.ratio = NULL)
conn.context |
|
data |
|
key |
|
features |
|
formula |
|
label |
|
alpha |
|
discretization |
|
model.format |
|
categorical.variable |
|
thread.ratio |
|
R6Class object.
Naive Bayes is a classification algorithm based on Bayes' theorem. It estimates the class-conditional probability by assuming that the attributes are conditionally independent of one another given the class.
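To illustrate the independence assumption, the following base-R sketch scores one record by multiplying the class prior with per-feature conditional frequencies. It is not the PAL implementation: the helper name nb_posterior and the toy data are hypothetical, only categorical features are handled, and no smoothing is applied.

```r
# Toy training data (hypothetical, categorical features only).
train <- data.frame(
  HOMEOWNER     = c("YES", "NO", "NO", "YES", "NO", "NO"),
  MARITALSTATUS = c("Single", "Married", "Single", "Married", "Divorced", "Single"),
  DEFAULTED     = c("NO", "NO", "NO", "NO", "YES", "YES"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: posterior P(class | record) under the naive
# (conditional independence) assumption, using raw relative frequencies.
nb_posterior <- function(train, label, record) {
  classes <- unique(train[[label]])
  scores <- sapply(classes, function(cl) {
    rows <- train[train[[label]] == cl, , drop = FALSE]
    prior <- nrow(rows) / nrow(train)
    # Multiply the per-feature conditional probabilities P(x_j | class).
    lik <- prod(sapply(names(record), function(f) {
      mean(rows[[f]] == record[[f]])
    }))
    prior * lik
  })
  names(scores) <- classes
  scores / sum(scores)  # normalize so the posteriors sum to 1
}

post <- nb_posterior(train, "DEFAULTED",
                     list(HOMEOWNER = "NO", MARITALSTATUS = "Single"))
print(post)
```

Because each feature contributes an independent factor, a single conditional frequency of zero would drive the whole product for that class to zero, which is the situation the alpha parameter is meant to address.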
Returns a "NaiveBayes" object with the following values:
model: DataFrame
Naive Bayes model information.
statistics: DataFrame
Statistics information.
The Laplace value (alpha) is stored only in JSON-format models. If the PMML format is chosen, you may need to set the Laplace value (alpha) again in predict() and score().
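As a rough illustration of what the Laplace value does, the sketch below applies additive smoothing to category counts in base R. The formula (count + alpha) / (n + alpha * k) is the textbook form of Laplace smoothing and is an assumption here; the exact computation inside PAL may differ. The function name smoothed_prob and the counts are hypothetical.

```r
# Hypothetical helper: additive (Laplace) smoothing of category counts.
# counts: named vector of per-category counts within one class
# alpha:  smoothing value; 0 means plain relative frequencies
smoothed_prob <- function(counts, alpha = 0) {
  k <- length(counts)  # number of distinct categories
  (counts + alpha) / (sum(counts) + alpha * k)
}

# "Married" was never observed for this class in the toy counts.
counts <- c(Single = 2, Married = 0, Divorced = 1)

p0 <- smoothed_prob(counts, alpha = 0)  # unsmoothed estimates
p1 <- smoothed_prob(counts, alpha = 1)  # smoothed estimates
print(p0)
print(p1)
```

With alpha = 0 the unseen category receives probability 0, while alpha = 1 gives it a small nonzero share. This is why a model scored without the original alpha can produce different predictions from the one that was trained.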
## Not run:
Input DataFrame df for training the model:

> df$collect()
  ID HOMEOWNER MARITALSTATUS ANNUALINCOME DEFAULTEDBORROWER
   0       YES        Single        125.0                NO
   1        NO       Married        100.0                NO
   2        NO        Single         70.0                NO
   3       YES       Married        120.0                NO
   4        NO      Divorced         95.0               YES
   5        NO       Married         60.0                NO
   6       YES      Divorced        220.0                NO
   7        NO        Single         85.0               YES
   8        NO       Married         75.0                NO
   9        NO        Single         90.0               YES

Training the model:

> nb <- hanaml.NaiveBayes(conn.context = conn,
                          data = df,
                          alpha = 1.0,
                          model.format = "pmml",
                          thread.ratio = 0.2,
                          features = list('HOMEOWNER', 'MARITALSTATUS', 'ANNUALINCOME'),
                          label = "DEFAULTEDBORROWER")

The mean accuracy on the given test data and labels can be calculated with the score function:

> nb$score(nb, df1, "ID", alpha = 1.0, verbose = TRUE)

Output: {0.875} Double value - Mean accuracy on the given test data and labels.

## End(Not run)