Make Predictions from a "UnifiedClassification" Object

Like other predict methods, this function returns predictions from a fitted "UnifiedClassification" object.

# S3 method for UnifiedClassification
predict(
  model,
  data,
  key,
  features = NULL,
  thread.ratio = NULL,
  verbose = NULL,
  func = NULL,
  multi.class = NULL,
  alpha = NULL,
  block.size = NULL,
  missing.replacement = NULL,
  class.map0 = NULL,
  class.map1 = NULL,
  categorical.variable = NULL,
  attribution.method = NULL,
  top.k.attributions = NULL,
  sample.size = NULL,
  random.state = NULL
)

Arguments

model	`R6Class` A "UnifiedClassification" object for prediction.
data	`DataFrame` DataFrame containing the data.
key	`character` Name of the ID column.
features	`character or list of characters, optional` Name of feature columns for prediction. If not provided, it defaults to all non-key columns of data.
thread.ratio	`double, optional` Controls the proportion of available threads that this function can use. The value ranges from 0 to 1, where 0 indicates a single thread and 1 indicates all available threads. Values between 0 and 1 use up to that percentage of available threads. Values outside the range from 0 to 1 are ignored, and the actual number of threads used is then determined heuristically. Defaults to -1.
verbose	`logical, optional` If TRUE, output all classes and the corresponding confidences for each data point. Defaults to FALSE.
func	`character, optional` The functionality for unified classification model. Mandatory only when the func attribute of model is NULL. Valid values are as follows: "DecisionTree", "RandomDecisionTrees", "HGBT", "LogisticRegression", "NaiveBayes", "SVM", "MLP". Defaults to `model$func`.
multi.class	`logical, optional` If the functionality of the unified classification model is LogisticRegression, this parameter indicates whether the classification model is binary-class or multi-class. Valid only when func is "LogisticRegression".
alpha	`double, optional` Specifies the value of Laplace smoothing. A positive value enables Laplace smoothing for categorical variables, with that value as the smoothing parameter. Set the value to 0 to disable Laplace smoothing. Defaults to the alpha value in the JSON model if there is one, and 0 otherwise.
block.size	`integer, optional` Specifies the number of data records loaded at a time during scoring. 0: loads all data at once; other positive values: loads the specified number of records at a time. Valid only when `func` is "RandomDecisionTrees" (case-insensitive). Defaults to 0.
missing.replacement	`character, optional` Specifies the strategy for replacing missing values in the prediction data. 'feature.marginalized': marginalizes each missing feature out independently; 'instance.marginalized': marginalizes all missing features in an instance as a whole, corresponding to each category. Valid only when `func` is "RandomDecisionTrees" or "HGBT". Defaults to 'feature.marginalized'.
class.map0	`character, optional` Specifies the label value which will be mapped to 0 in logistic regression. Mandatory and valid only for logistic regression models when the label variable is of type VARCHAR or NVARCHAR. Defaults to the value of `class.map0` in the model training phase.
class.map1	`character, optional` Specifies the label value which will be mapped to 1 in logistic regression. Mandatory and valid only for logistic regression models when the label variable is of type VARCHAR or NVARCHAR. Defaults to the value of `class.map1` in the model training phase.
categorical.variable	`character or list of characters, optional` Indicates features that should be treated as categorical variables. By default, columns of type "VARCHAR" and "NVARCHAR" are treated as categorical, and columns of type "INTEGER" and "DOUBLE" as continuous. Valid only for variables of type "INTEGER"; ignored otherwise. Defaults to the value of `categorical.variable` in the model training phase.
attribution.method	`character, optional` Specifies which method to use in model reasoning. "no": no reasoning; "saabas": SAABAS reasoning; "shap": SHAP reasoning. Valid only for tree-based classification models. Defaults to "shap".
top.k.attributions	`integer, optional` Outputs the attributions of the top k features that contribute the most. Defaults to 10.
sample.size	`integer, optional` Specifies the number of sampled combinations of features. If set to 0, the value is determined heuristically by the algorithm. Valid only when the trained classification model is Naive Bayes, Support Vector Machine (SVM), Multilayer Perceptron (MLP) or Multi-class Logistic Regression. Defaults to 0.
random.state	`integer, optional` Specifies the seed for the random number generator. 0: uses the current time (in seconds) as the seed; other values: uses the specified value as the seed. Valid only when the trained classification model is Naive Bayes, Support Vector Machine (SVM), Multilayer Perceptron (MLP) or Multi-class Logistic Regression. Defaults to 0.
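
As an illustration of how several of the optional arguments above might be combined, here is a minimal sketch. `score_with_reasoning` is a hypothetical wrapper and not part of the package. It assumes `model` is a fitted tree-based "UnifiedClassification" object and that `data` has a key column named "ID". The argument values are arbitrary, and no output is shown because the call depends on a fitted model.

```r
# Hypothetical convenience wrapper (not part of the package).
# Assumes `model` is a fitted tree-based "UnifiedClassification" object
# and `data` is a DataFrame whose ID column is named "ID".
score_with_reasoning <- function(model, data, k = 3, method = "shap") {
  predict(model,
          data = data,
          key = "ID",
          thread.ratio = 0.5,          # use up to 50% of available threads
          attribution.method = method, # "no", "saabas" or "shap"
          top.k.attributions = k)      # report only the k strongest features
}
```

With a fitted decision tree model, the wrapper could be called as `score_with_reasoning(uc.dt, df.predict)`.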

Format

S3 methods

Value

Predicted values are returned as a DataFrame, structured as follows.

ID column name
SCORE
CONFIDENCE
REASON CODE

Examples

Input data for prediction:

> df.predict
  ID  OUTLOOK   TEMP HUMIDITY WINDY
1  0 Overcast     75   -10000   Yes
2  1     Rain     78       70   Yes
3  2    Sunny -10000       NA   Yes
4  3    Sunny     69       70   Yes
5  4     Rain     NA       70   Yes
6  5     <NA>     70       70   Yes
7  6      ***     70       70   Yes

Call the predict() function:

> res <- predict(model = uc.dt,
                 data = df.predict,
                 key = "ID",
                 func = "DecisionTree")

Check the result:

> res$Collect()[1:3]
  ID       SCORE CONFIDENCE
1  0        Play  1.0000000
2  1 Do not Play  1.0000000
3  2        Play  0.5000000
4  3        Play  0.5000000
5  4        Play  0.6363636
6  5        Play  0.5000000
7  6        Play  0.5000000
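
Once collected, the result is an ordinary data frame, so it can be post-processed locally. The sketch below rebuilds the output shown above as a local data frame (values copied from the example) and flags low-confidence predictions. The names `res.local`, `threshold`, `low.conf` and `class.counts` and the 0.6 cutoff are illustrative assumptions, not part of the package.

```r
# Local copy of the collected result shown above
# (values copied from the example output).
res.local <- data.frame(
  ID = 0:6,
  SCORE = c("Play", "Do not Play", "Play", "Play", "Play", "Play", "Play"),
  CONFIDENCE = c(1, 1, 0.5, 0.5, 0.6363636, 0.5, 0.5),
  stringsAsFactors = FALSE
)

# Keep the predictions whose confidence falls below an illustrative cutoff.
threshold <- 0.6
low.conf <- res.local[res.local$CONFIDENCE < threshold, ]

# Count the predictions per class.
class.counts <- table(res.local$SCORE)
```

In this data the low-confidence rows are exactly the ones whose input contained missing or unusual values in OUTLOOK, TEMP or HUMIDITY, which is one reason such a filter can be useful for review.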
