Like other predict methods, this function predicts class labels for new data using a fitted "HGBTClassifier" object.

# S3 method for HGBTClassifier
predict(
  model,
  data,
  key,
  features = NULL,
  verbose = NULL,
  thread.ratio = NULL,
  missing.replacement = NULL
)

Format

S3 methods

Arguments

model

R6Class
A "HGBTClassifier" object for prediction.

data

DataFrame
DataFrame containing the data.

key

character
Name of the ID column.

features

character or list of characters, optional
Names of the feature columns used for prediction.
If not provided, it defaults to all non-key columns of data.
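
The default described above can be pictured with base R. This is only a sketch: default_features is a hypothetical helper, not part of the package, and it works on a plain vector of column names rather than on a remote DataFrame.

```r
# Hypothetical helper (not part of the package): mimic the documented default
# of using every non-key column as a feature.
default_features <- function(columns, key, features = NULL) {
  if (is.null(features)) {
    return(setdiff(columns, key))
  }
  unlist(features)  # accept either a character vector or a list of characters
}

columns <- c("ID", "ATT1", "ATT2", "ATT3", "ATT4")
default_features(columns, key = "ID")
default_features(columns, key = "ID", features = list("ATT1", "ATT2"))
```

The first call returns all four ATT columns; the second returns only the two that were named explicitly.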

verbose

logical, optional
If TRUE, output all classes and the corresponding confidences for each data point.
Defaults to FALSE.

thread.ratio

double, optional
Controls the proportion of available threads that can be used by this function.
The value range is from 0 to 1, where 0 indicates a single thread, and 1 indicates all available threads. Values between 0 and 1 will use up to that percentage of available threads.
Values outside the range from 0 to 1 are ignored, and the actual number of threads used is then determined heuristically.
Defaults to -1.
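
To make these rules concrete, here is a rough sketch of how a ratio might translate into an upper bound on threads. max_threads is a hypothetical helper, not part of the package, and rounding down with a floor of one thread is an assumption; the actual mapping is decided on the server side.

```r
# Hypothetical helper (not part of the package): translate thread.ratio into
# an upper bound on threads, following the rules described above.
max_threads <- function(thread.ratio, available) {
  if (is.null(thread.ratio) || thread.ratio < 0 || thread.ratio > 1) {
    return(NA_integer_)  # out of range: left to the heuristic
  }
  # Assumption: round down, but never go below a single thread.
  max(1L, as.integer(floor(thread.ratio * available)))
}

max_threads(0, available = 8L)    # single thread
max_threads(0.5, available = 8L)  # up to half of the available threads
max_threads(1, available = 8L)    # all available threads
max_threads(-1, available = 8L)   # NA: the default, left to the heuristic
```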

missing.replacement

character, optional
The missing replacement strategy:

  • "feature.marginalized": marginalize each missing feature out independently.

  • "instance.marginalized": marginalize all missing features in an instance as a whole corresponding to each category.

Defaults to "feature.marginalized".

Value

DataFrame
Prediction result, structured as follows:

  • ID, integer - ID column, with the same name and type as the ID column of data.

  • SCORE, NVARCHAR(100) - representing the predicted classes.

  • CONFIDENCE, double - representing the confidence of a class label assignment.

Examples

Performing predict() on the given DataFrame:


> df$Collect()
  ID  ATT1   ATT2  ATT3 ATT4
1  1   1.0   10.0   100    1
2  2   1.1   10.1   100    1
3  3   1.2   10.2   100    1
4  4   1.3   10.4   100    1
5  5   1.2   10.3   100    3
6  6   4.0   40.0   400    3
7  7   4.1   40.1   400    3
8  8   4.2   40.2   400    3
9  9   4.3   40.4   400    3
10 10  4.2   40.3   400    3

Call the function:


> result <- predict.HGBTClassifier(ghc, df, key = "ID", verbose = FALSE)
or
> result <- predict(ghc, df, key = "ID", verbose = FALSE)

Output:


> result$Collect()
   ID  SCORE  CONFIDENCE
1   1      A    0.852674
2   2      A    0.852674
3   3      A    0.852674
4   4      A    0.852674
5   5      A    0.751394
6   6      B    0.703119
7   7      B    0.703119
8   8      B    0.703119
9   9      B    0.830549
10 10      B    0.703119
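
Once collected, the result can be inspected with ordinary R tools. The sketch below assumes that result$Collect() yields a regular data.frame; to stay self-contained it rebuilds the output shown above by hand, and the 0.8 threshold is an arbitrary value chosen for illustration.

```r
# Local copy of the output shown above, rebuilt by hand for illustration.
res <- data.frame(
  ID = 1:10,
  SCORE = c(rep("A", 5), rep("B", 5)),
  CONFIDENCE = c(0.852674, 0.852674, 0.852674, 0.852674, 0.751394,
                 0.703119, 0.703119, 0.703119, 0.830549, 0.703119),
  stringsAsFactors = FALSE
)

# Number of rows assigned to each predicted class.
class_counts <- table(res$SCORE)

# Rows whose confidence falls below the illustrative threshold.
threshold <- 0.8
low_confidence <- res[res$CONFIDENCE < threshold, c("ID", "CONFIDENCE")]

print(class_counts)
print(low_confidence)
```

A filter like this is one way to flag predictions that may deserve a manual check before they are used downstream.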