predict.HGBTClassifier.Rd
Similar to other predict methods, this function predicts fitted values from a fitted "HGBTClassifier" object.
# S3 method for HGBTClassifier
predict(
model,
data,
key,
features = NULL,
verbose = NULL,
thread.ratio = NULL,
missing.replacement = NULL
)
S3 methods
R6Class
A "HGBTClassifier" object for prediction.
DataFrame
DataFrame containing the data.
character
Name of the ID column.
character or list of characters, optional
Name of feature columns for prediction.
If not provided, it defaults to all non-key columns of data.
logical, optional
If TRUE, outputs all classes and their corresponding
confidences for each data point.
Defaults to FALSE.
double, optional
Controls the proportion of available threads that can be used by this
function.
The value range is from 0 to 1, where 0 indicates a single thread,
and 1 indicates all available threads. Values between 0 and 1 will use up to
that percentage of available threads.
Values outside the range from 0 to 1 are ignored, and the actual number of threads
used is then heuristically determined.
Defaults to -1.
character, optional
The missing replacement strategy:
"feature.marginalized": marginalize each missing feature out independently.
"instance.marginalized": marginalize all missing features in an instance as a whole corresponding to each category.
Defaults to "feature.marginalized".
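The rules stated above for thread.ratio and for the features default can be sketched in plain R. This is only an illustration of the documented behaviour, not the package's implementation: resolve_threads, resolve_features, and the available argument are hypothetical names. Rounding down to at least one thread, and treating NULL like an out-of-range value, are assumptions.

```r
# Hypothetical helper: map thread.ratio to an upper bound on threads.
# Returns NA when the value is NULL or out of range, meaning the count is
# left to the heuristic, which this page does not specify.
resolve_threads <- function(thread.ratio, available) {
  if (is.null(thread.ratio) || thread.ratio < 0 || thread.ratio > 1) {
    return(NA_integer_)
  }
  if (thread.ratio == 0) {
    return(1L)  # 0 indicates a single thread
  }
  # Assumption: round down, but never below one thread.
  max(1L, as.integer(floor(thread.ratio * available)))
}

# Hypothetical helper: features default to all non-key columns of data.
resolve_features <- function(columns, key, features = NULL) {
  if (is.null(features)) setdiff(columns, key) else unlist(features)
}

resolve_threads(0, 8)    # single thread
resolve_threads(0.5, 8)  # up to half of the available threads
resolve_threads(-1, 8)   # NA: left to the heuristic
resolve_features(c("ID", "ATT1", "ATT2", "ATT3", "ATT4"), key = "ID")
```

Passing features explicitly bypasses the default, so a list such as list("ATT1", "ATT2") is returned as a plain character vector.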
DataFrame
Prediction result, structured as follows:
ID, integer
- ID column, with the same name and type as the ID column of data
SCORE, NVARCHAR(100)
- representing the predicted classes.
CONFIDENCE, double
- representing the confidence of
a class label assignment.
Performing predict() on the given DataFrame:
> df$Collect()
   ID ATT1 ATT2 ATT3 ATT4
1   1  1.0 10.0  100    1
2   2  1.1 10.1  100    1
3   3  1.2 10.2  100    1
4   4  1.3 10.4  100    1
5   5  1.2 10.3  100    3
6   6  4.0 40.0  400    3
7   7  4.1 40.1  400    3
8   8  4.2 40.2  400    3
9   9  4.3 40.4  400    3
10 10  4.2 40.3  400    3
Call the function:
> result <- predict.HGBTClassifier(ghc, df, key = "ID", verbose = FALSE)
or
> result <- predict(ghc, df, key = "ID", verbose = FALSE)
Output:
> result$Collect()
   ID SCORE CONFIDENCE
1   1     A   0.852674
2   2     A   0.852674
3   3     A   0.852674
4   4     A   0.852674
5   5     A   0.751394
6   6     B   0.703119
7   7     B   0.703119
8   8     B   0.703119
9   9     B   0.830549
10 10     B   0.703119
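Assuming Collect() returns an ordinary R data.frame, as the printed output suggests, the result can be post-processed with base R. The sketch below rebuilds the output shown above as a local data.frame (values copied verbatim) so it runs without a database connection. The 0.75 threshold and the names res, class_counts, and low_conf are hypothetical, chosen only for illustration.

```r
# Local copy of the collected prediction result shown above.
res <- data.frame(
  ID = 1:10,
  SCORE = c(rep("A", 5), rep("B", 5)),
  CONFIDENCE = c(0.852674, 0.852674, 0.852674, 0.852674, 0.751394,
                 0.703119, 0.703119, 0.703119, 0.830549, 0.703119),
  stringsAsFactors = FALSE
)

# Count how many rows were assigned to each predicted class.
class_counts <- table(res$SCORE)

# Collect the IDs whose confidence falls below a hypothetical threshold.
threshold <- 0.75
low_conf <- res[res$CONFIDENCE < threshold, "ID"]

print(class_counts)
print(low_conf)
```

With a real result, replace the local data.frame with result$Collect() and pick a threshold that suits the application.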