Like other predict methods, this function computes predictions from a fitted "UnifiedRegression" object.

# S3 method for UnifiedRegression
predict(
  model,
  data,
  key,
  features = NULL,
  thread.ratio = NULL,
  func = NULL,
  prediction.type = NULL,
  significance.level = NULL,
  handle.missing = NULL,
  block.size = NULL
)

Arguments

model

R6Class
A "UnifiedRegression" object for prediction.

data

DataFrame
DataFrame containing the data.

key

character
Name of the ID column.

features

character or list of characters, optional
Names of the feature columns used for prediction.
If not provided, it defaults to all non-key columns of data.

thread.ratio

double, optional
Controls the proportion of available threads that this function can use.
The value ranges from 0 to 1, where 0 indicates a single thread and 1 indicates all available threads. Values between 0 and 1 use up to that proportion of the available threads.
Values outside the range from 0 to 1 are ignored, and the actual number of threads used is then determined heuristically.
Defaults to -1.

func

character, optional
Specifies the functionality of the unified regression model.
Mandatory only when the func attribute of model is NULL.
Valid values are as follows:
"DecisionTree", "RandomDecisionTrees", "HGBT", "LinearRegression", "SVM", "MLP", "PolynomialRegression", "LogarithmicRegression", "ExponentialRegression", "GeometricRegression", "GLM".

prediction.type

character, optional
Specifies the prediction type in the result table.

  • "response": direct response (with link applied)

  • "link": linear response (without link)

Valid only for GLM models.
Defaults to "response".
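The difference between the two prediction types can be illustrated with base R's stats::glm(). This is only an analogy on made-up toy data: it does not use a "UnifiedRegression" object, and the names toy, fit and new.data are hypothetical.

```r
# Toy count data (made up for illustration).
toy <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                  y = c(1, 3, 2, 6, 9, 14))

# Poisson GLM with the default log link, fitted with base R.
fit <- glm(y ~ x, family = poisson(), data = toy)

new.data <- data.frame(x = c(2.5, 7))

# "link": the linear predictor (on the log scale for this model).
pred.link <- predict(fit, newdata = new.data, type = "link")

# "response": the predicted mean on the original scale of y.
pred.response <- predict(fit, newdata = new.data, type = "response")

# For a log link, the response equals exp() of the linear predictor.
all.equal(unname(pred.response), unname(exp(pred.link)))
```

With an identity link the two prediction types coincide, so the choice only matters for models with a non-identity link.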

significance.level

numeric, optional
Specifies the significance level for the confidence interval and prediction interval.
Valid only for GLM models when irls solver is applied.
Defaults to 0.05.
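As a rough illustration of how a significance level relates to interval bounds, the sketch below uses base R's lm() and predict() on made-up data. It assumes the usual convention that a significance level of 0.05 corresponds to a 95% interval; it is not the procedure used for "UnifiedRegression" models, and all object names are hypothetical.

```r
# Toy data (made up for illustration).
toy <- data.frame(x = c(1, 2, 3, 4, 5, 6),
                  y = c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2))
fit <- lm(y ~ x, data = toy)

significance.level <- 0.05
new.data <- data.frame(x = c(2.5, 7))

# Assumed convention: confidence level = 1 - significance level.
bounds <- predict(fit, newdata = new.data,
                  interval = "prediction",
                  level = 1 - significance.level)

# Columns: fit (point prediction), lwr and upr (interval bounds).
bounds
```

A smaller significance level produces wider intervals, because the bounds must cover the true value with higher probability.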

handle.missing

character, optional
Specifies how to handle missing values in data.

  • "skip": Skip rows with missing values

  • "fill_zero": Replace missing values with 0 before prediction

Valid only for GLM models.
Defaults to "fill_zero".
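The effect of the two options can be mimicked on a local data.frame with base R. This is a minimal sketch of the idea only, on made-up data; it is not how the function processes a DataFrame, and the names toy and feature.cols are hypothetical.

```r
# Toy data with missing feature values (made up for illustration).
toy <- data.frame(ID = 0:3,
                  X1 = c(1.5, NA, 0.3, 2.0),
                  X2 = c(4, 5, NA, 6))
feature.cols <- c("X1", "X2")

# "skip": keep only rows whose feature values are all present.
skipped <- toy[complete.cases(toy[feature.cols]), ]

# "fill_zero": replace every missing feature value with 0.
filled <- toy
filled[feature.cols][is.na(filled[feature.cols])] <- 0

skipped
filled
```

Skipping rows reduces the number of predictions returned, while filling with zeros keeps every row but treats a missing value as a genuine 0.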

block.size

integer, optional
Specifies the number of data records loaded at a time during scoring.

  • 0: load all data once

  • Others: the specified number

This parameter reduces memory consumption, especially when the prediction data is very large
or contains a large number of missing independent variables.
However, it may reduce efficiency. Valid only for Random Decision Trees models.
Defaults to 0.
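The idea of loading data in blocks can be sketched in base R as follows. The helper make.blocks() is hypothetical and only illustrates how a block size partitions row indices; it does not reflect the internal implementation of the scoring procedure.

```r
# Hypothetical helper: split the row indices 1..n.rows into blocks.
make.blocks <- function(n.rows, block.size = 0) {
  idx <- seq_len(n.rows)
  if (block.size <= 0) {
    # 0: load all rows at once, so there is a single block.
    return(list(idx))
  }
  # Otherwise group consecutive rows into chunks of block.size.
  unname(split(idx, ceiling(idx / block.size)))
}

blocks <- make.blocks(n.rows = 10, block.size = 4)

# Number of rows in each block.
lengths(blocks)
```

Smaller blocks lower peak memory use because fewer rows are held at once, at the cost of more passes over the data.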

Format

S3 methods

Value

Predicted values are returned as a DataFrame, structured as follows.

  • ID column name

  • SCORE

  • UPPER_BOUND

  • LOWER_BOUND

  • REASON

Examples

Input data for prediction:

> df.predict
  ID      X1 X2 X3
1  0   1.690  B  1
2  1   0.054  B  2
3  2 980.123  A  2
4  3   1.000  A  1
5  4   0.563  A  1

Call the predict() function:

> res <- predict(model = umlr,
                 data = df.predict,
                 key = "ID")

Check the result:

> res$Collect()
  ID       SCORE UPPER_BOUND LOWER_BOUND REASON
1  0    8.719607          NA          NA   <NA>
2  1    1.416343          NA          NA   <NA>
3  2 3318.371440          NA          NA   <NA>
4  3   -2.050390          NA          NA   <NA>
5  4   -3.533135          NA          NA   <NA>

See also