| hanaml.HGBTClassifier {hana.ml.r} | R Documentation |
Hybrid Gradient Boosting model for classification.
hanaml.HGBTClassifier(conn.context,
data = NULL,
key = NULL,
features = NULL,
label = NULL,
formula = NULL,
n.estimators = NULL,
random.state = NULL,
subsample = NULL,
max.depth = NULL,
split.threshold = NULL,
learning.rate = NULL,
split.method = NULL,
sketch.eps = NULL,
fold.num = NULL,
min.sample.weight.leaf = NULL,
min.samples.leaf = NULL,
max.w.in.split = NULL,
col.subsample.split = NULL,
col.subsample.tree = NULL,
lambda = NULL,
alpha = NULL,
evaluation.metric = NULL,
reference.metric = NULL,
parameter.range = NULL,
parameter.values = NULL,
resampling.method = NULL,
repeat.times = NULL,
param.search.strategy = NULL,
random.search.times = NULL,
timeout = NULL,
progress.indicator.id = NULL,
calculate.importance = NULL,
calculate.cm = NULL,
base.score = NULL,
thread.ratio = NULL,
categorical.variable = NULL)
conn.context |
|
data |
|
key |
|
features |
|
label |
|
formula |
|
n.estimators |
|
random.state |
|
subsample |
|
max.depth |
|
split.threshold |
|
learning.rate |
|
split.method |
|
sketch.eps |
|
fold.num |
|
min.sample.weight.leaf |
|
min.samples.leaf |
|
max.w.in.split |
|
col.subsample.split |
|
col.subsample.tree |
|
lambda |
|
alpha |
|
evaluation.metric |
|
reference.metric |
|
parameter.range |
|
parameter.values |
|
resampling.method |
If no value is specified for this parameter,
then no model evaluation or parameter selection will be activated. |
repeat.times |
|
param.search.strategy |
If this parameter is not set, then only model evaluation is activated. |
random.search.times |
|
timeout |
|
progress.indicator.id |
|
calculate.importance |
|
calculate.cm |
|
base.score |
|
thread.ratio |
|
categorical.variable |
|
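The descriptions of resampling.method and param.search.strategy above imply three possible modes. The sketch below restates that rule in base R. The helper name hgbt.search.mode and the argument values "cv" and "grid" are assumptions made for illustration, not part of hana.ml.r. The helper only checks whether each argument is NULL.

```r
# Hypothetical helper (not part of hana.ml.r): restates the documented rule
# for what the two arguments activate when they are set or left as NULL.
hgbt.search.mode <- function(resampling.method = NULL,
                             param.search.strategy = NULL) {
  if (is.null(resampling.method)) {
    # No resampling method: neither evaluation nor parameter selection runs.
    return("none")
  }
  if (is.null(param.search.strategy)) {
    # Resampling method only: the model is evaluated, nothing is searched.
    return("model evaluation")
  }
  # Both set: assumed to activate parameter selection as well.
  "parameter selection"
}

# "cv" and "grid" are placeholder values used only to make the arguments non-NULL.
hgbt.search.mode()
hgbt.search.mode(resampling.method = "cv")
hgbt.search.mode(resampling.method = "cv", param.search.strategy = "grid")
```

Note that setting param.search.strategy without resampling.method still yields "none", which follows from the resampling.method description.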
An object of class R6ClassGenerator of length 24.
An "HGBTClassifier" object with the following attributes:
model DataFrame
ROW_INDEX - model row index
TREE_INDEX - tree index (-1 indicates global information)
MODEL_CONTENT - model content
feature.importances DataFrame
VARIABLE_NAME - Independent variable name
IMPORTANCE - Variable importance
confusion.matrix DataFrame
ACTUAL_CLASS - The actual class name
PREDICTED_CLASS - The predicted class name
COUNT - Number of records
stats DataFrame
STAT_NAME - Statistics name
STAT_VALUE - Statistics value
cv DataFrame
PARM_NAME - parameter name
INT_VALUE - integer value
DOUBLE_VALUE - double value
STRING_VALUE - character value
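As a rough sketch of how the confusion.matrix columns listed above might be used, the base R code below computes overall accuracy from a local data.frame with the same three columns. It assumes the table has already been fetched locally, and the counts are made up purely for illustration; they are not output from a fitted model.

```r
# Illustrative counts only (assumption); a real table would come from
# the confusion.matrix attribute of a fitted model.
cm <- data.frame(
  ACTUAL_CLASS    = c("A", "A", "B", "B"),
  PREDICTED_CLASS = c("A", "B", "A", "B"),
  COUNT           = c(5, 0, 1, 1),
  stringsAsFactors = FALSE
)

# Rows where the predicted class matches the actual class are correct.
correct <- cm$ACTUAL_CLASS == cm$PREDICTED_CLASS
accuracy <- sum(cm$COUNT[correct]) / sum(cm$COUNT)

# Reshape the long format into an actual-by-predicted contingency table.
wide <- xtabs(COUNT ~ ACTUAL_CLASS + PREDICTED_CLASS, data = cm)

print(accuracy)
print(wide)
```

The long format (one row per actual/predicted pair) is convenient for filtering, while the xtabs view is easier to read by eye.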
## Not run:
Input DataFrame for training:
> df <- conn.context$table("PAL_TRAIN_HGBT_DATA_TBL")
> df$Collect()
ATT1 ATT2 ATT3 ATT4 LABEL
0 1.0 10.0 100.0 1.0 A
1 1.1 10.1 100.0 1.0 A
2 1.2 10.2 100.0 1.0 A
3 1.3 10.4 100.0 1.0 A
4 1.2 10.3 100.0 1.0 A
5 4.0 40.0 400.0 4.0 B
6 4.1 40.1 400.0 4.0 B
Creating an instance of the Hybrid Gradient Boosting classifier and performing the fit:
> ghc <- hanaml.HGBTClassifier(conn.context = conn, data = df,
features = c('ATT1', 'ATT2', 'ATT3', 'ATT4'),
label = 'LABEL',
n.estimators = 4, split.threshold = 0,
learning.rate = 0.5, fold.num = 5, max.depth = 6,
evaluation.metric = 'error.rate', reference.metric = c('auc'),
parameter.range = list("learning.rate" = c(0.1, 1.0, 3),
"n.estimators" = c(4, 3, 10),
"split.threshold" = c(0.1, 0.3, 1.0)))
> ghc$stats$Collect()
STAT_NAME STAT_VALUE
0 ERROR_RATE_MEAN 0.133333
1 ERROR_RATE_VAR 0.0266666
2 AUC_MEAN 0.9
Input DataFrame for predict:
> df <- conn.context$table("PAL_TRAIN_HGBT_PREDICT_TBL")
> df$Collect()
ID ATT1 ATT2 ATT3 ATT4
0 1 1.0 10.0 100.0 1.0
1 2 1.1 10.1 100.0 1.0
2 3 1.2 10.2 100.0 1.0
3 4 1.3 10.4 100.0 1.0
4 5 1.2 10.3 100.0 3.0
5 6 4.0 40.0 400.0 3.0
6 7 4.1 40.1 400.0 3.0
7 8 4.2 40.2 400.0 3.0
8 9 4.3 40.4 400.0 3.0
9 10 4.2 40.3 400.0 3.0
## End(Not run)