hanaml.MLPClassifier {hana.ml.r}                                R Documentation
Description:

hanaml.MLPClassifier is an R wrapper for the PAL multi-layer perceptron (MLP) classification algorithm.

Usage:
hanaml.MLPClassifier(conn.context, data = NULL, key = NULL,
features = NULL, label = NULL,
formula = NULL, hidden.layer.size = NULL,
activation = NULL, output.activation = NULL,
learning.rate = NULL, momentum = NULL,
training.style = NULL, max.iter = NULL,
normalization = NULL, weight.init = NULL,
thread.ratio = NULL, categorical.variable = NULL,
batch.size = NULL, resampling.method = NULL,
evaluation.metric = NULL, fold.num = NULL,
repeat.times = NULL, param.search.strategy = NULL,
random.search.times = NULL, seed = NULL,
timeout = NULL, progress.indicator.id = NULL,
param.range = NULL, param.values = NULL)
Arguments:

conn.context: Connection to the SAP HANA system.

data: DataFrame containing the training data.

key: Name of the ID column of data, if any.

features: Names of the feature columns. Defaults to all non-ID, non-label columns.

label: Name of the dependent-variable column.

formula: Formula of the form label ~ feature1 + feature2 + ..., an alternative to specifying features and label separately.

activation: Activation function for the hidden layers, e.g. "tanh".

output.activation: Activation function for the output layer.

hidden.layer.size: Numeric vector giving the size of each hidden layer, e.g. c(10, 10).

max.iter: Maximum number of training iterations.

training.style: Training style, "batch" or "stochastic".

learning.rate: Learning rate; required when training.style is "stochastic".

momentum: Momentum factor; required when training.style is "stochastic".

batch.size: Mini-batch size for stochastic training.

normalization: Input normalization method, e.g. "z-transform".

weight.init: Weight-initialization method, e.g. "normal".

categorical.variable: Names of integer columns that should be treated as categorical.

thread.ratio: Proportion of available threads to use, in the range [0, 1].

resampling.method: Resampling method for model evaluation or parameter selection, e.g. "cv" or "stratified_bootstrap".

evaluation.metric: Metric used to evaluate the model, e.g. "f1_score" or "accuracy".

fold.num: Number of folds for cross-validation resampling.

repeat.times: Number of times the resampling procedure is repeated.

param.search.strategy: Parameter-search strategy, "grid" or "random".

random.search.times: Number of parameter combinations to try when param.search.strategy is "random".

seed: Random seed for resampling and parameter selection.

timeout: Maximum running time, in seconds, for parameter selection.

progress.indicator.id: ID used to monitor the progress of model evaluation or parameter selection.

param.values: Named list of candidate values for the parameters to be searched, as in the parameter-selection example below.

param.range: Named list of value ranges for the parameters to be searched.
Value:

Returns an R6 object of class "MLPClassifier", with the following attributes:

model: DataFrame
    ROW_INDEX - model row index
    MODEL_CONTENT - model content
train.log: DataFrame
    ITERATION - iteration number
    ERROR - mean squared error between predicted and target values
            at each iteration
statistics: DataFrame
    STAT_NAME - statistic name
    STAT_VALUE - value of the statistic
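For intuition, learning.rate and momentum drive a stochastic-gradient-style weight update. A toy R sketch on a one-dimensional quadratic (illustrative only, not PAL's implementation; the parameter values here are arbitrary):

```r
# Toy gradient descent with momentum, minimizing f(w) = (w - 3)^2.
# learning.rate scales each gradient step; momentum carries over a
# fraction of the previous update, smoothing and accelerating descent.
learning.rate <- 0.1
momentum <- 0.5
w <- 0
v <- 0
errors <- numeric(50)
for (i in 1:50) {
  grad <- 2 * (w - 3)                      # gradient of (w - 3)^2
  v <- momentum * v - learning.rate * grad
  w <- w + v
  errors[i] <- (w - 3)^2                   # analogous to the ERROR column
}
```

As in the train.log output of the examples below, the per-iteration error shrinks toward a minimum.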
Examples:

## Not run:
Training data df:
> df <- conn.context$table("PAL_TRAIN_MLP_REG_DATA_TBL")
> df$Collect()
V000 V001 V002 V003 LABEL
0 1 1.71 AC 0 AA
1 10 1.78 CA 5 AB
2 17 2.36 AA 6 AA
3 12 3.15 AA 2 C
4 7 1.05 CA 3 AB
5 6 1.50 CA 2 AB
6 9 1.97 CA 6 C
7 5 1.26 AA 1 AA
8 12 2.13 AC 4 C
9 18 1.87 AC 6 AA
Training the model:
> mlpc <- hanaml.MLPClassifier(conn.context = conn, data = df, key = NULL,
features = NULL, label = NULL,
hidden.layer.size = c(10,10),
activation = "TANH", output.activation = "TANH",
learning.rate = 0.001, momentum = 0.0001,
training.style = "stochastic", max.iter = 100,
normalization = "z-transform", weight.init = "normal",
thread.ratio = 0.3, categorical.variable = "V003")
Training results may differ from those shown below due to randomness in model initialization.
> mlpc$train.log$Collect()
ITERATION ERROR
0 1 1.080261
1 2 1.008358
2 3 0.947069
3 4 0.894585
4 5 0.849411
5 6 0.810309
6 7 0.776256
7 8 0.746413
8 9 0.720093
9 10 0.696737
10 11 0.675886
11 12 0.657166
12 13 0.640270
13 14 0.624943
14 15 0.609432
.. ... ...
91 92 0.317840
92 93 0.316630
93 94 0.315376
94 95 0.314210
95 96 0.313066
96 97 0.312021
97 98 0.310916
98 99 0.309770
99 100 0.308704
Model evaluation example:
> df <- conn.context$table("PAL_TRAIN_MLP_EVAL_DATA_TBL")
> df$Collect()
V000 V001 V002 V003 LABEL
0 1 1.71 AC 0 AA
1 10 1.78 CA 5 AB
2 17 2.36 AA 6 AA
3 12 3.15 AA 2 C
4 7 1.05 CA 3 AB
5 6 1.50 CA 2 AB
6 9 1.97 CA 6 C
7 5 1.26 AA 1 AA
8 12 2.13 AC 4 C
9 18 1.87 AC 6 AA
Training the model:
> mlpc <- hanaml.MLPClassifier(conn.context, data = df, label = "LABEL",
hidden.layer.size = c(10,10),
activation = "tanh", output.activation = "tanh",
learning.rate = 0.001, momentum = 0.00001,
training.style = "stochastic",
categorical.variable = "V003", max.iter = 100,
normalization = "z-transform",
weight.init = "normal", thread.ratio = 0.3,
resampling.method = "cv",
evaluation.metric = "f1_score",
fold.num = 10, repeat.times = 2,
seed = 1, progress.indicator.id = "TEST")
Parameter selection example:
> df <- conn.context$table("PAL_TRAIN_MLP_EVAL_DATA_TBL")
Training the model:
> mlpc <- hanaml.MLPClassifier(conn.context, data = df, label = "LABEL",
learning.rate = 0.001, momentum = 0.00001,
training.style = "stochastic",
categorical.variable = "V003",
max.iter = 100, normalization = "z-transform",
weight.init = "normal", thread.ratio = 0.3,
resampling.method = "stratified_bootstrap",
evaluation.metric = "ACCURACY",
param.search.strategy = "grid",
repeat.times = 2, seed = 1,
progress.indicator.id = "TEST",
param.values = list("hidden.layer.size" =
list(c(10,10), c(5,5,5)),
"activation" =
c("tanh",
"linear",
"sigmoid-asymmetric"),
"output.activation" =
c("sigmoid-symmetric",
"gaussian-asymmetric",
"gaussian-symmetric")))
Optimal parameters:
  PARAM_NAME               INT_VALUE DOUBLE_VALUE STRING_VALUE
0 HIDDEN_LAYER_SIZE                ?            ?        10,10
1 OUTPUT_LAYER_ACTIVE_FUNC         4            ?            ?
2 HIDDEN_LAYER_ACTIVE_FUNC         1            ?            ?
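For reference, the candidate grid swept by param.search.strategy = "grid" with the param.values above can be enumerated in plain R. This is only an illustration of the size of the search space, not part of the hana.ml.r API:

```r
# Enumerate the 2 x 3 x 3 = 18 candidate combinations implied by
# param.values in the parameter-selection example. hidden.layer.size
# candidates are indexed, since each candidate is itself a vector.
hidden.layer.size <- list(c(10, 10), c(5, 5, 5))
activation <- c("tanh", "linear", "sigmoid-asymmetric")
output.activation <- c("sigmoid-symmetric", "gaussian-asymmetric",
                       "gaussian-symmetric")
grid <- expand.grid(hls = seq_along(hidden.layer.size),
                    activation = activation,
                    output.activation = output.activation,
                    stringsAsFactors = FALSE)
nrow(grid)  # 18
```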
## End(Not run)