R: Multi-layer perceptron (MLP) Regressor

hanaml.MLPRegressor {hana.ml.r}

R Documentation

Multi-layer perceptron (MLP) Regressor

Description

hanaml.MLPRegressor is an R wrapper for the PAL Multi-layer Perceptron algorithm.

Usage

hanaml.MLPRegressor(conn.context,
                    data = NULL,
                    key = NULL,
                    features = NULL,
                    label = NULL,
                    formula = NULL,
                    hidden.layer.size = NULL,
                    activation = NULL,
                    output.activation = NULL,
                    learning.rate = NULL,
                    momentum = NULL,
                    training.style = NULL,
                    max.iter = NULL,
                    normalization = NULL,
                    weight.init = NULL,
                    thread.ratio = NULL,
                    categorical.variable = NULL,
                    batch.size = NULL,
                    resampling.method = NULL,
                    evaluation.metric = "RMSE",
                    fold.num = NULL,
                    repeat.times = NULL,
                    param.search.strategy = NULL,
                    random.search.times = NULL,
                    seed = NULL,
                    timeout = NULL,
                    progress.indicator.id = NULL,
                    param.range = NULL,
                    param.values = NULL)

Arguments

`conn.context`	`ConnectionContext` The connection to the HANA system.
`data`	`DataFrame` DataFrame containing the data for training the MLP regression model.
`key`	`character, optional` Name of the ID column. If key is not provided, it is assumed that the input has no ID column.
`features`	`LISTOFSTRINGS, optional` Names of the feature columns. If features is not provided, it defaults to all the non-ID and non-label columns.
`label`	`character, optional` Name of the label column, or list of names of multiple label columns. If label is not provided, it defaults to the last column.
`formula`	`formula type, optional` Formula to be used for model generation, in the form label ~ <feature_list>, e.g. formula = LABEL~V1+V2+V3. Provide either the formula or a features and label combination, not both.
`activation`	`character, mandatory` Activation function for the hidden layer: `'tanh', 'linear', 'sigmoid-asymmetric', 'sigmoid-symmetric', 'gaussian-asymmetric', 'gaussian-symmetric', 'elliot-asymmetric', 'elliot-symmetric', 'sin-asymmetric', 'sin-symmetric', 'cos-asymmetric', 'cos-symmetric', 'relu'.`
`output.activation`	`character, mandatory` Activation function for the output layer: `'tanh', 'linear', 'sigmoid-asymmetric', 'sigmoid-symmetric', 'gaussian-asymmetric', 'gaussian-symmetric', 'elliot-asymmetric', 'elliot-symmetric', 'sin-asymmetric', 'sin-symmetric', 'cos-asymmetric', 'cos-symmetric', 'relu'.`
`hidden.layer.size`	`{tuple, numeric}, mandatory` Size of each hidden layer, e.g. c(10, 5) for two hidden layers.
`max.iter`	`numeric, optional` Maximum number of iterations. Defaults to 100.
`training.style`	`{"batch", "stochastic"}, optional` Specifies the training style. Defaults to stochastic.
`learning.rate`	`double, optional` Specifies the learning rate. Only valid when training.style is 'stochastic'.
`momentum`	`double, optional` Specifies the momentum for gradient descent update. Only valid when training.style is 'stochastic'.
`batch.size`	`int, optional` Specifies the mini-batch size. Only valid when training.style is 'stochastic'. Defaults to 1.
`normalization`	`{"no", "z-transform", "scalar"}, optional` Specifies the normalization type. Defaults to "no" (no normalization).
`weight.init`	`character, optional` Specifies the weight initialization method, one of: `'all-zeros', 'normal', 'uniform', 'variance-scale-normal', 'variance-scale-uniform'`. Defaults to all-zeros.
`categorical.variable`	`character or list of characters, optional` Names of columns in the data table to be treated as categorical variables. No default value.
`thread.ratio`	`double, optional` Controls the proportion of available threads to use. The value range is from 0 to 1, where 0 indicates a single thread, and 1 indicates up to all available threads. Values between 0 and 1 will use that percentage of available threads. Values outside this range tell PAL to heuristically determine the number of threads to use. Defaults to 0.
`resampling.method`	`character, optional` Specifies the resampling method, one of: `'cv', 'stratified_cv', 'bootstrap', 'stratified_bootstrap'`. If no value is specified for this parameter, neither model evaluation nor parameter selection is activated.
`evaluation.metric`	`character, optional` Specifies the evaluation metric for model evaluation or parameter selection. Currently the only valid option is `'RMSE'`.
`fold.num`	`numeric, optional` Specifies the fold number for the cross validation method.
`repeat.times`	`numeric, optional` Specifies the number of repeat times for resampling. Defaults to 1.
`param.search.strategy`	`character, optional` Specifies the method used for parameter selection. The value must be either 'GRID' or 'RANDOM'.
`random.search.times`	`numeric, optional` Specifies the number of times to randomly select candidate parameters for selection.
`seed`	`numeric, optional` Specifies the seed for random number generation. If 0 is specified, the system time is used.
`timeout`	`numeric, optional` Specifies maximum running time for model evaluation or parameter selection, in seconds. No timeout when 0 is specified.
`progress.indicator.id`	`character, optional` Sets an ID of progress indicator for model evaluation or parameter selection. No progress indicator is active if no value is provided.
`param.values`	`list, optional` Specifies values of the following parameters for parameter selection: activation, output.activation, hidden.layer.size, learning.rate, momentum, batch.size.
`param.range`	`list, optional` Specifies ranges of the following parameters for parameter selection: learning.rate, momentum, batch.size.
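
Some of the constraints listed above can be checked locally before a call is sent to the server. The following is a minimal sketch in base R. `check_mlp_args` and `mlp_activations` are hypothetical names, not part of hana.ml.r, and the helper encodes only rules stated on this page: the activation names, a numeric hidden.layer.size, and the choice between formula and features/label. The case-insensitive comparison of activation names is an assumption.

```r
# Hypothetical helper (not part of hana.ml.r): checks a few documented
# constraints on hanaml.MLPRegressor arguments without contacting HANA.
mlp_activations <- c("tanh", "linear",
                     "sigmoid-asymmetric", "sigmoid-symmetric",
                     "gaussian-asymmetric", "gaussian-symmetric",
                     "elliot-asymmetric", "elliot-symmetric",
                     "sin-asymmetric", "sin-symmetric",
                     "cos-asymmetric", "cos-symmetric", "relu")

check_mlp_args <- function(activation, output.activation, hidden.layer.size,
                           formula = NULL, features = NULL, label = NULL) {
  problems <- character(0)

  # Assumption: names are matched case-insensitively, because the example on
  # this page passes "SIN-ASYMMETRIC" in upper case.
  for (a in c(activation, output.activation)) {
    if (!tolower(a) %in% mlp_activations) {
      problems <- c(problems, paste0("unknown activation: ", a))
    }
  }

  # hidden.layer.size holds one positive number per hidden layer.
  if (!is.numeric(hidden.layer.size) || any(hidden.layer.size < 1)) {
    problems <- c(problems, "hidden.layer.size must hold positive numbers")
  }

  # Either a formula or a features/label combination, not both.
  if (!is.null(formula) && (!is.null(features) || !is.null(label))) {
    problems <- c(problems, "give either formula or features/label, not both")
  }

  problems  # empty character vector when no problem was found
}

# Usage: an empty result means none of the checks failed.
check_mlp_args("tanh", "linear", c(10, 5))
check_mlp_args("SIN-ASYMMETRIC", "softmax", c(10, 5),
               formula = T001 ~ V000 + V001, label = "T001")
```

The helper returns messages instead of stopping, so all problems can be reported at once.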

Format

An object of class R6ClassGenerator of length 24.

Value

An "MLPRegressor" object with the following attributes:

model: DataFrame

ROW_INDEX - model row index
MODEL_CONTENT - model content

log: DataFrame

ITERATION - iteration number
ERROR - Mean squared error between predicted values and target values for each iteration

statistics: DataFrame

STAT_NAME - statistics name
STAT_VALUE - values of the statistics
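
The ERROR column is described as a mean squared error. As a rough illustration of that quantity, the base R sketch below computes it for a small set of toy values. How PAL aggregates the error across several label columns is not stated on this page, so averaging over all cells is an assumption, and `mse` is a name chosen for illustration.

```r
# Mean squared error between predictions and targets.
# Assumption: with several label columns, the error is averaged over all
# cells; this page does not say how PAL aggregates across labels.
mse <- function(predicted, target) {
  predicted <- as.matrix(predicted)
  target <- as.matrix(target)
  stopifnot(all(dim(predicted) == dim(target)))
  mean((predicted - target)^2)
}

# Toy values for illustration only, not output of the model on this page.
target    <- cbind(T001 = c(12.7, 12.1), T002 = c(2.8, 8.0))
predicted <- cbind(T001 = c(12.0, 13.1), T002 = c(2.8, 7.0))

mse(predicted, target)        # mean squared error
sqrt(mse(predicted, target))  # its square root, the quantity RMSE refers to
```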

Examples

## Not run: 
> df <- conn.context$table("PAL_TRAIN_MLP_REG_DATA_TBL")
> df$Collect()
    V000  V001 V002  V003  T001  T002  T003
0     1  1.71   AC     0  12.7   2.8  3.06
1    10  1.78   CA     5  12.1   8.0  2.65
2    17  2.36   AA     6  10.1   2.8  3.24
3    12  3.15   AA     2  28.1   5.6  2.24
4     7  1.05   CA     3  19.8   7.1  1.98
5     6  1.50   CA     2  23.2   4.9  2.12
6     9  1.97   CA     6  24.5   4.2  1.05
7     5  1.26   AA     1  13.6   5.1  2.78
8    12  2.13   AC     4  13.2   1.9  1.34
9    18  1.87   AC     6  25.5   3.6  2.14

Training the model:

  > mlpr <- hanaml.MLPRegressor(conn.context = conn.context, data = df,
                                label = c("T001", "T002", "T003"),
                                hidden.layer.size = c(10, 5),
                                activation = "SIN-ASYMMETRIC",
                                output.activation = "SIN-ASYMMETRIC",
                                learning.rate = 0.001, momentum = 0.00001,
                                training.style = "batch",
                                max.iter = 10000, normalization = "z-transform",
                                weight.init = "normal", thread.ratio = 0.3)


Training results may differ from those shown below because of model randomness.


 > mlpr$train.log$Collect()
       ITERATION       ERROR
  0            1   34.525655
  1            2   82.656301
  2            3   67.289241
  3            4  162.768062
  4            5   38.988242
  5            6  142.239468
  6            7   34.467742
  7            8   31.050946
  8            9   30.863581
  9           10   30.078204
  10          11   26.671436
  11          12   28.078312
  12          13   27.243226
  13          14   26.916686
  14          15   26.782915
  15          16   26.724266
  16          17   26.697108
  17          18   26.684084
  18          19   26.677713
  19          20   26.674563
  20          21   26.672997
  21          22   26.672216
  22          23   26.671826
  23          24   26.671631
  24          25   26.671533
  25          26   26.671485
  26          27   26.671460
  27          28   26.671448
  28          29   26.671442
  29          30   26.671439
  ..         ...         ...
  705        706   11.891081
  706        707   11.891081
  707        708   11.891081
  708        709   11.891081
  709        710   11.891081
  710        711   11.891081
  711        712   11.891081
  712        713   11.891081
  713        714   11.891081
  714        715   11.891081
  715        716   11.891081
  716        717   11.891081
  717        718   11.891081
  718        719   11.891081
  719        720   11.891081
  720        721   11.891081
  721        722   11.891081
  722        723   11.891081
  723        724   11.891081
  724        725   11.891081
  725        726   11.891081
  726        727   11.891081
  727        728   11.891081
  728        729   11.891081
  729        730   11.891081
  730        731   11.891081
  731        732   11.891081
  732        733   11.891081
  733        734   11.891081
  734        735   11.891081


## End(Not run)
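
Once collected, a training log like the one above can be inspected locally for convergence. The sketch below retypes the first ten rows of that log into a local data frame, so no HANA connection is involved, and uses base R only. `train_log`, `best`, and `rose` are names chosen for illustration, and the sketch assumes the collected log is an ordinary data frame with ITERATION and ERROR columns.

```r
# First ten rows of the training log printed above, retyped locally.
train_log <- data.frame(
  ITERATION = 1:10,
  ERROR = c(34.525655, 82.656301, 67.289241, 162.768062, 38.988242,
            142.239468, 34.467742, 31.050946, 30.863581, 30.078204)
)

# Iteration with the lowest error among these rows.
best <- train_log$ITERATION[which.min(train_log$ERROR)]

# Iterations at which the error rose relative to the previous iteration.
rose <- train_log$ITERATION[-1][diff(train_log$ERROR) > 0]

print(best)
print(rose)
```

In these ten rows the error rises at iterations 2, 4, and 6 before decreasing steadily, which matches the early oscillation visible in the printed log.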

[Package hana.ml.r version 1.0.8 Index]