MLPMultiTaskRegressor

class hana_ml.algorithms.pal.neural_network.MLPMultiTaskRegressor(hidden_layer_size, activation=None, batch_size=None, num_epochs=None, random_state=None, use_batchnorm=None, learning_rate=None, optimizer=None, dropout_prob=None, training_percentage=None, early_stop=None, normalization=None, warmup_epochs=None, patience=None, save_best_model=None)

MLP Multi Task Regressor.

Parameters:
hidden_layer_sizelist (tuple) of int

Specifies the sizes of all hidden layers in the neural network.

activationstr

Specifies the activation function for the hidden layer.

Valid activation functions include:

  • 'sigmoid'

  • 'tanh'

  • 'relu'

  • 'leaky-relu'

  • 'elu'

  • 'gelu'

Defaults to 'relu'.
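For orientation, the listed activation functions can be sketched with their textbook definitions. This is only an illustration in plain Python, not the PAL implementation; the names LEAKY_SLOPE and ELU_ALPHA and their values are assumptions, since this page does not state the constants PAL uses.

```python
import math

# Assumed constants; the values used by PAL are not documented on this page.
LEAKY_SLOPE = 0.01
ELU_ALPHA = 1.0

# Textbook scalar definitions, keyed by the option strings listed above.
ACTIVATIONS = {
    'sigmoid': lambda x: 1.0 / (1.0 + math.exp(-x)),
    'tanh': math.tanh,
    'relu': lambda x: max(0.0, x),
    'leaky-relu': lambda x: x if x > 0 else LEAKY_SLOPE * x,
    'elu': lambda x: x if x > 0 else ELU_ALPHA * (math.exp(x) - 1.0),
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF.
    'gelu': lambda x: 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0))),
}

# Compare how each function treats a negative and a positive input.
for name, fn in ACTIVATIONS.items():
    print(name, fn(-1.0), fn(1.0))
```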

hidden_layer_size_optionslist of tuples, optional

A list of optional sizes of all hidden layers for parameter selection.

batch_sizeint, optional

Specifies the number of training samples in a batch.

Defaults to 16 (if the input data contains fewer than 16 samples, the size of the input data is used).

num_epochsint, optional

Specifies the maximum number of training epochs.

Defaults to 100.

random_stateint, optional

Specifies the seed for random number generation. The system time is used when 0 is specified.

Defaults to 0.

use_batchnormbool, optional

Specifies whether to use batch-normalization in each hidden layer or not.

Defaults to True (i.e. use batch-normalization).

learning_ratefloat, optional

Specifies the learning rate for gradient based optimizers.

Defaults to 0.001.

optimizerstr, optional

Specifies the optimizer for training the neural network.

  • 'sgd'

  • 'rmsprop'

  • 'adam'

  • 'adagrad'

Defaults to 'adam'.

dropout_probfloat, optional

Specifies the dropout probability applied when training the neural network.

Defaults to 0.0 (i.e. no dropout).

training_percentagefloat, optional

Specifies the percentage of input data used for training (with the rest of the input data used for validation).

Defaults to 0.9.
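A rough sketch of how training_percentage and batch_size translate into row counts. The helper split_sizes is hypothetical, and rounding the training share down is an assumption; the page does not say how PAL rounds.

```python
def split_sizes(n_rows, training_percentage=0.9, batch_size=16):
    """Return (n_train, n_valid, effective_batch_size) for n_rows input rows.

    Hypothetical helper: rounding down is an assumption, not documented
    PAL behaviour.
    """
    n_train = int(n_rows * training_percentage)
    n_valid = n_rows - n_train
    # Documented fallback: inputs smaller than the batch size use their own size.
    effective_batch = min(batch_size, n_rows)
    return n_train, n_valid, effective_batch


print(split_sizes(1000))
print(split_sizes(10))
```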

early_stopbool, optional

Specifies whether to use the automatic early stopping method or not.

Defaults to True (i.e. use automatic early stopping).

normalizationstr, optional

Specifies the normalization type for input data.

  • 'no' (no normalization)

  • 'z-transform'

  • 'scalar'

Defaults to 'no'.
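As an illustration of the 'z-transform' option, the standard z-score rescales a column to zero mean and unit standard deviation. The sketch below uses the population standard deviation, which is an assumption; this page does not state which variant PAL applies, and it does not define 'scalar', so that option is not sketched here.

```python
import statistics


def z_transform(values):
    """Standardize a list of numbers to zero mean and unit variance.

    Uses the population standard deviation (an assumption for this sketch).
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        # A constant column carries no scale information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


print(z_transform([100.0, 100.0, 11.0, 12.0, 25.0]))
```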

warmup_epochsint, optional

Specifies the minimum number of epochs to wait before executing the automatic early stopping method.

Defaults to 5.

patienceint, optional

Specifies the number of epochs to wait before terminating the training if no improvement is shown.

Defaults to 5.

save_best_modelbool, optional

Specifies whether to save the best model (with respect to the minimum loss on the validation set).

Defaults to False (i.e. save the model from the last training epoch, not the best one).
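The interplay of warmup_epochs, patience and save_best_model can be sketched with a small simulation over a list of validation losses. The function simulate_early_stopping is hypothetical and implements one common patience rule; the exact stopping criterion used by PAL is not specified on this page and may differ.

```python
def simulate_early_stopping(val_losses, warmup_epochs=5, patience=5,
                            save_best_model=False):
    """Return (stop_epoch, saved_epoch), both 1-based.

    Hypothetical sketch of a patience-based rule, not PAL's implementation.
    """
    best_loss = float('inf')
    best_epoch = 0
    epochs_without_improvement = 0
    stop_epoch = len(val_losses)
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        # The stopping check is only applied once the warm-up phase is over.
        if epoch > warmup_epochs and epochs_without_improvement >= patience:
            stop_epoch = epoch
            break
    # Keep either the best epoch or the last epoch that was trained.
    saved_epoch = best_epoch if save_best_model else stop_epoch
    return stop_epoch, saved_epoch


losses = [1.0, 0.8, 0.6, 0.5, 0.55, 0.56, 0.57, 0.58, 0.59, 0.6]
print(simulate_early_stopping(losses))
print(simulate_early_stopping(losses, save_best_model=True))
```

Under this rule, the two calls stop at the same epoch and differ only in which epoch's model is kept.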

Examples

>>> train_data.collect()
   ID          X1          X2          X3          Y1    Y2
0   0         1.0        10.0       100.0         1.0     1
1   1         1.1        10.1       100.0         1.1     1
2   2         2.2        20.2        11.0        10.0     2
3   3         2.3        20.4        12.0        10.1     2
4   4         2.2        20.3        25.0        10.2     1
.   .         ...         ...         ...          .      .
>>> mlp = MLPMultiTaskRegressor(hidden_layer_size=[5,5,5],
...                              activation='leaky-relu')
>>> mlp.fit(data=train_data, key='ID',
...         label=['Y1', 'Y2'])
Attributes:
model_DataFrame

The MLP model.

train_log_DataFrame

Provides the training errors across iterations.

stats_DataFrame

Names and values of statistics.

optim_param_DataFrame

Provides the optimal parameters selected.

Methods

fit([data, key, features, label, ...])

Fit function for Multi Task MLP (for regression).

predict([data, key, features, model])

Predict method for Multi Task MLP (for regression).

fit(data=None, key=None, features=None, label=None, categorical_variable=None, pre_model=None)

Fit function for Multi Task MLP (for regression).

Parameters:
dataDataFrame

DataFrame containing the data.

keystr, optional

Name of the ID column.

If key is not provided, then:

  • if data is indexed by a single column, then key defaults to that index column;

  • otherwise, it is assumed that data contains no ID column.

featuresa list of str, optional

Names of the feature columns.

If features is not provided, it defaults to all the non-ID, non-label columns.

labelstr or a list of str, optional

Name(s) of the target column(s).

If not provided, it defaults to the last non-ID column.

categorical_variablestr or a list of str, optional

Specifies which INTEGER columns should be treated as categorical, with all other INTEGER columns treated as continuous.

No default value.

pre_modelDataFrame, optional

Specifies the pre-model for training.

Defaults to None (i.e. train the model from scratch).

Returns:
A fitted object of class "MLPMultiTaskRegressor".
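The documented defaults for features and label can be mimicked in plain Python to see which columns end up in which role. The helper resolve_columns is hypothetical and works on a plain list of column names; it does not model the case where key is inferred from the DataFrame index.

```python
def resolve_columns(columns, key=None, features=None, label=None):
    """Return (features, label) following the defaults described above.

    Hypothetical illustration on an ordered list of column names.
    """
    non_id = [c for c in columns if c != key]
    if label is None:
        # Documented default: the last non-ID column.
        label = [non_id[-1]]
    elif isinstance(label, str):
        label = [label]
    if features is None:
        # Documented default: all non-ID, non-label columns.
        features = [c for c in non_id if c not in label]
    return list(features), list(label)


cols = ['ID', 'X1', 'X2', 'X3', 'Y1', 'Y2']
print(resolve_columns(cols, key='ID', label=['Y1', 'Y2']))
print(resolve_columns(cols, key='ID'))
```

In the second call only the last column is taken as the label, so Y1 lands among the features; for a multi-task fit it therefore seems safer to pass label explicitly.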
predict(data=None, key=None, features=None, model=None)

Predict method for Multi Task MLP (for regression).

Parameters:
dataDataFrame

DataFrame containing the data.

keystr, optional

Name of the ID column.

Mandatory if data is not indexed, or the index of data contains multiple columns.

Defaults to the single index column of data if not provided.

featuresa list of str, optional

Names of the feature columns. If features is not provided, it defaults to all the non-ID, non-label columns.

Returns:
DataFrame

Prediction result.

Inherited Methods from PALBase

Besides the methods mentioned above, the MLPMultiTaskRegressor class also inherits methods from the PALBase class. Please refer to PAL Base for more details.