MLPMultiTaskRegressor
- class hana_ml.algorithms.pal.neural_network.MLPMultiTaskRegressor(hidden_layer_size, activation=None, batch_size=None, num_epochs=None, random_state=None, use_batchnorm=None, learning_rate=None, optimizer=None, dropout_prob=None, training_percentage=None, early_stop=None, normalization=None, warmup_epochs=None, patience=None, save_best_model=None, training_style=None, network_type=None, embedded_num=None, residual_num=None, finetune=None)
MLP Multi Task Regressor.
- Parameters:
- hidden_layer_size : list (tuple) of int, optional
Specifies the sizes of all hidden layers in the neural network.
Mandatory and valid only when network_type is 'basic' and finetune is not True.
- activation : str, optional
Specifies the activation function for the hidden layer.
Valid activation functions include:
'sigmoid'
'tanh'
'relu'
'leaky-relu'
'elu'
'gelu'
Defaults to 'relu'.
- batch_size : int, optional
Specifies the number of training samples in a batch.
Defaults to 16 (if the input data contains fewer than 16 samples, the size of the input data is used).
- num_epochs : int, optional
Specifies the maximum number of training epochs.
Defaults to 100.
- random_state : int, optional
Specifies the seed for random number generation. The system time is used when 0 is specified.
Defaults to 0.
- use_batchnorm : bool, optional
Specifies whether to use batch-normalization in each hidden layer or not.
Defaults to True (i.e. use batch-normalization).
- learning_rate : float, optional
Specifies the learning rate for gradient-based optimizers.
Defaults to 0.001.
- optimizer : str, optional
Specifies the optimizer for training the neural network. Valid options include:
'sgd'
'rmsprop'
'adam'
'adagrad'
Defaults to 'adam'.
- dropout_prob : float, optional
Specifies the dropout probability applied when training the neural network.
Defaults to 0.0 (i.e. no dropout).
- training_percentage : float, optional
Specifies the percentage of input data used for training (with the rest of the input data used for validation).
Defaults to 0.9.
- early_stop : bool, optional
Specifies whether to use the automatic early stopping method or not.
Defaults to True (i.e. use automatic early stopping).
- normalization : str, optional
Specifies the normalization type for input data.
'no' (no normalization)
'z-transform'
'scalar'
Defaults to 'no'.
- warmup_epochs : int, optional
Specifies the minimum number of epochs to wait before executing the automatic early stopping method.
Defaults to 5.
- patience : int, optional
Specifies the number of epochs to wait before terminating the training if no improvement is shown.
Defaults to 5.
- save_best_model : bool, optional
Specifies whether to save the best model (with respect to the minimum loss on the validation set).
Defaults to False (i.e. save the model from the last training epoch, not the best one).
- training_style : {'batch', 'stochastic'}, optional
Specifies the training style of the learning algorithm, either in batch mode or in stochastic mode.
'batch' : This approach uses the entire training dataset to update model parameters, where the L-BFGS-B optimizer is adopted. It can be stable but memory-intensive.
'stochastic' : This approach updates parameters with individual samples based on gradient descent. While potentially less stable, it often leads to better generalization.
Defaults to 'stochastic'.
- network_type : {'basic', 'resnet'}, optional
Specifies the structure of the underlying neural network to train. It can be a basic neural network, or a neural network composed of residual blocks, i.e. ResNet.
Defaults to 'basic' (corresponding to a basic neural network).
- embedded_num : int, optional
Specifies the embedding dimension of ResNet for the input data, which equals the dimension of the first linear layer in ResNet.
Mandatory and valid only when network_type is 'resnet' and finetune is not True.
- residual_num : int, optional
Specifies the number of residual blocks in ResNet.
Mandatory and valid only when network_type is 'resnet' and finetune is not True.
- finetune : bool, optional
Specifies the task type of the initialized class, i.e. whether it is used to fine-tune an existing pre-trained model, or to train a new model from scratch given the input data.
Defaults to False.
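The parameters early_stop, warmup_epochs, patience, and save_best_model work together during training. The sketch below is a plain-Python emulation of this early-stopping logic with made-up validation losses, not a call into PAL; the function name and loss values are illustrative assumptions:

```python
def early_stop_epoch(val_losses, warmup_epochs=5, patience=5):
    """Return the (0-based) epoch at which training would stop early,
    or None if training runs through all epochs.

    Early stopping only takes effect after `warmup_epochs` epochs;
    training stops once `patience` consecutive epochs show no
    improvement over the best validation loss seen so far.
    """
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss   # this is the epoch save_best_model would keep
            stale = 0
        else:
            stale += 1
        if epoch >= warmup_epochs and stale >= patience:
            return epoch
    return None

# Hypothetical losses: improvement for 4 epochs, then a plateau.
losses = [1.0, 0.8, 0.6, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
print(early_stop_epoch(losses, warmup_epochs=5, patience=5))  # → 8
```

With patience=5, training stops at epoch 8, the fifth consecutive epoch without improvement; save_best_model=True would keep the model from epoch 3 rather than epoch 8.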
Examples
>>> train_data.collect()
   ID   X1    X2     X3    Y1  Y2
0   0  1.0  10.0  100.0   1.0   1
1   1  1.1  10.1  100.0   1.1   1
2   2  2.2  20.2   11.0  10.0   2
3   3  2.3  20.4   12.0  10.1   2
4   4  2.2  20.3   25.0  10.2   1
.   .  ...   ...    ...   ...  ..
>>> mlp = MLPMultiTaskRegressor(hidden_layer_size=[5,5,5],
...                             activation='leaky-relu')
>>> mlp.fit(data=train_data, key='ID',
...         label=['Y1', 'Y2'])
- Attributes:
- model_ : DataFrame
The MLP model.
- train_log_ : DataFrame
Provides training errors among iterations.
- stats_ : DataFrame
Names and values of statistics.
- optim_param_ : DataFrame
Provides the optimal parameters selected.
Methods
create_model_state([model, function, ...]) : Create PAL model state.
delete_model_state([state]) : Delete PAL model state.
fit([data, key, features, label, ...]) : Fit function for Multi Task MLP (for regression).
predict([data, key, features, model]) : Predict method for Multi Task MLP (for regression).
set_model_state(state) : Set the model state by state information.
- create_model_state(model=None, function=None, pal_funcname='PAL_MLP_MULTI_TASK', state_description=None, force=False)
Create PAL model state.
- Parameters:
- model : DataFrame, optional
Specify the model for AFL state.
Defaults to self.model_.
- function : str, optional
Specify the function in the unified API.
A placeholder parameter, not effective for MultiTask MLP.
- pal_funcname : int or str, optional
PAL function name.
Defaults to 'PAL_MLP_MULTI_TASK'.
- state_description : str, optional
Description of the state as model container.
Defaults to None.
- force : bool, optional
If True, the existing state will be deleted.
Defaults to False.
- delete_model_state(state=None)
Delete PAL model state.
- Parameters:
- state : DataFrame, optional
Specifies the state.
Defaults to self.state.
- set_model_state(state)
Set the model state by state information.
- Parameters:
- state: DataFrame or dict
If state is DataFrame, it has the following structure:
NAME: VARCHAR(100), it must have STATE_ID, HINT, HOST and PORT.
VALUE: VARCHAR(1000), the values according to NAME.
If state is dict, the key must have STATE_ID, HINT, HOST and PORT.
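A minimal sketch of the dict form, assuming placeholder values; only the four key names come from the description above, and everything else is illustrative:

```python
# Hypothetical state information in dict form. The four keys below are
# required by set_model_state; the values are placeholders, not a real state.
state = {
    "STATE_ID": "A1B2C3",         # identifier of the model state
    "HINT": "MLP_MULTI_TASK",     # hint describing the state content
    "HOST": "hana-host.example",  # host holding the state
    "PORT": "30015",              # port of that host
}

# Mirror of the documented requirement on the keys:
required = {"STATE_ID", "HINT", "HOST", "PORT"}
assert required <= state.keys()
```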
- fit(data=None, key=None, features=None, label=None, categorical_variable=None, pre_model=None)
Fit function for Multi Task MLP (for regression).
- Parameters:
- data : DataFrame
DataFrame containing the data.
Note that if finetune is set as True when the class is initialized, then data must be structured the same as the data used for training the model stored in pre_model.
- key : str, optional
Name of the ID column.
If key is not provided, then:
if data is indexed by a single column, key defaults to that index column;
otherwise, it is assumed that data contains no ID column.
- features : a list of str, optional
Names of the feature columns.
If features is not provided, it defaults to all the non-ID, non-label columns.
- label : str or a list of str, optional
Names of the target columns.
If not provided, it defaults to the last non-ID column.
- categorical_variable : str or a list of str, optional
Specifies which INTEGER columns should be treated as categorical, with all other INTEGER columns treated as continuous.
No default value.
- pre_model : DataFrame, optional
Specifies the pre-model for online/continued training.
Mandatory and valid only if finetune is set as True when the class is initialized.
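The default resolution of key, features, and label described above can be emulated in a few lines of plain Python; the helper below is a hypothetical simplification for illustration, not part of hana_ml:

```python
def resolve_columns(columns, key=None, label=None):
    """Emulate the documented defaults: if `label` is not given it is the
    last non-ID column; `features` defaults to all non-ID, non-label columns."""
    non_id = [c for c in columns if c != key]
    if label is None:
        label = [non_id[-1]]
    elif isinstance(label, str):
        label = [label]
    features = [c for c in non_id if c not in label]
    return features, label

# Columns as in the example above: ID, features X1..X3, targets Y1, Y2.
cols = ["ID", "X1", "X2", "X3", "Y1", "Y2"]
print(resolve_columns(cols, key="ID", label=["Y1", "Y2"]))
# → (['X1', 'X2', 'X3'], ['Y1', 'Y2'])
print(resolve_columns(cols, key="ID"))
# → (['X1', 'X2', 'X3', 'Y1'], ['Y2'])
```

The second call shows why label should be given explicitly for multi-task training: with no label, only the last non-ID column would be treated as the target.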
- Returns:
- A fitted object of class "MLPMultiTaskRegressor".
- predict(data=None, key=None, features=None, model=None)
Predict method for Multi Task MLP (for regression).
- Parameters:
- data : DataFrame
DataFrame containing the data.
- key : str, optional
Name of the ID column.
Mandatory if data is not indexed, or if the index of data contains multiple columns.
Defaults to the single index column of data if not provided.
- features : a list of str, optional
Names of the feature columns.
If features is not provided, it defaults to all the non-ID columns.
- Returns:
- DataFrame
Predict result.
Inherited Methods from PALBase
Besides the methods mentioned above, the MLPMultiTaskRegressor class also inherits methods from the PALBase class; please refer to PAL Base for more details.