MLPMultiTaskClassifier

class hana_ml.algorithms.pal.neural_network.MLPMultiTaskClassifier(hidden_layer_size=None, activation=None, batch_size=None, num_epochs=None, random_state=None, use_batchnorm=None, learning_rate=None, optimizer=None, dropout_prob=None, training_percentage=None, early_stop=None, normalization=None, warmup_epochs=None, patience=None, save_best_model=None, training_style=None, network_type=None, embedded_num=None, residual_num=None, finetune=None, resampling_method=None, evaluation_metric=None, fold_num=None, repeat_times=None, param_search_strategy=None, random_search_times=None, timeout=None, progress_indicator_id=None, reduction_rate=None, aggressive_elimination=None, param_range=None, param_values=None)

Multi Task MLP Classifier.

Parameters
hidden_layer_size : list (tuple) of int, optional

Specifies the sizes of all hidden layers in the neural network.

Mandatory and valid only when network_type is 'basic' and finetune is not True.

activation : str, optional

Specifies the activation function for the hidden layer.

Valid activation functions include:

  • 'sigmoid'

  • 'tanh'

  • 'relu'

  • 'leaky-relu'

  • 'elu'

  • 'gelu'

Defaults to 'relu'.

batch_size : int, optional

Specifies the number of training samples in a batch.

Defaults to 16 (if the input data contains fewer than 16 samples, the size of the input data is used).

num_epochs : int, optional

Specifies the maximum number of training epochs.

Defaults to 100.

random_state : int, optional

Specifies the seed for random number generation. The system time is used when 0 is specified.

Defaults to 0.

use_batchnorm : bool, optional

Specifies whether to use batch-normalization in each hidden layer or not.

Defaults to True (i.e. use batch-normalization).

learning_rate : float, optional

Specifies the learning rate for gradient based optimizers.

Defaults to 0.001.

optimizer : str, optional

Specifies the optimizer for training the neural network.

  • 'sgd'

  • 'rmsprop'

  • 'adam'

  • 'adagrad'

Defaults to 'adam'.

dropout_prob : float, optional

Specifies the dropout probability applied when training the neural network.

Defaults to 0.0 (i.e. no dropout).

training_percentage : float, optional

Specifies the percentage of the input data used for training (the rest is used for validation).

Defaults to 0.9.

early_stop : bool, optional

Specifies whether to use the automatic early stopping method or not.

Defaults to True (i.e. use automatic early stopping).

normalization : str, optional

Specifies the normalization type for input data.

  • 'no' (no normalization)

  • 'z-transform'

  • 'scalar'

Defaults to 'no'.

warmup_epochs : int, optional

Specifies the minimum number of epochs to wait before the automatic early stopping method is executed.

Defaults to 5.

patience : int, optional

Specifies the number of epochs to wait before terminating the training if no improvement is shown.

Defaults to 5.
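The interaction of warmup_epochs and patience can be sketched in plain Python. This is only an illustration of the documented behavior (no stopping during warm-up, then stop after `patience` epochs without validation improvement); `should_stop` is a hypothetical helper, not PAL's implementation.

```python
# Illustrative sketch (not PAL's exact logic) of early stopping with
# warmup_epochs and patience: never stop before warmup_epochs, and stop
# once the validation loss has not improved for `patience` epochs.
def should_stop(val_losses, warmup_epochs=5, patience=5):
    epoch = len(val_losses)
    if epoch <= warmup_epochs:
        return False                     # still in the warm-up phase
    best_epoch = min(range(epoch), key=lambda i: val_losses[i])
    return (epoch - 1) - best_epoch >= patience

losses = [1.0, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7]
# best loss at epoch index 2; no improvement for 7 epochs afterwards
print(should_stop(losses))  # True
```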

save_best_model : bool, optional

Specifies whether to save the best model (i.e. the one with the minimum loss on the validation set).

Defaults to False (i.e. save the model from the last training epoch, not the best one).

training_style : {'batch', 'stochastic'}, optional

Specifies the training style of the learning algorithm, either in batch mode or in stochastic mode.

  • 'batch' : This approach uses the entire training dataset to update the model parameters, with the L-BFGS-B optimizer adopted. It can be stable but memory-intensive.

  • 'stochastic' : This approach updates the parameters with individual samples.

    While potentially less stable, it often leads to better generalization.

Defaults to 'stochastic'.

network_type : {'basic', 'resnet'}, optional

Specifies the structure of the underlying neural network to train. It can be a basic neural network, or a network composed of residual blocks, i.e. a ResNet.

Defaults to 'basic' (corresponding to a basic neural network).

embedded_num : int, optional

Specifies the embedding dimension of the ResNet for the input data, which equals the dimension of the first linear layer in the ResNet.

Mandatory and valid when network_type is 'resnet' and finetune is not True.

residual_num : int, optional

Specifies the number of residual blocks in ResNet.

Mandatory and valid when network_type is 'resnet' and finetune is not True.

finetune : bool, optional

Specifies the task type of the initialized class, i.e. whether it is used to fine-tune an existing pre-trained model, or to train a new model from scratch on the input data.

Defaults to False.

resampling_method : str, optional

Specifies the resampling method for model evaluation or parameter selection.

Valid options include: 'cv', 'stratified_cv', 'bootstrap', 'stratified_bootstrap', 'cv_sha', 'stratified_cv_sha', 'bootstrap_sha', 'stratified_bootstrap_sha', 'cv_hyperband', 'stratified_cv_hyperband', 'bootstrap_hyperband', 'stratified_bootstrap_hyperband'.

If no value is specified for this parameter, neither model evaluation nor parameter selection is activated.

No default value.

Note

A resampling method with suffix 'sha' or 'hyperband' is used for parameter selection only, not for model evaluation.

evaluation_metric : {'accuracy', 'f1_score', 'auc_1vsrest', 'auc_pairwise'}, optional

Specifies the evaluation metric for model evaluation or parameter selection.

Must be specified together with resampling_method to activate model evaluation or parameter selection.

No default value.

fold_num : int, optional

Specifies the fold number for the cross-validation.

Mandatory and valid only when resampling_method is specified to be one of the following: 'cv', 'stratified_cv', 'cv_sha', 'stratified_cv_sha', 'cv_hyperband', 'stratified_cv_hyperband'.

repeat_times : int, optional

Specifies the number of repeat times for resampling.

Defaults to 1.

param_search_strategy : {'grid', 'random'}, optional

Specifies the method for parameter selection.

  • mandatory if resampling_method is specified with suffix 'sha'

  • defaults to 'random' and cannot be changed if resampling_method is specified with suffix 'hyperband'

  • otherwise no default value, and parameter selection will not be activated if not specified

random_search_times : int, optional

Specifies the number of times to randomly select candidate parameters.

Mandatory and valid only when param_search_strategy is set to 'random'.

timeout : int, optional

Specifies the maximum running time for model evaluation/parameter selection, in seconds.

No timeout when 0 is specified.

Defaults to 0.

progress_indicator_id : str, optional

Sets an ID of progress indicator for model evaluation/parameter selection.

If not provided, no progress indicator is activated.

param_values : dict or list of tuples, optional

Specifies the values of the following parameters for model parameter selection:

hidden_layer_size, residual_num, embedded_dim, activation, learning_rate, optimizer, dropout_prob.

If the input is a list of tuples, then each tuple must contain exactly two elements:

  • the 1st element is the parameter name (str type),

  • the 2nd element is a list of valid values for that parameter.

Otherwise, if the input is a dict, then for each element the key must be a parameter name, while the value must be a list of valid values for that parameter.

A simple example for illustration:

[('learning_rate', [0.1, 0.2, 0.5]), ('hidden_layer_size', [[10, 10], [100]])],

or

dict(learning_rate=[0.1, 0.2, 0.5], hidden_layer_size=[[10, 10], [100]]).

Valid only when resampling_method and param_search_strategy are both specified, and training_style is 'stochastic'.
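As a quick check that the two forms in the example above carry the same information, the list of tuples converts directly into the dict form:

```python
# The two equivalent ways of writing the param_values example above:
param_values_list = [('learning_rate', [0.1, 0.2, 0.5]),
                     ('hidden_layer_size', [[10, 10], [100]])]
param_values_dict = dict(learning_rate=[0.1, 0.2, 0.5],
                         hidden_layer_size=[[10, 10], [100]])

# Converting the list of tuples yields exactly the dict form.
assert dict(param_values_list) == param_values_dict
```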

param_range : dict or list of tuples, optional

Sets the range of the following parameters for model parameter selection:

residual_num, embedded_dim, learning_rate, dropout_prob

If the input is a list of tuples, then each tuple should contain exactly two elements:

  • the 1st element is the parameter name (str type),

  • the 2nd element is a list that specifies the range of that parameter as follows: the first value is the start value, the second the step, and the third the end value. The step value can be omitted, and is ignored, if param_search_strategy is set to 'random'.

Otherwise, if the input is a dict, then for each element the key should be the parameter name, while the value specifies the range of that parameter.

Valid only when resampling_method and param_search_strategy are both specified and training_style is 'stochastic'.
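The [start, step, end] triple can be illustrated with a small sketch. The helper `expand_range` and the values below are hypothetical, and PAL's exact expansion semantics may differ; this only shows the shape of the triple under a 'grid' strategy.

```python
# Illustrative expansion of a [start, step, end] range triple as used
# by param_range under a 'grid' search strategy (sketch only; PAL's
# exact semantics may differ).
def expand_range(start, step, end):
    values, v = [], start
    while v <= end + 1e-12:          # tolerate float rounding
        values.append(round(v, 10))
        v += step
    return values

param_range = {'learning_rate': [0.01, 0.01, 0.05]}  # hypothetical values
print(expand_range(*param_range['learning_rate']))
# [0.01, 0.02, 0.03, 0.04, 0.05]
```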

reduction_rate : float, optional

Specifies the reduction rate in the SHA or Hyperband method.

For each round, the available parameter candidate size is divided by the value of this parameter, so a valid value must be greater than 1.0.

Valid only when resampling_method is specified with suffix 'sha' or 'hyperband' (e.g. 'cv_sha', 'stratified_bootstrap_hyperband').

Defaults to 3.0.
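The per-round division by reduction_rate can be sketched as follows. The helper `sha_rounds` is an illustration of successive-halving style elimination, not PAL's exact schedule.

```python
import math

# Sketch of successive-halving style elimination: with reduction rate r,
# the surviving candidate count is divided by r each round until one
# candidate remains (illustration only; PAL's rounding may differ).
def sha_rounds(n_candidates, reduction_rate=3.0):
    counts = [n_candidates]
    while counts[-1] > 1:
        counts.append(max(1, math.floor(counts[-1] / reduction_rate)))
    return counts

print(sha_rounds(27))  # [27, 9, 3, 1] with the default rate of 3.0
```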

aggressive_elimination : bool, optional

Specifies whether to apply aggressive elimination when using the SHA method.

Aggressive elimination happens when the data size and the size of the parameter search space do not match: the data size has reached its upper limit while many parameter candidates remain to be searched. If aggressive elimination is applied, the lower bound of the data size limit is used multiple times first to reduce the number of candidates.

Valid only when resampling_method is specified with suffix 'sha'.

Defaults to False.

Attributes
model_ : DataFrame

The MLP model.

train_log_ : DataFrame

Provides training errors across iterations.

stats_ : DataFrame

Names and values of statistics.

optim_param_ : DataFrame

Provides the optimal parameters selected.

Methods

create_model_state([model, function, ...])

Create PAL model state.

delete_model_state([state])

Delete PAL model state.

fit([data, key, features, label, ...])

Fit function for Multi Task MLP (for classification).

predict([data, key, features, verbose, model])

Predict method for Multi Task MLP (for classification).

set_model_state(state)

Set the model state by state information.

Examples

>>> train_data.collect()
   ID          X1          X2          X3         Y1     Y2
0   0         1.0        10.0       100.0          A      1
1   1         1.1        10.1       100.0          A      1
2   2         2.2        20.2        11.0          B      2
3   3         2.3        20.4        12.0          B      2
4   4         2.2        20.3        25.0          B      1
.   .         ...         ...         ...          .      .
>>> mlp = MLPMultiTaskClassifier(hidden_layer_size=[5,5,5],
...                              activation='tanh')
>>> mlp.fit(data=train_data, key='ID',
...         label=['Y1', 'Y2'])
create_model_state(model=None, function=None, pal_funcname='PAL_MLP_MULTI_TASK', state_description=None, force=False)

Create PAL model state.

Parameters
model : DataFrame, optional

Specify the model for AFL state.

Defaults to self.model_.

function : str, optional

Specify the function in the unified API.

A placeholder parameter, not effective for MultiTask MLP.

pal_funcname : int or str, optional

PAL function name.

Defaults to 'PAL_MLP_MULTI_TASK'.

state_description : str, optional

Description of the state as model container.

Defaults to None.

force : bool, optional

If True, it will delete the existing state.

Defaults to False.

delete_model_state(state=None)

Delete PAL model state.

Parameters
state : DataFrame, optional

Specifies the state.

Defaults to self.state.

set_model_state(state)

Set the model state by state information.

Parameters
state: DataFrame or dict

If state is DataFrame, it has the following structure:

  • NAME: VARCHAR(100), it must have STATE_ID, HINT, HOST and PORT.

  • VALUE: VARCHAR(1000), the values according to NAME.

If state is dict, the key must have STATE_ID, HINT, HOST and PORT.
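A hypothetical sketch of the dict form, with placeholder values only (a real state carries values produced on the HANA side, e.g. by create_model_state):

```python
# Shape of the dict form accepted by set_model_state: the required keys
# per the description above, with placeholder values (illustration only).
state = {
    'STATE_ID': '...',   # placeholder: a real AFL state ID
    'HINT': '...',
    'HOST': '...',
    'PORT': '...',
}
required = {'STATE_ID', 'HINT', 'HOST', 'PORT'}
assert required.issubset(state)
```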

fit(data=None, key=None, features=None, label=None, categorical_variable=None, pre_model=None, model_table_name=None)

Fit function for Multi Task MLP (for classification).

Parameters
data : DataFrame

DataFrame containing the data.

Note that if finetune is set to True when the class is initialized, then data must be structured the same as the data used to train the model stored in pre_model.

key : str, optional

Name of the ID column.

If key is not provided, then:

  • if data is indexed by a single column, then key defaults to that index column;

  • otherwise, it is assumed that data contains no ID column.

features : a list of str, optional

Names of the feature columns.

If features is not provided, it defaults to all the non-ID, non-label columns.

label : str or a list of str, optional

Name(s) of the target column(s).

If not provided, it defaults to the last non-ID column.

categorical_variable : str or a list of str, optional

Specifies which INTEGER columns should be treated as categorical, with all other INTEGER columns treated as continuous.

No default value.

pre_model : DataFrame, optional

Specifies the pre-model for online/continued training.

Mandatory and valid only if finetune is set to True when the class is initialized.

model_table_name : str, optional

Specifies the name of the model table.

Defaults to None.

Returns
A fitted object of class "MLPMultiTaskClassifier".
predict(data=None, key=None, features=None, verbose=None, model=None)

Predict method for Multi Task MLP (for classification).

Parameters
data : DataFrame

DataFrame containing the data for prediction purpose.

key : str, optional

Name of the ID column.

Mandatory if data is not indexed, or the index of data contains multiple columns.

Defaults to the single index column of data if not provided.

features : a list of str, optional

Names of the feature columns. If features is not provided, it defaults to all the non-ID, non-label columns.

verbose : bool, optional

If True, output scoring probabilities for each class.

Defaults to False.

Returns
DataFrame

Predict result.

Inherited Methods from PALBase

Besides the methods mentioned above, the MLPMultiTaskClassifier class also inherits methods from the PALBase class; please refer to PAL Base for more details.