MLPClassifier
- class hana_ml.algorithms.pal.neural_network.MLPClassifier(activation=None, activation_options=None, output_activation=None, output_activation_options=None, hidden_layer_size=None, hidden_layer_size_options=None, max_iter=None, training_style=None, learning_rate=None, momentum=None, batch_size=None, normalization=None, weight_init=None, categorical_variable=None, resampling_method=None, evaluation_metric=None, fold_num=None, repeat_times=None, search_strategy=None, random_search_times=None, random_state=None, timeout=None, progress_indicator_id=None, param_values=None, param_range=None, thread_ratio=None, reduction_rate=None, aggressive_elimination=None)
Multi-layer perceptron (MLP) Classifier.
- Parameters
- activation : str
Specifies the activation function for the hidden layer.
- Valid activation functions include:
'tanh',
'linear',
'sigmoid_asymmetric',
'sigmoid_symmetric',
'gaussian_asymmetric',
'gaussian_symmetric',
'elliot_asymmetric',
'elliot_symmetric',
'sin_asymmetric',
'sin_symmetric',
'cos_asymmetric',
'cos_symmetric',
'relu'
Required unless activation_options is provided.
- activation_options : list of str, optional
A list of activation functions for parameter selection.
See activation for the full set of valid activation functions.
- output_activation : str
Specifies the activation function for the output layer.
Valid activation functions are the same as those for activation.
Required unless output_activation_options is provided.
- output_activation_options : list of str, optional
A list of activation functions for the output layer for parameter selection.
See activation for the full set of activation functions.
- hidden_layer_size : list of int or tuple of int
Sizes of all hidden layers.
Required unless hidden_layer_size_options is provided.
- hidden_layer_size_options : list of tuples, optional
A list of candidate hidden-layer size configurations for parameter selection.
- max_iter : int, optional
Maximum number of iterations.
Defaults to 100.
- training_style : {'batch', 'stochastic'}, optional
Specifies the training style.
Defaults to 'stochastic'.
- learning_rate : float, optional
Specifies the learning rate. Mandatory and valid only when training_style is 'stochastic'.
- momentum : float, optional
Specifies the momentum for gradient descent update. Mandatory and valid only when training_style is 'stochastic'.
- batch_size : int, optional
Specifies the size of mini batch. Valid only when training_style is 'stochastic'.
Defaults to 1.
- normalization : {'no', 'z-transform', 'scalar'}, optional
Defaults to 'no'.
- weight_init : {'all-zeros', 'normal', 'uniform', 'variance-scale-normal', 'variance-scale-uniform'}, optional
Specifies the weight initialization method.
Defaults to 'all-zeros'.
- categorical_variable : str or list of str, optional
Specifies the name(s) of the column(s) in the data table to be treated as categorical variables.
Valid only for columns of INTEGER type.
- thread_ratio : float, optional
Controls the proportion of available threads to use for training.
The value range is from 0 to 1, where 0 indicates a single thread, and 1 indicates up to all available threads.
Values between 0 and 1 will use that percentage of available threads.
Values outside this range tell PAL to heuristically determine the number of threads to use.
Defaults to 0.
- resampling_method : str, optional
Specifies the resampling method for model evaluation or parameter selection.
Valid options include: 'cv', 'stratified_cv', 'bootstrap', 'stratified_bootstrap', 'cv_sha', 'stratified_cv_sha', 'bootstrap_sha', 'stratified_bootstrap_sha', 'cv_hyperband', 'stratified_cv_hyperband', 'bootstrap_hyperband', 'stratified_bootstrap_hyperband'.
If no value is specified for this parameter, neither model evaluation nor parameter selection is activated.
No default value.
Note
Resampling methods with the suffix 'sha' or 'hyperband' are used for parameter selection only, not for model evaluation.
- evaluation_metric : {'accuracy', 'f1_score', 'auc_onevsrest', 'auc_pairwise'}, optional
Specifies the evaluation metric for model evaluation or parameter selection.
Must be specified together with resampling_method to activate model evaluation or parameter selection.
No default value.
- fold_num : int, optional
Specifies the fold number for the cross-validation.
Mandatory and valid only when resampling_method is specified to be one of the following: 'cv', 'stratified_cv', 'cv_sha', 'stratified_cv_sha', 'cv_hyperband', 'stratified_cv_hyperband'.
- repeat_times : int, optional
Specifies the number of repeat times for resampling.
Defaults to 1.
- search_strategy : {'grid', 'random'}, optional
Specifies the method for parameter selection.
Mandatory if resampling_method is specified with suffix 'sha'.
Defaults to 'random' and cannot be changed if resampling_method is specified with suffix 'hyperband'.
Otherwise there is no default value, and parameter selection is not activated unless this parameter is specified.
- random_search_times : int, optional
Specifies the number of times to randomly select candidate parameters.
Mandatory and valid only when search_strategy is set to 'random'.
- random_state : int, optional
Specifies the seed for random generation.
When 0 is specified, system time is used.
Defaults to 0.
- timeout : int, optional
Specifies maximum running time for model evaluation/parameter selection, in seconds.
No timeout when 0 is specified.
Defaults to 0.
- progress_indicator_id : str, optional
Sets an ID of progress indicator for model evaluation/parameter selection.
If not provided, no progress indicator is activated.
- param_values : dict or list of tuples, optional
Specifies the values of the following parameters for model parameter selection: learning_rate, momentum, batch_size.
If input is a list of tuples, then each tuple must contain exactly two elements:
1st element is the parameter name (str type),
2nd element is a list of valid values for that parameter.
If input is a dict, then for each element the key must be the parameter name, while the value must be a list of valid values for the corresponding parameter.
A simple example for illustration:
[('learning_rate', [0.1, 0.2, 0.5]), ('momentum', [0.2, 0.6])],
or
dict(learning_rate=[0.1, 0.2, 0.5], momentum=[0.2, 0.6]).
Valid only when resampling_method and search_strategy are both specified, and training_style is 'stochastic'.
- param_range : dict or list of tuples, optional
Specifies the range of the following parameters for model parameter selection: learning_rate, momentum, batch_size.
If input is a list of tuples, then each tuple should contain exactly two elements:
1st element is the parameter name (str type),
2nd element is a list that specifies the range of that parameter as follows: first value is the start value, second value is the step, and third value is the end value. If search_strategy is set to 'random', the step value can be omitted and is ignored if given.
Otherwise, if input is a dict, then for each element the key should be the parameter name, while the value specifies the range of that parameter.
Valid only when resampling_method and search_strategy are both specified, and training_style is 'stochastic'.
- reduction_rate : float, optional
Specifies the reduction rate in the SHA or Hyperband method.
In each round, the number of remaining parameter candidates is divided by the value of this parameter, so the value must be greater than 1.0.
Valid only when resampling_method is specified with suffix 'sha' or 'hyperband' (e.g. 'cv_sha', 'stratified_bootstrap_hyperband').
Defaults to 3.0.
- aggressive_elimination : bool, optional
Specifies whether to apply aggressive elimination while using the SHA method.
Aggressive elimination applies when the data size and the number of parameter candidates to be searched do not match, so that many candidates still remain when the data size reaches its upper limit. If aggressive elimination is applied, the lower bound of the data size is used multiple times first to reduce the number of candidates.
Valid only when resampling_method is specified with suffix 'sha'.
Defaults to False.
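The effect of reduction_rate on the number of surviving candidates can be illustrated with a small, self-contained sketch. It is not PAL's implementation and does not call hana_ml: the function name sha_candidate_schedule is hypothetical, and rounding down while always keeping at least one candidate is an assumption made for illustration.

```python
import math


def sha_candidate_schedule(n_candidates, reduction_rate=3.0):
    """Return the assumed number of candidates kept in each round.

    Each round divides the remaining candidate count by reduction_rate
    (rounded down, never below 1; both choices are assumptions).
    """
    if reduction_rate <= 1.0:
        raise ValueError("reduction_rate must be greater than 1.0")
    if n_candidates < 1:
        raise ValueError("n_candidates must be at least 1")
    schedule = [n_candidates]
    while schedule[-1] > 1:
        remaining = math.floor(schedule[-1] / reduction_rate)
        schedule.append(max(1, remaining))
    return schedule


# 27 candidates with the default rate of 3.0 shrink to 9, then 3, then 1.
print(sha_candidate_schedule(27, 3.0))
```

A larger reduction_rate discards candidates faster, which shortens the search at the cost of judging more candidates on little data.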
Examples
Training data:
>>> df.collect()
   V000  V001 V002  V003 LABEL
0     1  1.71   AC     0    AA
1    10  1.78   CA     5    AB
2    17  2.36   AA     6    AA
3    12  3.15   AA     2     C
4     7  1.05   CA     3    AB
5     6  1.50   CA     2    AB
6     9  1.97   CA     6     C
7     5  1.26   AA     1    AA
8    12  2.13   AC     4     C
9    18  1.87   AC     6    AA
Training the model:
>>> from hana_ml.algorithms.pal.neural_network import MLPClassifier
>>> mlpc = MLPClassifier(hidden_layer_size=(10,10),
...                      activation='tanh', output_activation='tanh',
...                      learning_rate=0.001, momentum=0.0001,
...                      training_style='stochastic', max_iter=100,
...                      normalization='z-transform', weight_init='normal',
...                      thread_ratio=0.3, categorical_variable='V003')
>>> mlpc.fit(data=df)
Training result may look different from the following results due to model randomness.
>>> mlpc.model_.collect()
   ROW_INDEX                                      MODEL_CONTENT
0          1  {"CurrentVersion":"1.0","DataDictionary":[{"da...
1          2  t":0.2700182926188939},{"from":13,"weight":0.0...
2          3  ht":0.2414416413305134},{"from":21,"weight":0....
>>> mlpc.train_log_.collect()
    ITERATION     ERROR
0           1  1.080261
1           2  1.008358
2           3  0.947069
3           4  0.894585
4           5  0.849411
5           6  0.810309
6           7  0.776256
7           8  0.746413
8           9  0.720093
9          10  0.696737
10         11  0.675886
11         12  0.657166
12         13  0.640270
13         14  0.624943
14         15  0.609432
15         16  0.595204
16         17  0.582101
17         18  0.569990
18         19  0.558757
19         20  0.548305
20         21  0.538553
21         22  0.529429
22         23  0.521457
23         24  0.513893
24         25  0.506704
25         26  0.499861
26         27  0.493338
27         28  0.487111
28         29  0.481159
29         30  0.475462
..        ...       ...
70         71  0.349684
71         72  0.347798
72         73  0.345954
73         74  0.344071
74         75  0.342232
75         76  0.340597
76         77  0.338837
77         78  0.337236
78         79  0.335749
79         80  0.334296
80         81  0.332759
81         82  0.331255
82         83  0.329810
83         84  0.328367
84         85  0.326952
85         86  0.325566
86         87  0.324232
87         88  0.322899
88         89  0.321593
89         90  0.320242
90         91  0.318985
91         92  0.317840
92         93  0.316630
93         94  0.315376
94         95  0.314210
95         96  0.313066
96         97  0.312021
97         98  0.310916
98         99  0.309770
99        100  0.308704
Prediction:
>>> pred_df.collect()
>>> res, stat = mlpc.predict(data=pred_df, key='ID')
Prediction result may look different from the following results due to model randomness.
>>> res.collect()
   ID TARGET     VALUE
0   1      C  0.472751
1   2      C  0.417681
2   3      C  0.543967
>>> stat.collect()
   ID CLASS  SOFT_MAX
0   1    AA  0.371996
1   1    AB  0.155253
2   1     C  0.472751
3   2    AA  0.357822
4   2    AB  0.224496
5   2     C  0.417681
6   3    AA  0.349813
7   3    AB  0.106220
8   3     C  0.543967
Model Evaluation:
>>> mlpc = MLPClassifier(activation='tanh',
...                      output_activation='tanh',
...                      hidden_layer_size=(10,10),
...                      learning_rate=0.001,
...                      momentum=0.0001,
...                      training_style='stochastic',
...                      max_iter=100,
...                      normalization='z-transform',
...                      weight_init='normal',
...                      resampling_method='cv',
...                      evaluation_metric='f1_score',
...                      fold_num=10,
...                      repeat_times=2,
...                      random_state=1,
...                      progress_indicator_id='TEST',
...                      thread_ratio=0.3)
>>> mlpc.fit(data=df, label='LABEL', categorical_variable='V003')
Model evaluation result may look different from the following result due to randomness.
>>> mlpc.stats_.collect()
            STAT_NAME                                         STAT_VALUE
0             timeout                                              FALSE
1     TEST_1_F1_SCORE                       1, 0, 1, 1, 0, 1, 0, 1, 1, 0
2     TEST_2_F1_SCORE                       0, 0, 1, 1, 0, 1, 0, 1, 1, 1
3  TEST_F1_SCORE.MEAN                                                0.6
4   TEST_F1_SCORE.VAR                                           0.252631
5      EVAL_RESULTS_1  {"candidates":[{"TEST_F1_SCORE":[[1.0,0.0,1.0,...
6     solution status  Convergence not reached after maximum number o...
7               ERROR                                 0.2951168443145714
Parameter selection:
>>> act_opts = ['tanh', 'linear', 'sigmoid_asymmetric']
>>> out_act_opts = ['sigmoid_symmetric', 'gaussian_asymmetric', 'gaussian_symmetric']
>>> layer_size_opts = [(10, 10), (5, 5, 5)]
>>> mlpc = MLPClassifier(activation_options=act_opts,
...                      output_activation_options=out_act_opts,
...                      hidden_layer_size_options=layer_size_opts,
...                      learning_rate=0.001,
...                      batch_size=2,
...                      momentum=0.0001,
...                      training_style='stochastic',
...                      max_iter=100,
...                      normalization='z-transform',
...                      weight_init='normal',
...                      resampling_method='stratified_bootstrap',
...                      evaluation_metric='accuracy',
...                      search_strategy='grid',
...                      fold_num=10,
...                      repeat_times=2,
...                      random_state=1,
...                      progress_indicator_id='TEST',
...                      thread_ratio=0.3)
>>> mlpc.fit(data=df, label='LABEL', categorical_variable='V003')
Parameter selection result may look different from the following result due to randomness.
>>> mlpc.stats_.collect()
            STAT_NAME                                         STAT_VALUE
0             timeout                                              FALSE
1     TEST_1_ACCURACY                                               0.25
2     TEST_2_ACCURACY                                           0.666666
3  TEST_ACCURACY.MEAN                                           0.458333
4   TEST_ACCURACY.VAR                                          0.0868055
5      EVAL_RESULTS_1  {"candidates":[{"TEST_ACCURACY":[[0.50],[0.0]]...
6      EVAL_RESULTS_2  PUT_LAYER_ACTIVE_FUNC=6;HIDDEN_LAYER_ACTIVE_FU...
7      EVAL_RESULTS_3  FUNC=2;"},{"TEST_ACCURACY":[[0.50],[0.33333333...
8      EVAL_RESULTS_4  rs":"HIDDEN_LAYER_SIZE=10, 10;OUTPUT_LAYER_ACT...
9               ERROR                                  0.684842661926971
>>> mlpc.optim_param_.collect()
                 PARAM_NAME  INT_VALUE DOUBLE_VALUE STRING_VALUE
0         HIDDEN_LAYER_SIZE        NaN         None      5, 5, 5
1  OUTPUT_LAYER_ACTIVE_FUNC        4.0         None         None
2  HIDDEN_LAYER_ACTIVE_FUNC        3.0         None         None
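In a grid search, every combination of the supplied option lists is a candidate. The sketch below counts the candidates implied by the three option lists in the parameter-selection example above. It uses only the standard library and does not call hana_ml; the dictionary keys are illustrative labels, and treating the grid as a full Cartesian product is the usual meaning of grid search rather than a statement about PAL internals.

```python
from itertools import product

# Option lists copied from the parameter-selection example.
act_opts = ['tanh', 'linear', 'sigmoid_asymmetric']
out_act_opts = ['sigmoid_symmetric', 'gaussian_asymmetric', 'gaussian_symmetric']
layer_size_opts = [(10, 10), (5, 5, 5)]

# One candidate per combination of hidden activation, output activation
# and hidden-layer layout.
candidates = [
    {'activation': act, 'output_activation': out_act, 'hidden_layer_size': size}
    for act, out_act, size in product(act_opts, out_act_opts, layer_size_opts)
]

print(len(candidates))  # 3 * 3 * 2 combinations
print(candidates[0])
```

Because the candidate count grows multiplicatively with each option list, short lists or the 'random' search strategy keep the search affordable.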
- Attributes
- model_ : DataFrame
Model content.
- train_log_ : DataFrame
Provides mean squared error between predicted values and target values for each iteration.
- stats_ : DataFrame
Names and values of statistics.
- optim_param_ : DataFrame
Provides optimal parameters selected.
Available only when parameter selection is triggered.
Methods
create_model_state([model, function, ...])    Create PAL model state.
delete_model_state([state])    Delete PAL model state.
fit(data[, key, features, label, ...])    Fit the model when the training dataset is given.
predict(data[, key, features, thread_ratio])    Predict using the multi-layer perceptron model.
score(data[, key, features, label, thread_ratio])    Returns the accuracy on the given test data and labels.
set_model_state(state)    Set the model state by state information.
- fit(data, key=None, features=None, label=None, categorical_variable=None)
Fit the model when the training dataset is given.
- Parameters
- data : DataFrame
DataFrame containing the data.
- key : str, optional
Name of the ID column.
If key is not provided, then:
if data is indexed by a single column, then key defaults to that index column;
otherwise, it is assumed that data contains no ID column.
- features : list of str, optional
Names of the feature columns.
If features is not provided, it defaults to all the non-ID and non-label columns.
- label : str, optional
Name of the label column. If label is not provided, it defaults to the last column.
- categorical_variable : str or list of str, optional
Specifies the INTEGER column(s) that should be treated as categorical. Other INTEGER columns will be treated as continuous.
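The defaulting rules for label and features can be sketched in plain Python. This is an illustration of the rules as documented, not the library's code: the helper name resolve_fit_columns is hypothetical, and the sketch does not model the case where key is inferred from the DataFrame index.

```python
def resolve_fit_columns(columns, key=None, features=None, label=None):
    """Apply the documented defaults for label and features.

    columns : column names of the training data, in table order.
    Returns (key, features, label).
    """
    cols = list(columns)
    if label is None:
        # label defaults to the last column
        label = cols[-1]
    if features is None:
        # features default to all non-ID, non-label columns
        features = [c for c in cols if c != key and c != label]
    return key, features, label


# Column names from the training data shown in the Examples section.
print(resolve_fit_columns(['V000', 'V001', 'V002', 'V003', 'LABEL']))
```

Passing label and features explicitly avoids surprises when the label is not the last column of the table.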
- create_model_state(model=None, function=None, pal_funcname='PAL_MULTILAYER_PERCEPTRON', state_description=None, force=False)
Create PAL model state.
- Parameters
- model : DataFrame, optional
Specify the model for AFL state.
Defaults to self.model_.
- function : str, optional
Specify the function in the unified API.
A placeholder parameter, not effective for Multilayer Perceptron.
- pal_funcname : int or str, optional
PAL function name.
Defaults to 'PAL_MULTILAYER_PERCEPTRON'.
- state_description : str, optional
Description of the state as model container.
Defaults to None.
- force : bool, optional
If True it will delete the existing state.
Defaults to False.
- delete_model_state(state=None)
Delete PAL model state.
- Parameters
- state : DataFrame, optional
Specifies the state.
Defaults to self.state.
- property fit_hdbprocedure
Returns the generated hdbprocedure for fit.
- predict(data, key=None, features=None, thread_ratio=None)
Predict using the multi-layer perceptron model.
- Parameters
- data : DataFrame
DataFrame containing the data.
- key : str, optional
Name of the ID column.
Mandatory if data is not indexed, or the index of data contains multiple columns.
Defaults to the single index column of data if not provided.
- features : list of str, optional
Names of the feature columns.
If features is not provided, it defaults to all the non-ID columns.
- thread_ratio : float, optional
Controls the proportion of available threads to be used for prediction.
The value range is from 0 to 1, where 0 indicates a single thread, and 1 indicates up to all available threads.
Values between 0 and 1 will use that percentage of available threads.
Values outside this range tell PAL to heuristically determine the number of threads to use.
Defaults to 0.
- Returns
- DataFrame
Predicted classes, structured as follows:
ID column, with the same name and type as data's ID column.
TARGET, type NVARCHAR, predicted class name.
VALUE, type DOUBLE, softmax value for the predicted class.
Softmax values for all classes, structured as follows:
ID column, with the same name and type as data's ID column.
CLASS, type NVARCHAR, class name.
VALUE, type DOUBLE, softmax value for that class.
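In the prediction example in this document, the two returned tables are related: for each ID, the first table reports the class with the largest softmax value in the second. The following sketch reproduces that relationship in plain Python from the example rows. The helper name top_class_per_id is hypothetical and nothing here calls hana_ml.

```python
# (ID, CLASS, softmax value) rows copied from the stat.collect() example output.
stat_rows = [
    (1, 'AA', 0.371996), (1, 'AB', 0.155253), (1, 'C', 0.472751),
    (2, 'AA', 0.357822), (2, 'AB', 0.224496), (2, 'C', 0.417681),
    (3, 'AA', 0.349813), (3, 'AB', 0.106220), (3, 'C', 0.543967),
]


def top_class_per_id(rows):
    """Return {id: (class, value)} keeping the highest-valued class per ID."""
    best = {}
    for row_id, class_name, value in rows:
        if row_id not in best or value > best[row_id][1]:
            best[row_id] = (class_name, value)
    return best


for row_id, (class_name, value) in sorted(top_class_per_id(stat_rows).items()):
    print(row_id, class_name, value)
```

The per-class table is useful when the winning margin matters, for example to flag predictions whose top softmax value is only slightly above the runner-up.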
- property predict_hdbprocedure
Returns the generated hdbprocedure for predict.
- set_model_state(state)
Set the model state by state information.
- Parameters
- state : DataFrame or dict
If state is a DataFrame, it has the following structure:
NAME: VARCHAR(100), it must have STATE_ID, HINT, HOST and PORT.
VALUE: VARCHAR(1000), the values according to NAME.
If state is a dict, its keys must include STATE_ID, HINT, HOST and PORT.
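When the state is passed as a dict, a quick check for the four required keys can catch mistakes before the call. The sketch below is a plain-Python illustration only: the helper missing_state_keys is hypothetical, and the values in the example dict are placeholders rather than real state information.

```python
REQUIRED_STATE_KEYS = ('STATE_ID', 'HINT', 'HOST', 'PORT')


def missing_state_keys(state):
    """Return the required keys that are absent from a state dict."""
    return [name for name in REQUIRED_STATE_KEYS if name not in state]


# Placeholder values; a real state would carry the values produced when
# the model state was created.
state = {
    'STATE_ID': '<state id>',
    'HINT': '<hint>',
    'HOST': '<host>',
    'PORT': '<port>',
}

print(missing_state_keys(state))               # nothing missing
print(missing_state_keys({'STATE_ID': '<state id>'}))
```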
- score(data, key=None, features=None, label=None, thread_ratio=None)
Returns the accuracy on the given test data and labels.
- Parameters
- data : DataFrame
DataFrame containing the data.
- key : str, optional
Name of the ID column.
Mandatory if data is not indexed, or the index of data contains multiple columns.
Defaults to the single index column of data if not provided.
- features : list of str, optional
Names of the feature columns.
If features is not provided, it defaults to all the non-ID and non-label columns.
- label : str, optional
Name of the label column.
If label is not provided, it defaults to the last column.
- Returns
- float
Scalar value of accuracy after comparing the predicted result and original label.
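Accuracy here is the fraction of rows whose predicted class equals the original label. A minimal sketch of that computation in plain Python follows; the function name accuracy and the true labels in the example are hypothetical, and the code does not call hana_ml.

```python
def accuracy(predicted, actual):
    """Fraction of positions where the predicted class equals the label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one row is required")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Predicted classes as in the prediction example; the true labels are
# made up for illustration.
predicted = ['C', 'C', 'C']
actual = ['C', 'AA', 'C']
print(accuracy(predicted, actual))  # 2 of 3 rows match
```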