MLPRecommender
- class hana_ml.algorithms.pal.recommender.MLPRecommender(batch_size=None, num_epochs=None, num_heads=None, embedding_dim=None, use_feature_selection=None, learning_rate=None, mlp1_dropout_prob=None, mlp2_dropout_prob=None, mlp1_hidden_dim=None, mlp2_hidden_dim=None, fs_hidden_dim=None, random_state=None, task=None)
The Python interface for an MLP-based recommender system method in PAL. The current implementation supports only the binary-classification task.
- Parameters:
- batch_size : int, optional
Specifies the number of training samples in a batch.
Defaults to 16.
- num_epochs : int, optional
Specifies the number of training epochs.
Defaults to 1.
- num_heads : int, optional
Specifies the number of heads used in the bilinear interaction aggregation layer.
When the hidden dimension of the MLP(s) is large, multiple heads can be used to reduce the matrix computation time.
Defaults to 1 (i.e. single head).
- embedding_dim : int, optional
Specifies the embedding size of each feature vector.
Defaults to 10.
- use_feature_selection : int, optional
Specifies whether or not to use feature selection.
0: Do not use feature selection.
1: Use feature selection.
Defaults to 1.
- learning_rate : float, optional
Specifies the learning rate for batch training.
Defaults to 0.0005.
- mlp1_dropout_prob : float, optional
Specifies the dropout probability for MLP1.
Defaults to 0.0.
- mlp2_dropout_prob : float, optional
Specifies the dropout probability for MLP2.
Defaults to 0.0.
- mlp1_hidden_dim : list of int, optional
Specifies the sizes of hidden-layers for MLP1.
Defaults to [64, 32].
- mlp2_hidden_dim : list of int, optional
Specifies the sizes of hidden-layers for MLP2.
Defaults to [64, 32].
- fs_hidden_dim : list of int, optional
Specifies the sizes of the hidden-layers for the feature selection module.
Defaults to [64].
- random_state : int, optional
Specifies the seed for (pseudo-)random number generation.
0: Uses the current system time as the seed.
Other values: Uses the specified value as the seed.
Defaults to 0.
- task : {'classification', 'regression'}, optional
Specifies the task for the recommender system.
It can be either 'classification' or 'regression'.
Defaults to 'classification'.
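The num_heads description says that multiple heads reduce matrix computation time. The sketch below illustrates why under one common multi-head bilinear design: both MLP outputs are split into equal chunks, and each head applies its own small bilinear weight block. This is an assumption for illustration only. PAL's internal layer is not described in this document and may differ, for instance in how it handles hidden dimensions that are not divisible by the number of heads. The helper names `bilinear_weight_count` and `multi_head_bilinear` are hypothetical and are not part of hana_ml.

```python
def bilinear_weight_count(dim1, dim2, num_heads=1):
    """Count bilinear weights when both inputs are split into num_heads chunks.

    Assumption (not confirmed for PAL): each head owns one
    (dim1 / num_heads) x (dim2 / num_heads) weight block.
    """
    if dim1 % num_heads or dim2 % num_heads:
        raise ValueError("this sketch requires num_heads to divide both dimensions")
    return num_heads * (dim1 // num_heads) * (dim2 // num_heads)


def multi_head_bilinear(x, y, weights):
    """Sum the per-head bilinear forms x_h^T W_h y_h.

    x, y    : lists of floats (outputs of the two MLPs in this sketch)
    weights : one 2-D list (weight block) per head
    """
    num_heads = len(weights)
    cx, cy = len(x) // num_heads, len(y) // num_heads
    total = 0.0
    for h, w in enumerate(weights):
        xh = x[h * cx:(h + 1) * cx]  # chunk of x seen by head h
        yh = y[h * cy:(h + 1) * cy]  # chunk of y seen by head h
        total += sum(xh[i] * w[i][j] * yh[j]
                     for i in range(cx) for j in range(cy))
    return total


single_head = bilinear_weight_count(96, 96, num_heads=1)  # 96 * 96 = 9216
four_heads = bilinear_weight_count(96, 96, num_heads=4)   # 4 * 24 * 24 = 2304
```

Under this assumption, the number of weights (and multiply-adds) in the bilinear term shrinks by a factor of num_heads, which is the kind of saving the parameter description refers to.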
Examples
>>> data.head(5).collect()
   ID  user  item  daytime  weekday  isweekend  homework  cost  weather  country  city  label
0   0   451  4149     5041     5046       5053      5055  5058     5060     5069  5149      0
1   1    91  3503     5041     5047       5053      5056  5058     5065     5095  5149      0
2   2   168   983     5040     5050       5054      5055  5058     5060     5069  5207      1
3   3   620  1743     5045     5051       5054      5055  5058     5061     5073  5149      0
4   4    46  2692     5040     5049       5054      5055  5058     5060     5086  5211      0
Set up the basic structure of the MLPs and the interaction/merging layers, then train an MLP recommender model using the input data shown above:
>>> from hana_ml.algorithms.pal.recommender import MLPRecommender
>>> mr = MLPRecommender(batch_size=10,
...                     num_epochs=20,
...                     random_state=2023,
...                     num_heads=5,
...                     embedding_dim=10,
...                     use_feature_selection=True,
...                     learning_rate=0.01,
...                     mlp1_dropout_prob=0.1,
...                     mlp2_dropout_prob=0.1,
...                     mlp1_hidden_dim=[256, 128, 96],
...                     mlp2_hidden_dim=[256, 128, 96],
...                     fs_hidden_dim=[256, 128, 96])
>>> mr.fit(data, key='ID', label='label',
...        selected_feature_set1=['user', 'homework'],
...        selected_feature_set2=['item', 'daytime'])
>>> mr.train_log_.filter('BATCH=\'epoch average loss\'').head(5).collect()
   EPOCH               BATCH      LOSS
0      0  epoch average loss  0.882001
1      1  epoch average loss  0.676835
2      2  epoch average loss  0.827260
3      3  epoch average loss  0.634558
4      4  epoch average loss  0.228178
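Once the epoch-average rows of the training log have been collected, they can be inspected locally. The following is a minimal sketch in plain Python that uses the five (EPOCH, LOSS) pairs from the example output above. The helpers `best_epoch` and `loss_increases` are hypothetical names introduced for illustration and are not part of hana_ml.

```python
# (EPOCH, LOSS) pairs copied from the example training log above.
epoch_losses = [
    (0, 0.882001),
    (1, 0.676835),
    (2, 0.827260),
    (3, 0.634558),
    (4, 0.228178),
]


def best_epoch(rows):
    """Return the (epoch, loss) pair with the lowest average loss."""
    return min(rows, key=lambda row: row[1])


def loss_increases(rows):
    """Return the epochs whose average loss is higher than the previous epoch's."""
    return [cur[0] for prev, cur in zip(rows, rows[1:]) if cur[1] > prev[1]]


lowest = best_epoch(epoch_losses)       # (4, 0.228178)
bumps = loss_increases(epoch_losses)    # [2]
```

In these five rows the average loss is not monotone: it rises at epoch 2 before dropping again, so a single epoch-to-epoch increase is not necessarily a sign of a failed run.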
- Attributes:
- model_ : DataFrame
Model content.
- train_log_ : DataFrame
Logs of the training process.
Methods
fit(data[, key, label, ...])
Given the training dataset, train an MLP recommender model that consists mainly of two MLPs, namely MLP1 and MLP2, with bilinear fusion and (possibly) feature selection modules.
predict(data[, key])
Predict target values using a trained model.
- fit(data, key=None, label=None, selected_feature_set1=None, selected_feature_set2=None)
Given the training dataset, train an MLP recommender model that consists mainly of two MLPs, namely MLP1 and MLP2, with bilinear fusion and (possibly) feature selection modules.
- Parameters:
- data : DataFrame
Training data.
- key : str, optional
Name of the ID column.
If key is not provided, then: if data is indexed by a single column, key defaults to that index column; otherwise, it defaults to the first column of data.
- label : str, optional
Name of the dependent variable.
Defaults to the last column of data (excluding the ID column specified by key).
- selected_feature_set1 : list of str, optional
Selected features of data that are fed into MLP1.
Defaults to None (equivalent to []).
- selected_feature_set2 : list of str, optional
Selected features of data that are fed into MLP2.
Defaults to None (equivalent to []).
- Returns:
- A fitted object of class "MLPRecommender".
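The defaulting rules that fit documents for key and label can be sketched in plain Python as follows. This is an illustrative restatement of the rules in the parameter descriptions, not hana_ml's implementation: the helper `resolve_key_and_label` is hypothetical, and it works on a plain list of column names plus an optional index rather than on a HANA DataFrame.

```python
def resolve_key_and_label(columns, index=None, key=None, label=None):
    """Apply the documented defaults for `key` and `label`.

    columns : list of column names, in table order
    index   : None, a single column name, or a list of column names
              (stands in for the index of the DataFrame in this sketch)
    """
    if key is None:
        if isinstance(index, str):
            key = index                      # indexed by a single column
        elif isinstance(index, (list, tuple)) and len(index) == 1:
            key = index[0]                   # single-column index given as a list
        else:
            key = columns[0]                 # otherwise: first column of data
    if label is None:
        # Last column of data, excluding the ID column specified by key.
        remaining = [col for col in columns if col != key]
        label = remaining[-1]
    return key, label


cols = ['ID', 'user', 'item', 'label']
defaults = resolve_key_and_label(cols)                  # ('ID', 'label')
by_index = resolve_key_and_label(cols, index='user')    # ('user', 'label')
```

Passing key and label explicitly, as the example in this document does, avoids depending on column order.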
- predict(data, key=None)
Predict target values using a trained model.
- Parameters:
- data : DataFrame
Input data for prediction.
- key : str, optional
Specifies the name of the ID column in data.
Defaults to the index of data if data is indexed by a single column; otherwise, it must be provided.
- Returns:
- DataFrame
The prediction result, structured as follows:
1st column: IDs
2nd column: Predicted values
Inherited Methods from PALBase
Besides the methods mentioned above, the MLPRecommender class also inherits methods from the PALBase class. Please refer to PAL Base for more details.