MLPRecommender

class hana_ml.algorithms.pal.recommender.MLPRecommender(batch_size=None, num_epochs=None, num_heads=None, embedding_dim=None, use_feature_selection=None, learning_rate=None, mlp1_dropout_prob=None, mlp2_dropout_prob=None, mlp1_hidden_dim=None, mlp2_hidden_dim=None, fs_hidden_dim=None, random_state=None, task=None)

The Python interface for an MLP-based recommender system method in PAL. The current implementation supports only binary-classification tasks.

Parameters:
batch_size : int, optional

Specifies the number of training samples in a batch.

Defaults to 16.

num_epochs : int, optional

Specifies the number of training epochs.

Defaults to 1.

num_heads : int, optional

Specifies the number of heads used in the bilinear interaction aggregation layer.

When the hidden dimension of the MLP(s) is large, multiple heads can be used to reduce the matrix computation time.

Defaults to 1 (i.e. a single head).

embedding_dim : int, optional

Specifies the embedding size of each feature vector.

Defaults to 10.

use_feature_selection : int, optional

Specifies whether or not to use feature selection.

  • 0: Do not use feature selection.

  • 1: Use feature selection.

Defaults to 1.

learning_rate : float, optional

Specifies the learning rate for batch training.

Defaults to 0.0005.

mlp1_dropout_prob : float, optional

Specifies the dropout probability for MLP1.

Defaults to 0.0.

mlp2_dropout_prob : float, optional

Specifies the dropout probability for MLP2.

Defaults to 0.0.

mlp1_hidden_dim : list of int, optional

Specifies the sizes of the hidden layers for MLP1.

Defaults to [64, 32].

mlp2_hidden_dim : list of int, optional

Specifies the sizes of the hidden layers for MLP2.

Defaults to [64, 32].

fs_hidden_dim : list of int, optional

Specifies the sizes of the hidden layers for the feature selection module.

Defaults to [64].

random_state : int, optional

Specifies the seed for (pseudo-)random number generation.

  • 0: Uses the current system time as the seed.

  • Other values: Uses the specified value as the seed.

Defaults to 0.

task : {'classification', 'regression'}, optional

Specifies the task for the recommender system.

It can be either 'classification' or 'regression'.

Defaults to 'classification'.
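
As a rough illustration of the num_heads note above, the sketch below shows how splitting a bilinear interaction into several heads shrinks the weight matrices involved. It assumes a chunked bilinear form in which each head owns a small square matrix over its slice of the two MLP outputs. That form is an assumption made for illustration and is not a description of PAL's internals; the helper names are hypothetical and not part of hana_ml.

```python
# Hypothetical helpers for illustration only; not part of hana_ml or PAL.

def bilinear_weight_count(hidden_dim, num_heads):
    """Count the weights of a bilinear map x^T W y split into num_heads chunks."""
    if hidden_dim % num_heads != 0:
        raise ValueError("this sketch requires hidden_dim divisible by num_heads")
    chunk = hidden_dim // num_heads
    # Each head uses a (chunk x chunk) matrix, so the total is hidden_dim**2 / num_heads.
    return num_heads * chunk * chunk


def multi_head_bilinear(x, y, weights):
    """Sum the per-head bilinear terms x_k^T W_k y_k over equal-sized chunks."""
    num_heads = len(weights)
    chunk = len(x) // num_heads
    total = 0.0
    for k, w in enumerate(weights):
        xs = x[k * chunk:(k + 1) * chunk]
        ys = y[k * chunk:(k + 1) * chunk]
        total += sum(xs[i] * w[i][j] * ys[j]
                     for i in range(chunk) for j in range(chunk))
    return total


single = bilinear_weight_count(96, 1)  # one 96 x 96 matrix
multi = bilinear_weight_count(96, 4)   # four 24 x 24 matrices
print(single, multi)  # 9216 2304
```

Under this assumption, the number of multiplications in the interaction layer falls in proportion to the number of heads, at the cost of dropping interactions between different chunks.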

Examples

>>> data.head(5).collect()
    ID   user   item  daytime  weekday  isweekend  homework  cost  weather  country  city  label
0    0    451   4149     5041     5046       5053      5055  5058     5060     5069  5149      0
1    1     91   3503     5041     5047       5053      5056  5058     5065     5095  5149      0
2    2    168    983     5040     5050       5054      5055  5058     5060     5069  5207      1
3    3    620   1743     5045     5051       5054      5055  5058     5061     5073  5149      0
4    4     46   2692     5040     5049       5054      5055  5058     5060     5086  5211      0

Set up the basic structure of the MLPs and interaction/merging layers, then train an MLP recommender model using the input data illustrated as above:

>>> from hana_ml.algorithms.pal.recommender import MLPRecommender
>>> mr = MLPRecommender(batch_size=10,
                        num_epochs=20,
                        random_state=2023,
                        num_heads=5,
                        embedding_dim=10,
                        use_feature_selection=True,
                        learning_rate=0.01,
                        mlp1_dropout_prob=0.1,
                        mlp2_dropout_prob=0.1,
                        mlp1_hidden_dim=[256, 128, 96],
                        mlp2_hidden_dim=[256, 128, 96],
                        fs_hidden_dim=[256, 128, 96])
>>> mr.fit(data, key='ID', label='label',
           selected_feature_set1=['user', 'homework'],
           selected_feature_set2=['item', 'daytime'])
>>> mr.train_log_.filter('BATCH=\'epoch average loss\'').head(5).collect()
   EPOCH               BATCH      LOSS
0      0  epoch average loss  0.882001
1      1  epoch average loss  0.676835
2      2  epoch average loss  0.827260
3      3  epoch average loss  0.634558
4      4  epoch average loss  0.228178

Attributes:
model_ : DataFrame

Model content.

train_log_ : DataFrame

Logs of the training process.
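
One way to use the training log is to look for the epoch with the lowest average loss. The following is a minimal sketch that works on plain tuples copied from the example output above; it assumes the (EPOCH, BATCH, LOSS) layout shown there, and the helper name is hypothetical. In practice the rows would come from collecting train_log_ on the client side.

```python
# Rows copied from the first five lines of the example output: (EPOCH, BATCH, LOSS).
train_log_rows = [
    (0, "epoch average loss", 0.882001),
    (1, "epoch average loss", 0.676835),
    (2, "epoch average loss", 0.827260),
    (3, "epoch average loss", 0.634558),
    (4, "epoch average loss", 0.228178),
]


def best_epoch(rows, tag="epoch average loss"):
    """Return (epoch, loss) for the epoch-level row with the lowest loss."""
    epoch_rows = [(epoch, loss) for epoch, batch, loss in rows if batch == tag]
    if not epoch_rows:
        raise ValueError("no epoch-level rows found in the log")
    return min(epoch_rows, key=lambda pair: pair[1])


print(best_epoch(train_log_rows))  # (4, 0.228178)
```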

Methods

fit(data[, key, label, ...])

Given the training dataset, train an MLP recommender model that consists mainly of two MLPs, namely MLP1 and MLP2, with bilinear fusion and (optionally) feature selection modules.

predict(data[, key])

Predict target values using a trained model.

fit(data, key=None, label=None, selected_feature_set1=None, selected_feature_set2=None)

Given the training dataset, train an MLP recommender model that consists mainly of two MLPs, namely MLP1 and MLP2, with bilinear fusion and (optionally) feature selection modules.

Parameters:
data : DataFrame

Training data.

key : str, optional

Name of the ID column.

If key is not provided, then:

  • if data is indexed by a single column, then key defaults to that index column;

  • otherwise, it defaults to the 1st column of data.

label : str, optional

Name of the dependent variable.

Defaults to the last column of data (excluding the ID column specified by key).

selected_feature_set1 : list of str, optional

Selected features of data that are fed into MLP1.

Defaults to None (equivalent to []).

selected_feature_set2 : list of str, optional

Selected features of data that are fed into MLP2.

Defaults to None (equivalent to []).
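
The default rules for key and label described above can be sketched in plain Python. This is only an illustration of the documented behavior, not the library's code; the function name and its index argument are hypothetical stand-ins for the information a hana_ml DataFrame carries.

```python
# Hypothetical helper mirroring the documented defaults; not part of hana_ml.

def resolve_key_and_label(columns, index=None, key=None, label=None):
    """Resolve `key` and `label` from ordered column names.

    columns: ordered column names of the training data.
    index: index column name (str), a list of index column names, or None.
    """
    if key is None:
        if isinstance(index, str):
            key = index                # indexed by a single column
        elif index is not None and len(index) == 1:
            key = index[0]             # single-column index given as a list
        else:
            key = columns[0]           # otherwise, the 1st column
    if label is None:
        # Last column once the ID column has been excluded.
        remaining = [c for c in columns if c != key]
        label = remaining[-1]
    return key, label


cols = ["ID", "user", "item", "daytime", "weekday", "isweekend",
        "homework", "cost", "weather", "country", "city", "label"]
print(resolve_key_and_label(cols))  # ('ID', 'label')
```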

Returns:
A fitted object of class "MLPRecommender".

predict(data, key=None)

Predict target values using a trained model.

Parameters:
data : DataFrame

Input data for prediction.

key : str, optional

Specifies the name of the ID column in data.

Defaults to the index of data if data is indexed by a single column; otherwise, it must be provided.

Returns:
DataFrame

The prediction result, structured as follows:

  • 1st column: IDs

  • 2nd column: Predicted values
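
Because the result only carries IDs and predicted values, turning it into recommendations usually means joining it back to the (user, item) pairs. The sketch below shows one way to do this on plain Python rows. It assumes the predicted values are numeric scores where higher means more relevant, which the text above does not state; the helper name and all sample values are made up for illustration.

```python
from collections import defaultdict

# Hypothetical helper; not part of hana_ml.

def top_n_items(predictions, interactions, n=2):
    """Rank items per user by predicted value, highest first.

    predictions: iterable of (id, predicted_value) rows.
    interactions: dict mapping id -> (user, item).
    """
    per_user = defaultdict(list)
    for row_id, score in predictions:
        user, item = interactions[row_id]
        per_user[user].append((score, item))
    return {
        user: [item for _, item in
               sorted(pairs, key=lambda p: p[0], reverse=True)[:n]]
        for user, pairs in per_user.items()
    }


# Made-up scores and pairs, used only to exercise the helper.
predictions = [(0, 0.12), (1, 0.80), (2, 0.55), (3, 0.91)]
interactions = {0: (451, 4149), 1: (451, 3503), 2: (451, 983), 3: (91, 1743)}
print(top_n_items(predictions, interactions))
```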

Inherited Methods from PALBase

Besides the methods mentioned above, the MLPRecommender class also inherits methods from the PALBase class. Please refer to PAL Base for more details.