hanaml.FFMClassifier is an R wrapper for the SAP HANA PAL field-aware factorization machine (FFM) for classification.
hanaml.FFMClassifier(
  data = NULL,
  key = NULL,
  features = NULL,
  label = NULL,
  categorical.variable = NULL,
  delimiter = NULL,
  normalize = NULL,
  include.constant = NULL,
  include.linear = NULL,
  early.stop = NULL,
  factor.num = NULL,
  train.ratio = NULL,
  learning.rate = NULL,
  random.state = NULL,
  max.iter = NULL,
  linear.lambda = NULL,
  poly2.lambda = NULL,
  sgd.tol = NULL,
  sgd.exit.interval = NULL,
  handle.missing = NULL
)
| Argument | Description |
|---|---|
| data | |
| key | |
| features | |
| label | |
| categorical.variable | Valid only for variables of "INTEGER" type; omitted otherwise. |
| delimiter | |
| normalize | |
| include.constant | |
| include.linear | |
| early.stop | |
| factor.num | |
| train.ratio | |
| learning.rate | |
| random.state | |
| max.iter | |
| linear.lambda | |
| poly2.lambda | |
| sgd.tol | |
| sgd.exit.interval | |
| handle.missing | |
A "FFMClassifier" object with the following attributes:
meta: DataFrame
Metadata of the trained model.
coef: DataFrame
Coefficients of the trained model.
stats: DataFrame
Statistical information about the trained model.
FFM has proven to be a powerful tool for click-through rate (CTR) and
conversion rate (CVR) prediction tasks. It builds on factorization machine
(FM) models, which use matrix factorization to reduce the weights of sparse
higher-order interactions to vectors. The field-aware factorization machine
introduces the concept of a field, which represents a group of similar
features; for example, the field of user properties includes gender, age,
occupation, etc.
Because factor vectors are tied not only to features but also to fields,
each feature has to learn a separate vector representation for each field.
This increases the complexity of the model to O(kn^2), where n is the number
of features and k is the factor number, i.e., the length of the factor
vectors.
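For reference, the model described above is usually written as follows. The notation is an assumption borrowed from the common FFM formulation, not taken from this page: $w_0$ is the constant term, $w_j$ are the linear weights, $f_j$ is the field of feature $j$, and $v_{j,f} \in \mathbb{R}^k$ is the factor vector that feature $j$ uses when it interacts with a feature of field $f$. Presumably include.constant and include.linear control whether the first two terms are present.

```latex
\hat{y}(x) = w_0 + \sum_{j=1}^{n} w_j x_j
  + \sum_{j_1=1}^{n} \sum_{j_2=j_1+1}^{n}
    \langle v_{j_1, f_{j_2}},\, v_{j_2, f_{j_1}} \rangle \, x_{j_1} x_{j_2}
```

The double sum over feature pairs, each costing an inner product of length $k$, is where the O(kn^2) cost comes from.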
In practice, features derived from the same categorical variable are
considered to belong to the same field. Note that FFM is best suited to
categorical features. A numeric feature is either regarded as a single
field or discretized into categories. If all features are numeric and each
is treated as its own field, so that every field consists of a single
feature, FFM degenerates to FM.
FFM can be applied to a variety of prediction tasks,
for example, binary classification, regression, and ranking.
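The field-aware pairwise interaction described above can be sketched in base R. This is only an illustration under stated assumptions: `ffm_score`, its arguments, and the toy weights are hypothetical and are not part of hana.ml.r or PAL, which fit and store the model inside SAP HANA.

```r
# Sketch of the FFM score for a single sample (base R only).
# All names here are illustrative assumptions, not the hana.ml.r API.
ffm_score <- function(x, field, w0, w, V) {
  # x     : numeric feature vector of length n (e.g. one-hot encoded)
  # field : integer vector of length n, the field index of each feature
  # w0    : constant term
  # w     : linear weights, length n
  # V     : array with dim c(n, n_fields, k); V[j, f, ] is the factor vector
  #         feature j uses when interacting with a feature of field f
  score <- w0 + sum(w * x)
  active <- which(x != 0)                  # zero features contribute nothing
  if (length(active) >= 2) {
    pairs <- utils::combn(active, 2)       # each column is one pair (j1, j2)
    for (p in seq_len(ncol(pairs))) {
      j1 <- pairs[1, p]
      j2 <- pairs[2, p]
      # Each feature picks the factor vector matching the OTHER feature's field.
      score <- score +
        sum(V[j1, field[j2], ] * V[j2, field[j1], ]) * x[j1] * x[j2]
    }
  }
  score
}

# Toy input: features 1-2 belong to field 1 (say USER), feature 3 to field 2
# (MOVIE), feature 4 to field 3 (TIMESTAMP); factor length k = 2.
x     <- c(1, 0, 1, 1)
field <- c(1L, 1L, 2L, 3L)
w     <- c(0.1, 0.2, 0.3, 0.4)
V     <- array(0.5, dim = c(4, 3, 2))      # made-up constant factors
toy_score <- ffm_score(x, field, w0 = 0.5, w = w, V = V)
print(toy_score)
```

Grouping one-hot columns by their source variable, as `field` does here, mirrors the convention that features from the same categorical variable share a field.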
> data$Head(5)
  USER  MOVIE TIMESTAMP       CTR
1    A Movie1         3     Click
2    A Movie2         3     Click
3    A Movie4         1 Not click
4    A Movie5         2     Click
5    A Movie6         3     Click
Call the function:
> FFMClsf <- hanaml.FFMClassifier(data = data,
                                  categorical.variable = "TIMESTAMP",
                                  delimiter = ",", factor.num = 4,
                                  early.stop = TRUE, learning.rate = 0.2,
                                  max.iter = 20, train.ratio = 0.8,
                                  linear.lambda = 1e-5,
                                  poly2.lambda = 1e-6, random.state = 1)
Output:
> FFMClsf$coefficient$Head(5)
  COEFF_INDEX FEATURE FIELD  K COEFFICIENT
1           0       c  <NA> NA -0.03166240
2           1  USER:A  <NA> NA -0.13690224
3           2  USER:B  <NA> NA -0.04620829
4           3  USER:C  <NA> NA  0.10801253
5           4  USER:D  <NA> NA -0.04806942