This algorithm is the online version of multi-class logistic regression; the standard multi-class logistic regression is the offline/batch version. During training, the offline/batch algorithm requires all training data to be fed in as one batch, and it then outputs a single model that best fits that data. This implies that the machine must have enough memory to hold all of the data, and that all of the data is available at once. The online version applies when either or both of these assumptions do not hold: it updates the model incrementally as new batches of data arrive.

hanaml.OnlineMultiLogisticRegression(
  class.label = NULL,
  init.learning.rate = NULL,
  decay = NULL,
  drop.rate = NULL,
  step.boundaries = NULL,
  constant.values = NULL,
  enet.alpha = NULL,
  enet.lambda = NULL,
  shuffle = NULL,
  shuffle.seed = NULL,
  weight.avg = NULL,
  weight.avg.begin = NULL,
  learning.rate.type = NULL,
  general.learning.rate = NULL,
  stair.case = NULL,
  cycle = NULL,
  epsilon = NULL,
  window.size = NULL
)

Arguments

class.label

list of characters
Specifies the class labels. At least two class labels must be provided.

init.learning.rate

double, optional
The initial learning rate for the learning rate schedule.
The value should be larger than 0. Only valid when learning.rate.type is "Inverse.time.decay", "Exponential.decay", or "Polynomial.decay".

decay

double, optional
Specifies the decay speed of the learning rate schedule. A larger value indicates faster decay.
The value should be larger than 0; when learning.rate.type is "Exponential.decay", the value should be larger than 1. Only valid when learning.rate.type is "Inverse.time.decay", "Exponential.decay", or "Polynomial.decay".

drop.rate

integer, optional
Specifies the decay frequency. Its effect is most apparent when stair.case is TRUE.
The value should be larger than 0. Only valid when learning.rate.type is "Inverse.time.decay", "Exponential.decay", or "Polynomial.decay".
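This document does not spell out the exact decay formulas PAL applies; the sketch below shows the textbook schedules that the parameter names (init.learning.rate, decay, drop.rate, stair.case) suggest, and is illustrative only.

```r
# Illustrative sketch only: assumed textbook decay schedules, not PAL's
# verified internals. drop.rate sets how many steps make up one decay unit.
inverse.time.decay <- function(step, init.lr, decay, drop.rate,
                               staircase = FALSE) {
  t <- step / drop.rate
  if (staircase) t <- floor(t)    # step size drops only every drop.rate steps
  init.lr / (1 + decay * t)
}

exponential.decay <- function(step, init.lr, decay, drop.rate,
                              staircase = FALSE) {
  t <- step / drop.rate
  if (staircase) t <- floor(t)
  init.lr * decay^(-t)            # decay > 1, so the rate shrinks as t grows
}
```

For example, with init.learning.rate = 0.1, decay = 0.5, and drop.rate = 5, inverse-time decay at step 10 gives 0.1 / (1 + 0.5 * 2) = 0.05.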

step.boundaries

character, optional
Specifies the step boundaries of the regions in which the step size remains constant. The value is a comma-separated list of unsigned integers. Step counting starts from 0, and the values must be in increasing order. An empty value is allowed.
Only valid when learning.rate.type is "Piecewise.constant.decay".

constant.values

character, optional
Specifies the constant step size for each region defined by step.boundaries. The value is a comma-separated list of doubles, and there should always be one more value than in step.boundaries. Only valid when learning.rate.type is "Piecewise.constant.decay".
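As a sketch of how the two comma-separated strings combine, the helper below (illustrative only, not part of the package) maps a step counter to its constant learning rate, assuming step.boundaries = "5,10" and constant.values = "0.1,0.05,0.02" mean: rate 0.1 for steps 0-4, 0.05 for steps 5-9, and 0.02 from step 10 on.

```r
# Illustrative sketch only: interprets the documented comma-separated
# format for step.boundaries and constant.values.
piecewise.constant <- function(step, boundaries, values) {
  b <- as.integer(strsplit(boundaries, ",")[[1]])
  v <- as.numeric(strsplit(values, ",")[[1]])
  stopifnot(length(v) == length(b) + 1)  # one more value than boundaries
  v[findInterval(step, b) + 1]           # pick the region for this step
}
```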

enet.alpha

double, optional
Elastic-net mixing parameter. The valid range is [0, 1]: 0 corresponds to the Ridge penalty, 1 to the Lasso penalty. Only valid when enet.lambda is not 0.0.
Defaults to 1.0.

enet.lambda

double, optional
Penalty constant. The value should be larger than or equal to 0.0; the higher the value, the stronger the regularization. When it equals 0.0, there is no regularization.
Defaults to 0.0.
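The document does not print the penalty formula; the sketch below shows the standard elastic-net penalty that parameters named like enet.alpha (mixing) and enet.lambda (strength) conventionally parameterize, assuming the usual glmnet-style convention.

```r
# Illustrative sketch only: assumed standard elastic-net penalty,
# not PAL's verified internal objective.
enet.penalty <- function(w, alpha = 1.0, lambda = 0.0) {
  lambda * (alpha * sum(abs(w)) +          # Lasso (L1) part
            (1 - alpha) / 2 * sum(w^2))    # Ridge (L2) part
}
```

With alpha = 1 the penalty is pure L1; with alpha = 0 it is pure L2; with lambda = 0 it vanishes, matching "no regularization" above.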

shuffle

logical, optional
Indicates whether to shuffle the row order of the observation data. FALSE keeps the original order; TRUE performs the shuffle.
Defaults to FALSE.

shuffle.seed

integer, optional
Seed used to initialize the random generator for the shuffle operation. The value should be larger than or equal to 0. Set it to a non-zero value to make the shuffle reproducible.
Only valid when shuffle is TRUE. Defaults to 0.

weight.avg

logical, optional
Indicates whether to apply an averaging operator to the output model. FALSE outputs the model directly; TRUE averages the output model. Currently only Polyak-Ruppert averaging is supported.
Defaults to FALSE.

weight.avg.begin

integer, optional
Specifies the step counter at which averaging of the model begins. The value should be larger than or equal to 0. While the current step counter is less than this value, the model is output directly. Only valid when weight.avg is TRUE.
Defaults to 0.
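The two parameters above can be pictured with a small sketch (illustrative only; it assumes Polyak-Ruppert averaging is a running mean of the iterates starting at step weight.avg.begin, and uses a 1-D weight trace for simplicity):

```r
# Illustrative sketch only: assumed Polyak-Ruppert semantics for
# weight.avg / weight.avg.begin, on a scalar weight per step.
polyak.average <- function(weights, begin = 0) {
  steps <- seq_along(weights) - 1     # step counters 0, 1, 2, ...
  kept  <- weights[steps >= begin]    # iterates from weight.avg.begin on
  cumsum(kept) / seq_along(kept)      # running mean of the kept iterates
}
```

For a trace c(1, 2, 3, 4) with begin = 2, only the iterates at steps 2 and 3 are averaged, yielding 3 and then 3.5.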

learning.rate.type

character, optional
Specifies the learning rate type for the SGD algorithm:
- "Inverse.time.decay"
- "Exponential.decay"
- "Polynomial.decay"
- "Piecewise.constant.decay"
- "AdaGrad"
- "AdaDelta"
- "RMSProp"
Defaults to "RMSProp".

general.learning.rate

double, optional
Specifies the general learning rate used in AdaGrad and RMSProp. The value should be larger than 0.
Only valid when learning.rate.type is "AdaGrad" or "RMSProp". Defaults to 0.001.

stair.case

logical, optional
Indicates how the step size drops. FALSE means the step size drops smoothly; TRUE means it drops in a staircase fashion. Only valid when learning.rate.type is "Inverse.time.decay" or "Exponential.decay". Defaults to FALSE.

cycle

logical, optional
Indicates whether to cycle back to the start when the specified end learning rate is reached. FALSE means do not cycle; TRUE means cycle. Only valid when learning.rate.type is "Polynomial.decay".
Defaults to FALSE.

epsilon

double, optional
This parameter serves multiple purposes depending on the learning rate type. The value should be within (0, 1). For "Inverse.time.decay" and "Exponential.decay", it is the smallest allowable step size; once the step size reaches this value, it no longer changes. For "Polynomial.decay", it is the end learning rate. For "AdaGrad", "AdaDelta", and "RMSProp", it is used to avoid division by 0. Only valid when learning.rate.type is not "Piecewise.constant.decay".
Defaults to 1E-8.

window.size

double, optional
Controls the moving window size over recent steps. The value should be in the range (0, 1); a larger value means more steps are kept in track. Only valid when learning.rate.type is "AdaDelta" or "RMSProp".
Defaults to 0.9.
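To make window.size and epsilon concrete, the sketch below shows a textbook RMSProp update in which window.size plays the role of the moving-average decay rho and epsilon guards the division. This is illustrative only; it is not PAL's verified implementation.

```r
# Illustrative sketch only: assumed textbook RMSProp update.
# rho ~ window.size (how much history the squared-gradient average keeps),
# eps ~ epsilon (avoids dividing by 0), lr ~ general.learning.rate.
rmsprop.step <- function(w, grad, acc,
                         lr = 0.001, rho = 0.9, eps = 1e-8) {
  acc <- rho * acc + (1 - rho) * grad^2          # moving avg of squared grads
  list(w   = w - lr * grad / sqrt(acc + eps),    # scaled gradient step
       acc = acc)                                # carry state to next step
}
```

A larger rho (window.size) makes the accumulator forget old gradients more slowly, which is what "more steps are kept in track" describes.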

Value

A "OnlineMultiLogisticRegression" object with the following attributes:

  • coef: DataFrame
Coefficient values of the multi-class logistic regression model.

  • online.result: DataFrame
    Updated online training result.

Methods

fit(data = NULL, key = NULL, features = NULL, label = NULL, formula = NULL, thread.ratio = NULL, progress.indicator.id = NULL)

The fit function of an "OnlineMultiLogisticRegression" object.

Usage:
OMLR <- hanaml.OnlineMultiLogisticRegression()
OMLR$fit(data, key = "ID", features = list("X1", "X2"))

Arguments:

data, DataFrame
Input data.

key, character, optional
Name of the ID column.
If not provided, the data is assumed to have no ID column.
No default value.

features, character or list of characters, optional
Names of the feature columns.
If not provided, defaults to all non-key, non-label columns of data.

label, character, optional
Name of the column that specifies the dependent variable.
Defaults to the last column of data if not provided.

formula, formula type, optional
Formula to be used for model generation, in the format label ~ <feature_list>, e.g. formula = CATEGORY~V1+V2+V3.
Provide either the formula or a features/label combination, but not both.
Defaults to NULL.

thread.ratio, double, optional
Controls the proportion of available threads that this function can use.
The value range is from 0 to 1, where 0 indicates a single thread and 1 indicates all available threads. Values between 0 and 1 use up to that percentage of the available threads. Values outside this range are ignored.
Defaults to -1.

progress.indicator.id, character, optional
The ID of the progress indicator for model evaluation/parameter selection.
The progress indicator is deactivated if no value is provided.

Examples

First, initialize an online multi-class logistic regression instance:


> omlr <- hanaml.OnlineMultiLogisticRegression(class.label=list("0","1","2"),
                                               enet.lambda=0.01,
                                               enet.alpha=0.2, weight.avg=TRUE,
                                               weight.avg.begin=8,
                                               learning.rate.type="RMSProp",
                                               general.learning.rate=0.1,
                                               window.size=0.9, epsilon=1e-6)

Four rounds of data:


> df.1$Collect()
         X1        X2    Y
0  1.160456 -0.079584  0.0
1  1.216722 -1.315348  2.0
2  1.018474 -0.600647  1.0
3  0.884580  1.546115  1.0
4  2.432160  0.425895  1.0
5  1.573506 -0.019852  0.0
6  1.285611 -2.004879  1.0
7  0.478364 -1.791279  2.0

> df.2$Collect()
         X1        X2    Y
0 -1.799803  1.225313  1.0
1  0.552956 -2.134007  2.0
2  0.750153 -1.332960  2.0
3  2.024223 -1.406925  2.0
4  1.204173 -1.395284  1.0
5  1.745183  0.647891  0.0
6  1.406053  0.180530  0.0
7  1.880983 -1.627834  2.0

> df.3$Collect()
         X1        X2    Y
0  1.860634 -2.474313  2.0
1  0.710662 -3.317885  2.0
2  1.153588  0.539949  0.0
3  1.297490 -1.811933  2.0
4  2.071784  0.351789  0.0
5  1.552456  0.550787  0.0
6  1.202615 -1.256570  2.0
7 -2.348316  1.384935  1.0

> df.4$Collect()
         X1        X2    Y
0 -2.132380  1.457749  1.0
1  0.549665  0.174078  1.0
2  1.422629  0.815358  0.0
3  1.318544  0.062472  0.0
4  0.501686 -1.286537  1.0
5  1.541711  0.737517  1.0
6  1.709486 -0.036971  0.0
7  1.708367  0.761572  0.0

Round 1, invoke fit() for training the model with df.1:


> omlr$fit(df.1, label='Y', features=list('X1', 'X2'))

Output:


> omlr$coef$Collect()
       VARIABLE_NAME CLASSLABEL  COEFFICIENT
0  __PAL_INTERCEPT__          0    -0.245137
1  __PAL_INTERCEPT__          1     0.112396
2  __PAL_INTERCEPT__          2    -0.236284
3                 X1          0    -0.189930
4                 X1          1     0.218920
5                 X1          2    -0.372500
6                 X2          0     0.279547
7                 X2          1     0.458214
8                 X2          2    -0.185378

Round 2, invoke fit() for training the model with df.2:


> omlr$fit(df.2, label='Y', features=list('X1', 'X2'))

Output:


> omlr$coef$Collect()
       VARIABLE_NAME CLASSLABEL  COEFFICIENT
0  __PAL_INTERCEPT__          0    -0.359296
1  __PAL_INTERCEPT__          1     0.163218
2  __PAL_INTERCEPT__          2    -0.182423
3                 X1          0    -0.045149
4                 X1          1    -0.046508
5                 X1          2    -0.122690
6                 X2          0     0.420425
7                 X2          1     0.594954
8                 X2          2    -0.451050

Round 3, invoke fit() for training the model with df.3:


> omlr$fit(df.3, label='Y', features=list('X1', 'X2'))

Output:


> omlr$coef$Collect()
       VARIABLE_NAME CLASSLABEL  COEFFICIENT
0  __PAL_INTERCEPT__          0    -0.225687
1  __PAL_INTERCEPT__          1     0.031453
2  __PAL_INTERCEPT__          2    -0.173944
3                 X1          0     0.100580
4                 X1          1    -0.208257
5                 X1          2    -0.097395
6                 X2          0     0.628975
7                 X2          1     0.576544
8                 X2          2    -0.582955
 

Round 4, invoke fit() for training the model with df.4:


> omlr$fit(df.4, label='Y', features=list('X1', 'X2'))

Output:


> omlr$coef$Collect()
       VARIABLE_NAME CLASSLABEL  COEFFICIENT
0  __PAL_INTERCEPT__          0    -0.204118
1  __PAL_INTERCEPT__          1     0.071965
2  __PAL_INTERCEPT__          2    -0.263698
3                 X1          0     0.239740
4                 X1          1    -0.326290
5                 X1          2    -0.139859
6                 X2          0     0.696389
7                 X2          1     0.590014
8                 X2          2    -0.643752