hanaml.OnlineMultiLogisticRegression.Rd

This algorithm is the online version of Multi-Class Logistic Regression; the standard Multi-Class Logistic Regression is the offline/batch version. The difference lies in the training phase: the offline/batch algorithm requires all training data to be fed in as one batch, and then outputs the single model that best fits that data. This implies that the machine must have enough memory to hold the entire data set, and that all of the data can be obtained at once. The online algorithm applies in scenarios where either or both of these assumptions do not hold.
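The online pattern therefore amounts to repeated fit() calls over successive chunks of data, each call updating the same model state. A minimal sketch (the chunk names `df.1` through `df.4` refer to the DataFrames used in the Examples section below):

```r
# Sketch: online training consumes data chunk by chunk, updating the same
# model state each round, so no single batch needs to fit in memory.
omlr <- hanaml.OnlineMultiLogisticRegression(class.label = list("0", "1", "2"))
for (chunk in list(df.1, df.2, df.3, df.4)) {
  omlr$fit(chunk, label = "Y", features = list("X1", "X2"))
}
```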
hanaml.OnlineMultiLogisticRegression(
  class.label = NULL,
  init.learning.rate = NULL,
  decay = NULL,
  drop.rate = NULL,
  step.boundaries = NULL,
  constant.values = NULL,
  enet.alpha = NULL,
  enet.lambda = NULL,
  shuffle = NULL,
  shuffle.seed = NULL,
  weight.avg = NULL,
  weight.avg.begin = NULL,
  learning.rate.type = NULL,
  general.learning.rate = NULL,
  stair.case = NULL,
  cycle = NULL,
  epsilon = NULL,
  window.size = NULL
)
| Argument | Description |
|---|---|
| class.label | |
| init.learning.rate | |
| decay | |
| drop.rate | |
| step.boundaries | |
| constant.values | |
| enet.alpha | |
| enet.lambda | |
| shuffle | |
| shuffle.seed | |
| weight.avg | |
| weight.avg.begin | |
| learning.rate.type | |
| general.learning.rate | |
| stair.case | |
| cycle | |
| epsilon | |
| window.size | |
An "OnlineMultiLogisticRegression" object with the following attributes:
coef: DataFrame
Coefficient values of the multi-class logistic regression model.
online.result: DataFrame
Updated online training result.
Arguments of the fit() method:

| Argument | Type |
|---|---|
| data | DataFrame |
| key | character, optional |
| features | character or list of characters, optional |
| label | character, optional |
| formula | formula type, optional |
| thread.ratio | double, optional |
| progress.indicator.id | character, optional |

First, initialize an online multi-class logistic regression instance:
> omlr <- OnlineMultiLogisticRegression(class.label=list("0","1","2"),
enet.lambda=0.01,
enet.alpha=0.2, weight.avg=TRUE,
weight.avg.begin=8, learning.rate.type = "rmsprop",
general.learning.rate=0.1,
window.size=0.9, epsilon = 1e-6)
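The "rmsprop" setting above scales each gradient step by an exponentially weighted running average of squared gradients, controlled by window.size and epsilon. A sketch of one such update step in plain R (the function name and state handling here are illustrative, not part of the hanaml API):

```r
# Sketch of an RMSProp-style step: `state` accumulates a running average of
# squared gradients, which then normalizes the step size per coefficient.
rmsprop_step <- function(w, grad, state,
                         rate = 0.1,     # cf. general.learning.rate
                         window = 0.9,   # cf. window.size
                         eps = 1e-6) {   # cf. epsilon
  state <- window * state + (1 - window) * grad^2  # running mean of grad^2
  w <- w - rate * grad / (sqrt(state) + eps)       # scaled gradient step
  list(w = w, state = state)
}
```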
Four rounds of data:
> df.1$Collect()
X1 X2 Y
0 1.160456 -0.079584 0.0
1 1.216722 -1.315348 2.0
2 1.018474 -0.600647 1.0
3 0.884580 1.546115 1.0
4 2.432160 0.425895 1.0
5 1.573506 -0.019852 0.0
6 1.285611 -2.004879 1.0
7 0.478364 -1.791279 2.0
> df.2$Collect()
X1 X2 Y
0 -1.799803 1.225313 1.0
1 0.552956 -2.134007 2.0
2 0.750153 -1.332960 2.0
3 2.024223 -1.406925 2.0
4 1.204173 -1.395284 1.0
5 1.745183 0.647891 0.0
6 1.406053 0.180530 0.0
7 1.880983 -1.627834 2.0
> df.3$Collect()
X1 X2 Y
0 1.860634 -2.474313 2.0
1 0.710662 -3.317885 2.0
2 1.153588 0.539949 0.0
3 1.297490 -1.811933 2.0
4 2.071784 0.351789 0.0
5 1.552456 0.550787 0.0
6 1.202615 -1.256570 2.0
7 -2.348316 1.384935 1.0
> df.4$Collect()
X1 X2 Y
0 -2.132380 1.457749 1.0
1 0.549665 0.174078 1.0
2 1.422629 0.815358 0.0
3 1.318544 0.062472 0.0
4 0.501686 -1.286537 1.0
5 1.541711 0.737517 1.0
6 1.709486 -0.036971 0.0
7 1.708367 0.761572 0.0
Round 1, invoke fit() for training the model with df.1:
> omlr$fit(df.1, label='Y', features=list('X1', 'X2'))
Output:
> omlr$coef$Collect()
VARIABLE_NAME CLASSLABEL COEFFICIENT
0 __PAL_INTERCEPT__ 0 -0.245137
1 __PAL_INTERCEPT__ 1 0.112396
2 __PAL_INTERCEPT__ 2 -0.236284
3 X1 0 -0.189930
4 X1 1 0.218920
5 X1 2 -0.372500
6 X2 0 0.279547
7 X2 1 0.458214
8 X2 2 -0.185378
Round 2, invoke fit() for training the model with df.2:
> omlr$fit(df.2, label='Y', features=list('X1', 'X2'))
Output:
> omlr$coef$Collect()
VARIABLE_NAME CLASSLABEL COEFFICIENT
0 __PAL_INTERCEPT__ 0 -0.359296
1 __PAL_INTERCEPT__ 1 0.163218
2 __PAL_INTERCEPT__ 2 -0.182423
3 X1 0 -0.045149
4 X1 1 -0.046508
5 X1 2 -0.122690
6 X2 0 0.420425
7 X2 1 0.594954
8 X2 2 -0.451050
Round 3, invoke fit() for training the model with df.3:
> omlr$fit(df.3, label='Y', features=list('X1', 'X2'))
Output:
> omlr$coef$Collect()
VARIABLE_NAME CLASSLABEL COEFFICIENT
0 __PAL_INTERCEPT__ 0 -0.225687
1 __PAL_INTERCEPT__ 1 0.031453
2 __PAL_INTERCEPT__ 2 -0.173944
3 X1 0 0.100580
4 X1 1 -0.208257
5 X1 2 -0.097395
6 X2 0 0.628975
7 X2 1 0.576544
8 X2 2 -0.582955
Round 4, invoke fit() for training the model with df.4:
> omlr$fit(df.4, label='Y', features=list('X1', 'X2'))
Output:
> omlr$coef$Collect()
VARIABLE_NAME CLASSLABEL COEFFICIENT
0 __PAL_INTERCEPT__ 0 -0.204118
1 __PAL_INTERCEPT__ 1 0.071965
2 __PAL_INTERCEPT__ 2 -0.263698
3 X1 0 0.239740
4 X1 1 -0.326290
5 X1 2 -0.139859
6 X2 0 0.696389
7 X2 1 0.590014
8 X2 2 -0.643752
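The fitted coefficients can be read as one linear score per class. Assuming the standard multinomial-logistic (softmax) link, class probabilities for a new point can be computed from the round-4 coefficients in plain R (the input point here is made up for illustration):

```r
# Round-4 coefficients: one row per class, columns = intercept, X1, X2.
coef <- matrix(c(-0.204118,  0.239740,  0.696389,   # class 0
                  0.071965, -0.326290,  0.590014,   # class 1
                 -0.263698, -0.139859, -0.643752),  # class 2
               nrow = 3, byrow = TRUE)

x <- c(1, 1.5, 0.5)                      # 1 for the intercept, then X1, X2
scores <- coef %*% x                     # one linear score per class
probs  <- exp(scores) / sum(exp(scores)) # softmax: probabilities sum to 1
```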