hanaml.BSTS.Rd
Class for Bayesian structural time-series (BSTS) models.
hanaml.BSTS(
data = NULL,
key = NULL,
endog = NULL,
exog = NULL,
burn = NULL,
niter = NULL,
seasonal.period = NULL,
expected.model.size = NULL,
seed = NULL
)
data: DataFrame
Input DataFrame containing the time-series data for BSTS training.
key: character
The ID column that represents the order of the time-series values in data.
endog: character, optional
The column in data that holds the endogenous variable (i.e. the time-series values) for BSTS modeling.
Defaults to the first non-key column in data.
exog: character or list of characters, optional
An optional array of exogenous variables.
burn: double, optional
Specifies the fraction of the total MCMC draws that is discarded from the beginning, ranging from 0 to 1.
In other words, only the last 1 - burn portion of the total MCMC draws is kept (in the model) for prediction.
Defaults to 0.5.
niter: integer, optional
Specifies the total number of MCMC draws.
Defaults to 1000.
seasonal.period: integer, optional
Specifies the seasonal period.
Defaults to -1.
expected.model.size: integer, optional
Specifies the number of contemporaneous data columns that are expected to be included in the model.
Defaults to half of the number of contemporaneous data columns.
A **hanaml.BSTS** object with the following attributes:
stats: DataFrame
Statistics on the inclusion of the contemporaneous data in the model, structured as follows:
1st column : DATA_NAME, type VARCHAR or NVARCHAR,
2nd column : INCLUSION_PROB, type DOUBLE,
3rd column : AVG_COEFF, type DOUBLE
decompose: DataFrame
Decomposed components of the target time-series, structured as follows:
1st column : TIME_STAMP, type INTEGER,
2nd column : TREND, type DOUBLE,
3rd column : SEASONAL, type DOUBLE,
4th column : REGRESSION, type DOUBLE,
5th column : RANDOM, type DOUBLE
model: DataFrame
DataFrame containing the MCMC samples retained after burn-in, stored as a JSON string and structured as follows:
1st column : ROW_INDEX, type INTEGER,
2nd column : MODEL_CONTENT, type NVARCHAR
The BSTS model can be seen as a combination of three Bayesian methods: Kalman filter, spike-and-slab regression and Bayesian model averaging. In particular, samples of the model parameters are drawn from their posterior distributions using MCMC.
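To make the interplay of niter and burn concrete, the following base-R sketch lists which MCMC draws would survive burn-in. The helper retained.draws is hypothetical and not part of hanaml, and rounding the discarded count to the nearest integer is an assumption; the underlying procedure may treat fractional counts differently.

```r
# Hypothetical helper (not part of hanaml): indices of the MCMC draws
# that survive burn-in for a given niter and burn.
retained.draws <- function(niter = 1000, burn = 0.5) {
  stopifnot(niter >= 1, burn >= 0, burn <= 1)
  # Assumption: the discarded count is rounded to the nearest integer.
  n.discard <- round(niter * burn)
  # Keep only the tail (1 - burn) portion of the draws.
  utils::tail(seq_len(niter), niter - n.discard)
}

kept <- retained.draws(niter = 2000, burn = 0.6)
length(kept)  # 800 draws retained
range(kept)   # draws 1201 through 2000
```

With the defaults (niter = 1000, burn = 0.5), the same helper keeps the last 500 draws.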
Input data:
> data
TIME_STAMP TARGET_SERIES FEATURE_01 FEATURE_02 ... FEATURE_07 FEATURE_08 FEATURE_09 FEATURE_10
0 0 2.536 1.488 -0.561 ... 0.300 1.750 0.498 0.073
1 1 0.882 1.100 -0.992 ... 0.180 -0.011 0.264 0.584
2 2 -0.077 1.155 -1.212 ... 0.119 -0.028 0.031 0.448
3 3 0.135 0.530 -1.034 ... 0.727 -0.230 -0.143 -0.269
4 4 0.373 0.698 -1.195 ... 0.598 0.625 -0.219 -1.006
5 5 -0.437 0.441 -1.386 ... -0.199 -0.401 -0.526 -1.124
6 6 -0.556 0.405 -0.844 ... -0.245 -0.976 -0.699 -0.504
7 7 -0.432 -0.016 -1.001 ... -0.871 -1.236 -0.884 -1.254
8 8 -0.460 0.271 -1.234 ... -0.359 -0.555 -0.778 -2.114
9 9 -0.698 -0.357 -1.269 ... -1.116 0.156 -1.182 -2.958
10 10 -0.765 -0.006 -1.326 ... -0.276 0.158 -0.917 -0.939
11 11 -0.833 -0.647 -2.124 ... -0.978 -0.572 -1.158 -1.758
12 12 -0.767 -0.282 -1.615 ... -0.444 -1.992 -0.898 -0.831
13 13 -0.356 -0.503 -1.035 ... -0.397 -0.897 -0.844 -0.425
14 14 -0.496 -0.998 -1.356 ... -0.669 -0.338 -1.145 -1.210
15 15 -0.684 -0.618 -1.060 ... -0.805 -0.373 -1.040 -0.868
16 16 -0.953 -0.547 -1.437 ... -0.504 -0.512 -0.898 -1.441
17 17 -0.869 -0.403 -1.360 ... -0.636 0.065 -1.069 -0.929
18 18 -0.831 -0.691 -1.553 ... -0.626 -0.489 -0.858 -1.033
...
47 47 0.730 -0.282 -1.019 ... -0.511 -1.127 -0.792 -0.368
48 48 -0.181 -0.145 -0.585 ... -0.939 -0.388 -1.062 -0.547
49 49 -0.144 -0.120 -0.496 ... -0.856 -1.313 -1.161 0.150
> bs <- hanaml.BSTS(data = data,
                    key = "TIME_STAMP",
                    burn = 0.6, expected.model.size = 2, niter = 2000,
                    seasonal.period = 12, seed = 1)
> bs$stats
DATA_NAME INCLUSION_PROB AVG_COEFF
0 FEATURE_08 0.48500 0.173861
1 FEATURE_01 0.40250 0.437837
2 FEATURE_07 0.24625 0.189362
3 FEATURE_09 0.23375 0.081339
4 FEATURE_02 0.19750 0.098693
5 FEATURE_04 0.14375 0.130138
6 FEATURE_06 0.14125 0.062544
7 FEATURE_10 0.10375 0.003327
8 FEATURE_03 0.08875 0.009415
9 FEATURE_05 0.08750 0.021849
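As a sanity check on the output above, the sketch below relates INCLUSION_PROB to the number of retained draws. It assumes that each inclusion probability is the share of retained draws in which the feature enters the regression; that is the usual spike-and-slab reading, not something this page states. The values are copied from the bs$stats output, and the variable names are illustrative only.

```r
# INCLUSION_PROB values copied from the bs$stats output above.
inclusion.prob <- c(
  FEATURE_08 = 0.48500, FEATURE_01 = 0.40250, FEATURE_07 = 0.24625,
  FEATURE_09 = 0.23375, FEATURE_02 = 0.19750, FEATURE_04 = 0.14375,
  FEATURE_06 = 0.14125, FEATURE_10 = 0.10375, FEATURE_03 = 0.08875,
  FEATURE_05 = 0.08750
)

# niter = 2000 and burn = 0.6 leave 800 retained draws.
n.kept <- round(2000 * (1 - 0.6))

# Under the assumption above, every probability is a whole count over n.kept.
counts <- inclusion.prob * n.kept
is.whole <- all(abs(counts - round(counts)) < 1e-8)
is.whole  # TRUE

# The probabilities sum to the average number of included features.
avg.model.size <- sum(inclusion.prob)
avg.model.size  # about 2.13, close to expected.model.size = 2
```

Every value is a multiple of 1/800, which is consistent with the 800 draws kept after burn-in.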