hanaml.AutoARIMA.Rd
hanaml.AutoARIMA is an R wrapper for the SAP HANA PAL Auto ARIMA algorithm.
hanaml.AutoARIMA(
data = NULL,
key = NULL,
endog = NULL,
exog = NULL,
seasonal.period = NULL,
seasonality.criterion = NULL,
d = NULL,
kpss.significance.level = NULL,
max.d = NULL,
seasonal.d = NULL,
ch.significance.level = NULL,
max.seasonal.d = NULL,
max.p = NULL,
max.q = NULL,
max.seasonal.p = NULL,
max.seasonal.q = NULL,
information.criterion = NULL,
search.strategy = NULL,
max.order = NULL,
initial.p = NULL,
initial.q = NULL,
initial.seasonal.p = NULL,
initial.seasonal.q = NULL,
guess.states = NULL,
max.search.iterations = NULL,
method = NULL,
allow.linear = NULL,
forecast.method = NULL,
output.fitted = NULL,
thread.ratio = NULL,
background.size = NULL,
massive = FALSE,
group.key = NULL,
group.params = NULL
)
DataFrame
DataFrame containing the data.
character, optional
Name of the ID column.
Defaults to the first column if not provided.
character, optional
The endogenous variable, i.e. time series.
Defaults to the first non-ID column.
list of characters, optional
An optional array of exogenous variables.
Valid only for ARIMAX; cannot include the ID column's name
or the name of the endog column.
Defaults to NULL.
integer, optional
Value of the seasonal period.
Negative: Automatically identifies seasonality by means of an auto-correlation scheme.
0 or 1: Non-seasonal.
Others: Seasonal period.
Defaults to -1.
double, optional
The threshold on the auto-correlation coefficient for accepting seasonality,
in the range of (0, 1). The larger it is, the less likely a time series is
to be regarded as seasonal.
Valid only when seasonal.period is negative.
Defaults to 0.2.
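The idea of accepting seasonality only when an auto-correlation coefficient exceeds a threshold can be sketched in base R. This is a hypothetical illustration on synthetic data, not the scheme PAL actually uses; the variable names (criterion, best.lag, detected.period) and the choice to skip lag 1 are assumptions made for the example.

```r
# Hypothetical illustration only; PAL's internal detection scheme may differ.
t <- 1:120
y <- sin(2 * pi * t / 12)            # synthetic series with a 12-step cycle

criterion <- 0.2                     # plays the role of seasonality.criterion

# Sample auto-correlations for lags 1..36 (the first element, lag 0, is dropped).
r <- acf(y, lag.max = 36, plot = FALSE)$acf[-1]

# Assumption: skip lag 1, which mostly reflects smoothness rather than seasonality.
candidate.lags <- 2:36
best.lag <- candidate.lags[which.max(r[candidate.lags])]

# Accept the lag as a seasonal period only if its coefficient exceeds the threshold.
is.seasonal <- r[best.lag] > criterion
detected.period <- if (is.seasonal) best.lag else 1
```

Raising criterion toward 1 makes the acceptance test stricter, which matches the documented behaviour that larger values make a series less likely to be regarded as seasonal.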
integer, optional
Order of first-differencing.
Negative: Automatically identifies the first-differencing order with the KPSS test.
Others: Uses the specified value as the first-differencing order.
Defaults to -1.
double, optional
The significance level for the KPSS test. Supported values are 0.01, 0.025, 0.05, and 0.1.
The smaller it is, the more likely a time series is to be considered first-stationary,
that is, the less likely it is to need first-differencing.
Valid only when d is negative.
Defaults to 0.05.
integer, optional
The maximum value of d when the KPSS test is applied.
Defaults to 2.
integer, optional
Order of seasonal-differencing.
Negative: Automatically identifies the seasonal-differencing order with the Canova-Hansen test.
Others: Uses the specified value as the seasonal-differencing order.
Defaults to -1.
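What the first-differencing order (d) and the seasonal-differencing order (seasonal.d) mean can be shown with base R's diff(). This is a small local sketch on made-up vectors; it does not call hanaml or PAL, and the values and the period m = 4 are chosen only for illustration.

```r
# First-differencing: each pass replaces the series by y[t] - y[t-1].
y <- c(1, 3, 6, 10, 15, 21)
d1 <- diff(y, differences = 1)   # order d = 1
d2 <- diff(y, differences = 2)   # order d = 2, i.e. differencing applied twice

# Seasonal differencing with period m: s[t] - s[t-m].
m <- 4
s <- c(10, 21, 32, 43, 14, 25, 36, 47, 18, 29, 40, 51)
sd1 <- diff(s, lag = m)          # seasonal order D = 1

# Each differencing pass shortens the series (by 1, or by m for seasonal).
lengths <- c(length(d1), length(d2), length(sd1))
```

Here d2 is constant and sd1 is constant, so one more differencing pass would add nothing; automatic selection through the KPSS or Canova-Hansen test aims to stop at that point.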
double, optional
The significance level for the Canova-Hansen test. Supported values are 0.01, 0.025,
0.05, 0.1, and 0.2. The smaller it is, the more likely a time series
is to be considered seasonal-stationary, that is, the less likely it is to need
seasonal-differencing.
Valid only when seasonal.d is negative.
Defaults to 0.05.
integer, optional
The maximum value of seasonal.d when the Canova-Hansen test is applied.
Defaults to 1.
integer, optional
The maximum value of AR order p.
Defaults to 5.
integer, optional
The maximum value of MA order q.
Defaults to 5.
integer, optional
The maximum value of SAR order P.
Defaults to 2.
integer, optional
The maximum value of SMA order Q.
Defaults to 2.
character, optional
The information criterion for order selection: "aicc", "aic", "bic".
Defaults to "aicc".
character, optional
The search strategy for the optimal ARMA model:
"exhaustive": Exhaustive traverse.
"stepwise": Stepwise traverse.
Defaults to "stepwise".
integer, optional
The maximum value of max.p + max.q + max.seasonal.p + max.seasonal.q.
Valid only when search.strategy is "exhaustive".
Defaults to 15.
integer, optional
Initial value of p. Valid only when search.strategy is "stepwise".
Defaults to 0.
integer, optional
Initial value of q. Valid only when search.strategy is "stepwise".
Defaults to 0.
integer, optional
Initial value of seasonal.p. Valid only when search.strategy is "stepwise".
Defaults to 0.
integer, optional
Initial value of seasonal.q. Valid only when search.strategy is "stepwise".
Defaults to 0.
integer or logical, optional
Whether to employ ACF/PACF to guess initial ARMA models, in addition to the user-defined model:
0 or FALSE: No guess. In addition to the user-defined model, uses states (2, 2) (1, 1)m, (1, 0) (1, 0)m,
and (0, 1) (0, 1)m as starting states.
1 or TRUE: Guesses starting states by taking advantage of ACF/PACF.
Valid only when search.strategy is "stepwise".
Defaults to 1.
integer, optional
The maximum number of iterations for searching optimal ARMA states.
Valid only when search.strategy is "stepwise".
Defaults to (max.p + 1) * (max.q + 1) * (max.seasonal.p + 1) * (max.seasonal.q + 1).
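As a quick check of the default formula, the sketch below evaluates it for the documented defaults (max.p = 5, max.q = 5, max.seasonal.p = 2, max.seasonal.q = 2) and compares it with the size of the full order grid. It is plain base R arithmetic; the names default.iterations and grid are introduced only for this example.

```r
# Documented default upper bounds for the orders.
max.p <- 5
max.q <- 5
max.seasonal.p <- 2
max.seasonal.q <- 2

# Default for max.search.iterations as given by the formula above.
default.iterations <- (max.p + 1) * (max.q + 1) *
  (max.seasonal.p + 1) * (max.seasonal.q + 1)

# The same number is the count of all (p, q, P, Q) combinations.
grid <- expand.grid(p = 0:max.p, q = 0:max.q,
                    P = 0:max.seasonal.p, Q = 0:max.seasonal.q)
n.candidates <- nrow(grid)
```

With the defaults this gives 324, so the default iteration limit never cuts the stepwise search short before every order combination could in principle be visited.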
{"css", "mle", "css-mle"}, optional
The objective function for numeric optimization.
"css":
use the conditional sum of squares.
"mle":
use maximum likelihood estimation.
"css-mle":
use css to approximate starting values and mle to fit.
Defaults to "css-mle".
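The three options have close counterparts in base R's stats::arima(), which accepts method = "CSS", "ML", and "CSS-ML". The sketch below is a local analogue for intuition only: it fits an AR(1) model to the built-in lh dataset and does not call PAL, whose optimizer and results may differ.

```r
# Local analogue using stats::arima on the built-in 'lh' series (not PAL).
fit.css    <- arima(lh, order = c(1, 0, 0), method = "CSS")     # conditional sum of squares
fit.ml     <- arima(lh, order = c(1, 0, 0), method = "ML")      # maximum likelihood
fit.css.ml <- arima(lh, order = c(1, 0, 0), method = "CSS-ML")  # CSS start values, ML fit

# Collect the coefficient estimates side by side for comparison.
estimates <- rbind(css = coef(fit.css),
                   mle = coef(fit.ml),
                   `css-mle` = coef(fit.css.ml))
print(estimates)
```

In stats::arima() a pure CSS fit reports no likelihood-based AIC (fit.css$aic is NA), which is one practical reason to prefer a likelihood-based fit when models are compared by an information criterion.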
integer or logical, optional
Controls whether to check linear model ARMA (0, 0) (0, 0)m.
0 or FALSE: No
1 or TRUE: Yes
Defaults to 1.
{"formula.forecast", "innovations.algorithm"}, optional
Stores information for the subsequent forecast method.
"formula.forecast":
compute future series via formula.
"innovations.algorithm":
apply the innovations algorithm to compute future
series, which requires more original information to be stored.
Defaults to "innovations.algorithm".
logical, optional
Output fitted result and residuals if TRUE.
Defaults to TRUE.
double, optional
Controls the proportion of available threads that can be used by this
function.
The value range is from 0 to 1, where 0 indicates a single thread,
and 1 indicates all available threads.
Values between 0 and 1 will use up to
that percentage of available threads. Values outside this
range are ignored.
Defaults to 0.
integer, optional
Indicates the number of data points used in hanaml.ARIMA
with explanations in the predict.ARIMA function.
To use ARIMA with explanations, set background.size to a value
larger than 0, or to -1 (auto mode),
when initializing a hanaml.ARIMA instance, and
then set show.explainer=TRUE in the predict function.
Defaults to NULL (no explanations).
logical, optional
Specifies whether or not to use massive mode.
For parameter setting in massive mode, you can use either
group.params (please see the example below) or the original parameters.
The original parameters apply to all groups. However, if you define some parameters for a group,
none of the original parameter settings apply to that group.
An example is as follows:
> ad <- hanaml.AutoARIMA(data=df,
massive=TRUE,
background.size=5,
group.key='ID',
group.params=list('Group_1'=list('allow.linear'=FALSE)))
In this example, as 'allow.linear' is set in group.params for Group_1, the parameter setting of 'background.size' is not applicable to Group_1.
Defaults to FALSE.
character, optional
The column of group key. The data type can be INT or NVARCHAR/VARCHAR.
If data type is INT, only parameters set in the group.params are valid.
This parameter is only valid when massive is TRUE.
Defaults to the first column of data if group.key is not provided.
list, optional
If the massive mode is activated (massive = TRUE),
input data shall be divided into different groups with different parameters applied.
An example is as follows:
> ad <- hanaml.AutoARIMA(data=df,
massive=TRUE,
background.size=5,
group.key='ID',
group.params=list("Group_1"=list("allow.linear"=FALSE)))
Valid only when massive is TRUE and defaults to NULL.
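The override rule for massive mode (a group with its own entry in group.params ignores the original parameters entirely) can be sketched with plain R lists. The helper resolve.params below is hypothetical and is not part of hanaml; it only mirrors the behaviour described in the text.

```r
# Hypothetical helper, not a hanaml function: picks the settings for one group.
resolve.params <- function(group, global.params, group.params) {
  own <- group.params[[group]]
  # A group with its own settings uses only those; the global ones are dropped.
  if (!is.null(own)) own else global.params
}

global.params <- list(background.size = 5)
group.params  <- list("Group_1" = list("allow.linear" = FALSE))

p1 <- resolve.params("Group_1", global.params, group.params)
p2 <- resolve.params("Group_2", global.params, group.params)
```

Here p1 contains only allow.linear, so background.size is not applied to Group_1, while p2 falls back to the original parameters. To keep background.size for Group_1 it would have to be repeated inside that group's list.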
Returns a "hanaml.AutoARIMA" object with the following attributes:
model: DataFrame
Fitted model.
fitted: DataFrame
Predicted dependent variable values for training data.
Set to NULL if the training data has no row IDs.
explainer: DataFrame
Explanations with the decomposition into trend, seasonal, transitory, and irregular components,
and the reason code of exogenous variables.
This attribute is returned only when background.size is set when initializing a hanaml.ARIMA instance
and show.explainer=TRUE is set in the predict function.
error.msg : DataFrame
Error message. Only valid if massive is TRUE.
The hanaml.AutoARIMA function identifies the orders of an ARIMA model (p, d, q)(P, D, Q)m, where m is the seasonal period, according to an information criterion such as AICc, AIC, or BIC. If order selection succeeds, the function gives the optimal model as in the ARIMATRAIN function.
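For reference, the three criteria can be computed by hand from a fitted model's log-likelihood. The sketch below uses the standard textbook definitions with base R's stats::arima() on the built-in lh dataset; it is a local illustration, and the exact quantities PAL computes internally are not confirmed here.

```r
# Fit a small model locally (stats::arima, not PAL) to obtain a log-likelihood.
fit <- arima(lh, order = c(1, 0, 0), method = "ML")

ll <- as.numeric(logLik(fit))
k  <- attr(logLik(fit), "df")   # estimated parameters, including the innovation variance
n  <- fit$nobs                  # number of observations used in the fit

# Textbook definitions of the three criteria.
aic  <- -2 * ll + 2 * k
bic  <- -2 * ll + k * log(n)
aicc <- aic + 2 * k * (k + 1) / (n - k - 1)   # small-sample correction of AIC

criteria <- c(aicc = aicc, aic = aic, bic = bic)
print(criteria)
```

Lower values indicate a better trade-off between fit and complexity. AICc adds a penalty that grows as k approaches n, which makes it the more cautious choice for short series.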
Input DataFrame data:
> data$Collect()
TIMESTAMP Y
1 1 -24.525
2 2 34.720
3 3 57.325
4 4 10.340
5 5 -12.890
......
Invoke the function:
> autoarima <- hanaml.AutoARIMA(data=data, search.strategy="stepwise")
Output:
> autoarima$fitted
TIMESTAMP FITTED RESIDUALS
1 1 NA NA
2 2 NA NA
3 3 NA NA
4 4 NA NA
5 5 -24.5250000 11.63500000
6 6 37.5839311 1.46106885
7 7 57.9926243 -0.69262431
8 8 8.6228706 -1.88787060
9 9 -20.3259208 0.96092077