VectorARIMA

class hana_ml.algorithms.pal.tsa.vector_arima.VectorARIMA(order=None, seasonal_order=None, model_type=None, search_method=None, lag_num=None, max_p=None, max_q=None, max_seasonal_p=None, max_seasonal_q=None, max_lag_num=None, init_guess=None, information_criterion=None, include_mean=None, max_iter=None, finite_diff_accuracy=None, displacement=None, ftol=None, gtol=None, calculate_hessian=None, calculate_irf=None, irf_lags=None, alpha=None, output_fitted=None, thread_ratio=None)

Vector Autoregressive Integrated Moving Average ARIMA(p, d, q) model.

Parameters:
order(p, d, q), tuple of int, optional

Indicates the order (p, d, q).

  • p: value of the autoregressive order. -1 indicates auto and >=0 is user-defined.

  • d: value of the differencing order.

  • q: value of the moving average order. -1 indicates auto and >=0 is user-defined.

Defaults to (-1, 0, -1).

seasonal_order(P, D, Q, s), tuple of int, optional

Indicates the seasonal order (P, D, Q, s).

  • P: value of the autoregressive order for the seasonal part. -1 indicates auto and >=0 is user-defined.

  • D: value of the differencing order for the seasonal part.

  • Q: value of the moving average order for the seasonal part. -1 indicates auto and >=0 is user-defined.

  • s: value of the seasonal period. -1 indicates auto and >=0 is user-defined.

Defaults to (-1, 0, -1, 0).

model_type{'VAR', 'VMA', 'VARMA'}, optional

The model type.

Defaults to 'VARMA'.

search_method{'eccm', 'grid_search'}, optional

Specifies the method used to search for the orders of the model. 'eccm' is valid only when the seasonal period is less than 1.

Defaults to 'grid_search'.

lag_numint, optional

The lag number of explanatory variables. Valid only when model_type is 'VAR'.

Defaults to 4.

max_pint, optional

The maximum value of vector AR order p.

Defaults to 6 if model_type is 'VAR' or if model_type is 'VARMA' and search_method is 'eccm'.

Defaults to 2 if model_type is 'VARMA' and search_method is 'grid_search'.

max_qint, optional

The maximum value of vector MA order q.

Defaults to 8 if model_type is 'VMA'.

Defaults to 5 if model_type is 'VARMA' and search_method is 'eccm'.

Defaults to 2 if model_type is 'VARMA' and search_method is 'grid_search'.
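The conditional defaults of max_p and max_q listed above can be tabulated in a small lookup. This is a minimal sketch for illustration only: default_max_orders is a hypothetical helper, not part of hana_ml, and it returns None for combinations for which the text above states no default.

```python
def default_max_orders(model_type="VARMA", search_method="grid_search"):
    """Return (max_p, max_q) defaults as listed in the parameter descriptions.

    Hypothetical helper for illustration; not part of hana_ml.
    None means the documentation states no default for that combination.
    """
    max_p = None
    max_q = None
    varma_eccm = model_type == "VARMA" and search_method == "eccm"
    varma_grid = model_type == "VARMA" and search_method == "grid_search"

    # max_p: 6 for VAR or VARMA with eccm, 2 for VARMA with grid_search.
    if model_type == "VAR" or varma_eccm:
        max_p = 6
    elif varma_grid:
        max_p = 2

    # max_q: 8 for VMA, 5 for VARMA with eccm, 2 for VARMA with grid_search.
    if model_type == "VMA":
        max_q = 8
    elif varma_eccm:
        max_q = 5
    elif varma_grid:
        max_q = 2

    return max_p, max_q


if __name__ == "__main__":
    for combo in [("VAR", "grid_search"), ("VMA", "grid_search"),
                  ("VARMA", "eccm"), ("VARMA", "grid_search")]:
        print(combo, default_max_orders(*combo))
```

Passing max_p or max_q explicitly to the constructor overrides these defaults.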

max_seasonal_pint, optional

The maximum value of seasonal vector AR order P.

Defaults to 3 if model_type is 'VAR'.

Defaults to 1 if model_type is 'VARMA' and search_method is 'grid_search'.

max_seasonal_qint, optional

The maximum value of seasonal vector MA order Q.

Defaults to 1.

max_lag_numint, optional

The maximum lag number of explanatory variables. Valid only when model_type is 'VAR'.

Defaults to 4.

init_guess{'ARMA', 'VAR'}, optional

The model used as initial estimation for VARMA. Valid only for VARMA.

Defaults to 'VAR'.

information_criterion{'AIC', 'BIC'}, optional

Information criterion used for order selection.

Defaults to 'AIC'.

include_meanbool, optional

ARIMA model includes a constant part if True.

Valid only when d + D <= 1.

Defaults to True if d + D = 0 else False.
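The include_mean rule above depends on the total differencing order d + D. The sketch below restates it as a function; resolve_include_mean is a hypothetical helper, not part of hana_ml, and returning None when d + D > 1 is an assumption used here only to signal that the parameter does not apply.

```python
def resolve_include_mean(d, D, include_mean=None):
    """Restate the documented include_mean rule (hypothetical helper).

    Returns None when d + D > 1, where the parameter is documented as
    not valid; that return value is an assumption of this sketch.
    """
    if d + D > 1:
        return None
    if include_mean is None:
        # Documented default: True if d + D = 0, else False.
        return d + D == 0
    return bool(include_mean)


if __name__ == "__main__":
    print(resolve_include_mean(0, 0))        # no differencing, default applies
    print(resolve_include_mean(1, 0))        # one difference, default applies
    print(resolve_include_mean(0, 1, True))  # explicit user value
    print(resolve_include_mean(1, 1, True))  # d + D > 1, parameter not valid
```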

max_iterint, optional

Maximum number of iterations of L-BFGS-B optimizer. Valid only for VMA and VARMA.

Defaults to 200.

finite_diff_accuracyint, optional

Polynomial order of the finite difference used to approximate the gradient of the objective function.

The valid range is from 1 to 4.

Defaults to 1.

displacementfloat, optional

The step length for finite-difference method.

Valid only for VMA and VARMA.

Defaults to 2.2e-6.

ftolfloat, optional

Tolerance for objective convergence test.

Valid only for VMA and VARMA.

Defaults to 1e-5.

gtolfloat, optional

Tolerance for gradient convergence test.

Valid only for VMA and VARMA.

Defaults to 1e-5.

calculate_hessianbool, optional

Specifies whether to calculate the Hessian matrix.

VMA and VARMA will output standard error of parameter estimates only when calculate_hessian is True.

Defaults to False.

calculate_irfbool, optional

Specifies whether to calculate impulse response function.

Defaults to False.

irf_lagsint, optional

The number of lags of the IRF to be calculated.

Valid only when calculate_irf is True.

Defaults to 8.

alphafloat, optional

Type-I error used in the Ljung-Box tests and eccm.

Defaults to 0.05.

output_fittedbool, optional

Output fitted result and residuals if True.

Defaults to True.

thread_ratiofloat, optional

Controls the proportion of available threads to use.

  • 0: single thread

  • 0~1: percentage

  • Others: heuristically determined

Defaults to -1.

Examples

Vector ARIMA example:

Input dataframe df:

>>> df.collect()
   TIMESTAMP   Y1        X       Y2
0          1  9.8      6.4      8.2
1          2  9.7      6.4      8.1
2          3  9.8      6.3        8
3          4  9.7      6.2      7.9
4          5  9.6      6.3      7.8
5          6  9.6      6.8      7.6
6          7  9.6      6.8      7.5
7          8    9      6.8      7.5
8          9  9.2      6.8      7.4
9         10  9.2      6.7      7.5
10        11  9.1      6.6      7.6
11        12    9      6.6      7.5
12        13  8.8        6      7.2
13        14  8.8        6      7.7
14        15  8.7      5.9        7
15        16  8.3      5.8      6.5
16        17  8.2      5.9      6.4
17        18  8.2      6.3      6.3
18        19  8.2      6.3      6.1
19        20  8.4      6.4        6
20        21  8.1      6.4      6.1
21        22  7.8      6.5        6
22        23  7.7      6.5      5.9
23        24  7.5      6.3      5.9
24        25  7.2      6.5      5.7
25        26  7.2      6.4      5.8
26        27    7      6.3      5.8
27        28    7        6      5.5
28        29  6.9      6.2      5.4
29        30    7      5.9      5.4
30        31  7.1        6      5.3
31        32  7.4        6      5.4
32        33  6.9      5.8      5.5
33        34  6.8      5.8      5.4
34        35    7      5.6      5.4
35        36  7.1      5.6      5.4
36        37    7      5.3      5.7
37        38    7      5.3      5.6
38        39  7.2      5.4      5.5
39        40  7.6      5.5      5.8

Create a VectorARIMA instance:

>>> varima = VectorARIMA(model_type='VAR', calculate_irf=True)

Perform fit on the given data:

>>> varima.fit(data=df, endog=['Y1', 'Y2'], exog='X')

Expected output:

>>> varima.model_.head(5).collect()
   CONTENT_INDEX                                      CONTENT_VALUE
0              0                                    {"model":"VAR"}
1              1                                 {"exogCols":["X"]}
2              2                          {"endogCols":["Y1","Y2"]}
3              3  {"D":0,"P":0,"c":1,"d":0,"k":2,"m":2,"nT":40,"...
4              4                        {"AIC":-6.6759375491341144}
>>> varima.fitted_.head(3).collect()
  NAMECOL    IDX   FITTING      RESIDUAL
0      Y1      1       NaN           NaN
1      Y1      2       NaN           NaN
2      Y1      3  9.622092      0.177908
>>> varima.irf_.head(3).collect()
  COL1    COL2    IDX   RESPONSE
0   Y1      X       0   0.243569
1   Y1      X       1   0.139749
2   Y1      X       2  -0.351429
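In the fitted_ output above, RESIDUAL appears to be the observed value minus FITTING. The text does not state this; it is an inference from the example numbers, checked here against the input row with TIMESTAMP 3 (Y1 = 9.8).

```python
# Values copied from the example above.
observed_y1 = 9.8     # Y1 at TIMESTAMP 3 in the input data
fitting = 9.622092    # FITTING for Y1 at IDX 3
residual = 0.177908   # RESIDUAL for Y1 at IDX 3

# Assumption (not stated in the text): residual = observed - fitted.
recomputed = observed_y1 - fitting
matches = abs(recomputed - residual) < 1e-6

print(round(recomputed, 6), matches)
```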

Perform predict on the model:

>>> pred_df.collect()
  TIMESTAMP           X
0        41         5.2
1        42         5.2
2        43         5.2
3        44         5.2
4        45         5.7
>>> result_dict, result_all = varima.predict(pred_df)

Expected output:

>>> result_dict['Y1'].head(5).collect()
   IDX  FORECAST          SE        LO95        HI95
0   41  7.577883    0.172352    7.240072    7.915694
1   42  7.202759    0.233421    6.745254    7.660264
2   43  7.074507    0.279358    6.526966    7.622049
3   44  6.856650    0.316641    6.236034    7.477265
4   45  6.773185    0.347997    6.091110    7.455259
>>> result_dict['Y2'].head(5).collect()
   IDX  FORECAST          SE        LO95        HI95
0   41  5.822953    0.171752    5.486320    6.159586
1   42  5.837502    0.216817    5.412541    6.262464
2   43  5.577920    0.249243    5.089403    6.066437
3   44  5.395543    0.275731    4.855109    5.935976
4   45  5.141598    0.298299    4.556933    5.726263
>>> result_all.head(10).collect()
   COLNAME     IDX  FORECAST          SE        LO95        HI95
0       Y1      41  7.577883    0.172352    7.240072    7.915694
1       Y1      42  7.202759    0.233421    6.745254    7.660264
2       Y1      43  7.074507    0.279358    6.526966    7.622049
3       Y1      44  6.856650    0.316641    6.236034    7.477265
4       Y1      45  6.773185    0.347997    6.091110    7.455259
5       Y2      41  5.822953    0.171752    5.486320    6.159586
6       Y2      42  5.837502    0.216817    5.412541    6.262464
7       Y2      43  5.577920    0.249243    5.089403    6.066437
8       Y2      44  5.395543    0.275731    4.855109    5.935976
9       Y2      45  5.141598    0.298299    4.556933    5.726263
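The LO95 and HI95 columns in the output above are numerically consistent with FORECAST ± 1.96 × SE, that is, a two-sided 95% interval under a normal approximation. The text does not state the formula, so treat the following as a check on the example numbers rather than a description of the underlying procedure.

```python
# (FORECAST, SE, LO95, HI95) for Y1, copied from the output above.
rows = [
    (7.577883, 0.172352, 7.240072, 7.915694),
    (7.202759, 0.233421, 6.745254, 7.660264),
    (7.074507, 0.279358, 6.526966, 7.622049),
]

# Assumption: bounds are FORECAST -/+ z * SE with z = 1.96.
Z_95 = 1.96


def interval(forecast, se, z=Z_95):
    """Return (lower, upper) bounds under the normal approximation."""
    return forecast - z * se, forecast + z * se


# Largest absolute difference between recomputed and reported bounds.
max_gap = 0.0
for forecast, se, lo95, hi95 in rows:
    lo, hi = interval(forecast, se)
    max_gap = max(max_gap, abs(lo - lo95), abs(hi - hi95))

print(max_gap < 1e-5)
```

The remaining gap comes from rounding in the printed SE values.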
Attributes:
model_DataFrame

Model content.

fitted_DataFrame

Fitted values and residuals.

irf_DataFrame

Impulse response function.

Methods

fit(data[, key, endog, exog])

Generates ARIMA models with given parameters.

predict([data, key, forecast_length, ...])

Makes time series forecast based on the estimated ARIMA model.

set_conn(connection_context)

Sets the connection context for the instance.

fit(data, key=None, endog=None, exog=None)

Generates ARIMA models with given parameters.

Parameters:
dataDataFrame

DataFrame that includes the key column and the endogenous variables, and may also contain exogenous variables.

keystr, optional

The timestamp column of data. The type of key column should be INTEGER, TIMESTAMP, DATE, or SECONDDATE.

Defaults to the first column of data if the index column of data is not provided. Otherwise, defaults to the index of data.

endoglist of str, optional

The endogenous variables, i.e. time series. The type of endog column can be INTEGER, DOUBLE or DECIMAL(p,s).

Defaults to all non-key and non-exog columns of data if not provided.

exoglist of str, optional

An optional array of exogenous variables. The type of exog column can be INTEGER, DOUBLE or DECIMAL(p,s).

Defaults to None.
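The defaults for key, endog and exog described above can be sketched as a column-resolution step. resolve_fit_columns is a hypothetical helper, not part of hana_ml; it assumes column names arrive as a list of strings and that a single exog name may be passed as a plain string, as in the example call fit(data=df, endog=['Y1', 'Y2'], exog='X').

```python
def resolve_fit_columns(columns, key=None, endog=None, exog=None, index=None):
    """Mirror the documented defaults of fit() (hypothetical helper).

    columns: ordered column names of the input data.
    index:   name of the index column of the data, if one is set.
    """
    # key: the index column if provided, otherwise the first column.
    if key is None:
        key = index if index is not None else columns[0]

    # exog: optional; accept a single name or a list of names.
    if exog is None:
        exog = []
    elif isinstance(exog, str):
        exog = [exog]
    else:
        exog = list(exog)

    # endog: all non-key, non-exog columns when not provided.
    if endog is None:
        endog = [c for c in columns if c != key and c not in exog]
    elif isinstance(endog, str):
        endog = [endog]
    else:
        endog = list(endog)

    return key, endog, exog


if __name__ == "__main__":
    cols = ["TIMESTAMP", "Y1", "X", "Y2"]
    print(resolve_fit_columns(cols, exog="X"))
```

With the example data, leaving endog unset and passing exog='X' selects Y1 and Y2 as the endogenous series, the same columns the example names explicitly.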

Returns:
A fitted object of class "VectorARIMA".
set_conn(connection_context)

Sets the connection context for the instance.

Parameters:
connection_contextConnectionContext

The connection to the SAP HANA system.

Returns:
None.
predict(data=None, key=None, forecast_length=None, allow_new_index=False)

Makes time series forecast based on the estimated ARIMA model.

Parameters:
dataDataFrame, optional

Index and exogenous variables for forecast. The structure is as follows:

  • First column: Index (ID), int.

  • Other columns: exogenous variables, with type INTEGER, DOUBLE or DECIMAL(p,s).

Defaults to None.

keystr, optional

The timestamp column of data. The type of key column is int.

Defaults to the first column of data if the index column of data is not provided. Otherwise, defaults to the index column of data.

forecast_lengthint, optional

Number of points to forecast. Valid only when data is not provided.

Defaults to None.

allow_new_indexbool, optional

Indicates whether a new index column is allowed in the result.

  • True: return the result with new integer or timestamp index column.

  • False: return the result with index column starting from 0.

Defaults to False.

Returns:
Dict of DataFrames

A collection of forecasted values, keyed by the name of the endogenous column. Each DataFrame is structured as follows:

  • ID, type INTEGER, timestamp.

  • FORECAST, type DOUBLE, forecast value.

  • SE, type DOUBLE, standard error.

  • LO95, type DOUBLE, low 95% value.

  • HI95, type DOUBLE, high 95% value.

DataFrame

The aggregated forecasted values for all endogenous variables, structured as follows:

  • COLNAME, type NVARCHAR(5000), name of the endogenous variable.

  • ID, type INTEGER, timestamp.

  • FORECAST, type DOUBLE, forecast value.

  • SE, type DOUBLE, standard error.

  • LO95, type DOUBLE, low 95% value.

  • HI95, type DOUBLE, high 95% value.
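The two return values hold the same forecasts in different shapes: one table per endogenous variable, and one aggregated table with a COLNAME column. The sketch below regroups aggregated rows by COLNAME using plain Python tuples. It only illustrates the relationship between the two shapes; it does not use hana_ml, and the sample rows are copied from the example output earlier on this page.

```python
from collections import defaultdict

# Rows shaped like the aggregated result:
# (COLNAME, ID, FORECAST, SE, LO95, HI95)
aggregated = [
    ("Y1", 41, 7.577883, 0.172352, 7.240072, 7.915694),
    ("Y1", 42, 7.202759, 0.233421, 6.745254, 7.660264),
    ("Y2", 41, 5.822953, 0.171752, 5.486320, 6.159586),
    ("Y2", 42, 5.837502, 0.216817, 5.412541, 6.262464),
]

# Group by COLNAME, dropping that column from each row so every group
# has the per-variable layout (ID, FORECAST, SE, LO95, HI95).
per_series = defaultdict(list)
for colname, *rest in aggregated:
    per_series[colname].append(tuple(rest))

for name in sorted(per_series):
    print(name, per_series[name])
```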

property fit_hdbprocedure

Returns the generated hdbprocedure for fit.

property predict_hdbprocedure

Returns the generated hdbprocedure for predict.

Inherited Methods from PALBase

Besides the methods mentioned above, the VectorARIMA class also inherits methods from the PALBase class. Please refer to PAL Base for more details.