stationarity_test

hana_ml.algorithms.pal.tsa.stationarity_test.stationarity_test(data, key=None, endog=None, method=None, mode=None, lag=None, probability=None)

A time series is stationary when its mean and variance are constant over time. Many time series models require stationary input to produce a meaningful analysis; this function tests whether a series is stationary.
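As a rough illustration of the definition above (not part of hana_ml), the following pure-Python sketch compares the mean of the first and second half of two synthetic series. The helper name `half_means`, the seed, and the trend slope are assumptions chosen only for this example.

```python
import random
from statistics import mean


def half_means(series):
    """Return the mean of the first half and of the second half of a series."""
    mid = len(series) // 2
    return mean(series[:mid]), mean(series[mid:])


rng = random.Random(0)

# White noise: constant mean and constant variance, hence stationary.
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# The same noise on top of a linear trend: the mean drifts over time.
trend = [0.01 * t + e for t, e in enumerate(noise)]

noise_first, noise_second = half_means(noise)
trend_first, trend_second = half_means(trend)

noise_gap = abs(noise_first - noise_second)  # small for the stationary series
trend_gap = abs(trend_first - trend_second)  # large for the trending series

print(f"white noise gap: {noise_gap:.3f}")
print(f"trending gap:    {trend_gap:.3f}")
```

A large gap between the two halves is only an informal hint of non-stationarity; the statistical tests offered by this function ("kpss" and "adf") are the proper way to decide.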

Parameters
data : DataFrame

Input data containing at least two columns: an ID column and a column of raw data.

key : str, optional

The ID (time stamp) column. The IDs do not need to be in order, but they must be unique and equally spaced. The supported data type is INTEGER.

Defaults to the first column of data if the index column of data is not provided. Otherwise, defaults to the index column of data.

endog : str, optional

The column containing the series to be tested.

Defaults to the first non-key column.

method : str, optional

Statistical test used to determine stationarity. The options are "kpss" and "adf".

Defaults to "kpss".

mode : str, optional

Type of stationarity to determine. The options are "level", "trend" and "no". Note that option "no" is not applicable to "kpss".

Defaults to "level".

lag : int, optional

The lag order used to calculate the test statistic.

Defaults to int(12*(data_length / 100)^0.25) for "kpss" and int(4*(data_length / 100)^(2/9)) for "adf".

probability : float, optional

The confidence level for confirming stationarity.

Defaults to 0.9.
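The parameter rules above can be sketched as two small client-side helpers. This is a minimal sketch, not part of hana_ml: `check_method_mode` and `default_lag` are hypothetical names, and the lag formulas are transcribed from the description of `lag` above, so the values computed inside the library may differ in edge cases.

```python
VALID_METHODS = {"kpss", "adf"}
VALID_MODES = {"level", "trend", "no"}


def check_method_mode(method="kpss", mode="level"):
    """Reject combinations that the parameter description rules out.

    Hypothetical helper: mirrors the documented options only.
    """
    if method not in VALID_METHODS:
        raise ValueError(f"method must be one of {sorted(VALID_METHODS)}")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    if method == "kpss" and mode == "no":
        # The description states that "no" is not applicable to "kpss".
        raise ValueError('mode "no" is not applicable to method "kpss"')
    return method, mode


def default_lag(data_length, method="kpss"):
    """Default lag order, following the formulas in the description of lag."""
    if method == "kpss":
        return int(12 * (data_length / 100) ** 0.25)
    if method == "adf":
        return int(4 * (data_length / 100) ** (2 / 9))
    raise ValueError(f"unknown method: {method!r}")


print(default_lag(100, "kpss"))  # 12
print(default_lag(100, "adf"))   # 4
```

Under these formulas the default lag grows slowly with the series length, so a series ten times longer gets a lag that is less than twice as large.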

Returns
DataFrame
Statistics of the time series, structured as follows:
  • STATS_NAME: Name of the statistic.

  • STATS_VALUE: Value of the corresponding statistic.

Examples

Time series data df:

>>> df.head(3).collect()
       TIME_STAMP  SERIES
0      0           0.0
1      1           1.00
2      2           1586.00

Perform stationarity_test():

>>> from hana_ml.algorithms.pal.tsa.stationarity_test import stationarity_test
>>> stats = stationarity_test(df, endog='SERIES', key='TIME_STAMP',
...                           method='kpss', mode='trend', lag=5, probability=0.95)

Outputs:

>>> stats.head(3).collect()
     STATS_NAME     STATS_VALUE
0    stationary     0
1    kpss_stat      0.26801
2    p-value        0.01
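The name/value layout of the result can be turned into a lookup for further processing. The sketch below works on rows copied from the output above and makes two assumptions that the description does not state: that a `stationary` value of 1 means the series was judged stationary (and 0 that it was not), and that every STATS_VALUE can be converted with `float()`. The function name `summarize` is hypothetical.

```python
# Rows copied from the example output as (STATS_NAME, STATS_VALUE) pairs.
rows = [
    ("stationary", 0),
    ("kpss_stat", 0.26801),
    ("p-value", 0.01),
]


def summarize(rows):
    """Build a dictionary from (name, value) rows of the result table."""
    stats = {name: float(value) for name, value in rows}
    return {
        # Assumption: 1 means stationary, 0 means not stationary.
        "is_stationary": stats["stationary"] == 1.0,
        "kpss_stat": stats["kpss_stat"],
        "p_value": stats["p-value"],
    }


summary = summarize(rows)
print(summary["is_stationary"])  # False
```

With a collected result, the same rows could be obtained with something like `zip(result["STATS_NAME"], result["STATS_VALUE"])`, where `result = stats.collect()` is a pandas DataFrame.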