AutoExponentialSmoothing

class hana_ml.algorithms.pal.tsa.exponential_smoothing.AutoExponentialSmoothing(model_selection=None, forecast_model_name=None, optimizer_time_budget=None, max_iter=None, optimizer_random_seed=None, thread_ratio=None, alpha=None, beta=None, gamma=None, phi=None, forecast_num=None, seasonal_period=None, seasonal=None, initial_method=None, training_ratio=None, damped=None, accuracy_measure=None, seasonality_criterion=None, trend_test_method=None, trend_test_alpha=None, alpha_min=None, beta_min=None, gamma_min=None, phi_min=None, alpha_max=None, beta_max=None, gamma_max=None, phi_max=None, prediction_confidence_1=None, prediction_confidence_2=None, level_start=None, trend_start=None, season_start=None, expost_flag=None)

Auto exponential smoothing (previously named forecast smoothing) is used to calculate optimal parameters of a set of smoothing functions in SAP HANA PAL, including Single Exponential Smoothing, Double Exponential Smoothing, and Triple Exponential Smoothing.

Parameters

model_selection : bool, optional

Specifies whether the algorithm performs model selection.

True: the algorithm will select the best model among the Single, Double, Triple, Damped Double, and Damped Triple Exponential Smoothing models.

False: the algorithm will not perform the model selection.

If forecast_model_name is set, the model defined by forecast_model_name will be used.

Defaults to False.

forecast_model_name : str, optional

Name of the statistical model used for calculating the forecast.

'SESM': Single Exponential Smoothing.
'DESM': Double Exponential Smoothing.
'TESM': Triple Exponential Smoothing.

This parameter must be set unless model_selection is set to True.

optimizer_time_budget : int, optional

Time budget for Nelder-Mead optimization process.

The time unit is second and the value should be larger than zero.

Defaults to 1.

max_iter : int, optional

Maximum number of iterations for simulated annealing.

Defaults to 100.

optimizer_random_seed : int, optional

Random seed for simulated annealing.

The value should be larger than zero.

Defaults to system time.

thread_ratio : float, optional

Controls the proportion of available threads to use.

0: single thread.

0~1: percentage.

Others: heuristically determined.

Defaults to 1.0.

alpha : float, optional

Weight for smoothing. Value range: 0 < alpha < 1.

Default value is computed automatically.
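To make the role of alpha concrete, the following is a minimal sketch of the textbook single exponential smoothing recursion. It is not PAL's implementation: the function name `ses_smooth` and the choice to initialize the level with the first observation are assumptions made for illustration only.

```python
def ses_smooth(series, alpha):
    """Textbook single exponential smoothing (illustrative, not PAL's code)."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    level = series[0]  # assumption: start the level at the first observation
    levels = [level]
    for y in series[1:]:
        # New level is a weighted average of the new observation and the old level.
        level = alpha * y + (1 - alpha) * level
        levels.append(level)
    return levels


# The first five values of the example series used later on this page.
smoothed = ses_smooth([362, 385, 432, 341, 382], alpha=0.4)
```

A larger alpha makes the level follow recent observations more closely, while a smaller alpha gives a smoother level.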

beta : float, optional

Weight for the trend component. Value range: 0 <= beta < 1.

If it is not set, the optimized value will be computed automatically.

Only valid when the model is set by user or identified by the algorithm as 'DESM' or 'TESM'.

Value 0 is allowed under TESM model only.

Default value is computed automatically.

gamma : float, optional

Weight for the seasonal component. Value range: 0 < gamma < 1. Only valid when the model is set by user or identified by the algorithm as TESM.

Default value is computed automatically.

phi : float, optional

Value of the damped smoothing constant phi (0 < phi < 1). Only valid when the model is set by user or identified by the algorithm as a damped model.

Default value is computed automatically.

forecast_num : int, optional

Number of values to be forecast. Defaults to 0.

seasonal_period : int, optional

Length L of a seasonal period (L > 1).

For example, the seasonal_period of quarterly data is 4, and the seasonal_period of monthly data is 12.

Only valid when the model is set by user or identified by the algorithm as 'TESM'.

Default value is computed automatically.

seasonal : {'multiplicative', 'additive'}, optional

Specifies the type of model for triple exponential smoothing.

'multiplicative': Multiplicative triple exponential smoothing.

'additive': Additive triple exponential smoothing.

When seasonal is set to 'additive', the default value of initial_method is 1; when seasonal is set to 'multiplicative', the default value of initial_method is 0.

Defaults to 'multiplicative'.
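The difference between the two seasonal types can be sketched with the textbook Holt-Winters point forecast. This is an illustration under stated assumptions, not PAL's code: the function name `tesm_point_forecast` and its arguments are hypothetical, and the level, trend, and seasonal values are assumed to have been estimated already.

```python
def tesm_point_forecast(level, trend, season_value, h, seasonal="multiplicative"):
    """Textbook h-step-ahead Holt-Winters point forecast (illustrative only)."""
    base = level + h * trend  # non-seasonal part: level plus h steps of trend
    if seasonal == "multiplicative":
        # The seasonal component scales the base forecast.
        return base * season_value
    if seasonal == "additive":
        # The seasonal component is added to the base forecast.
        return base + season_value
    raise ValueError("seasonal must be 'multiplicative' or 'additive'")


multiplicative = tesm_point_forecast(100.0, 2.0, 1.1, h=3)
additive = tesm_point_forecast(100.0, 2.0, 5.0, h=3, seasonal="additive")
```

In the multiplicative form the seasonal values are ratios around 1, whereas in the additive form they are offsets in the units of the series.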

initial_method : int, optional

Initialization method for the trend and seasonal components.

Refer to TripleExponentialSmoothing for detailed information on initialization method.

Only valid when the model is set by user or identified by the algorithm as 'TESM'.

Defaults to 0 when seasonal is 'multiplicative' and to 1 when seasonal is 'additive'.

training_ratio : float, optional

The ratio of training data to the whole time series.

Assuming the time series has N points and the training ratio is r, the first N*r points are used for training and the remaining N*(1-r) points are used for testing.

If this parameter is set to 0.0 or 1.0, or if the resulting training size (N*r) is less than 1 or equal to the size of the time series, no train-and-test procedure is carried out.

Defaults to 1.0.
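The split described above can be sketched in plain Python. This is only an illustration: the helper name `split_by_training_ratio` is hypothetical, and truncating N*r to an integer is an assumption, since the rounding rule is not stated here.

```python
def split_by_training_ratio(series, training_ratio=1.0):
    """Split a series into (train, test) parts; illustrative sketch only."""
    n = len(series)
    n_train = int(n * training_ratio)  # assumption: N*r is truncated
    if n_train < 1 or n_train >= n:
        # No train-and-test procedure: the whole series is used for training.
        return list(series), []
    return list(series[:n_train]), list(series[n_train:])


# 24 points with training_ratio=0.75 gives 18 training and 6 test points.
train, test = split_by_training_ratio(list(range(1, 25)), training_ratio=0.75)
```

With the default training_ratio of 1.0 the test part is empty, which corresponds to the case where no train-and-test procedure is carried out.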

damped : bool, optional

For DESM:

False: Uses Holt's linear method.

True: Uses the additive damped trend Holt's linear method.

For TESM:

False: Uses the Holt-Winters method.

True: Uses the additive damped seasonal Holt-Winters method.

If model_selection is set to True, the default value will be computed automatically. Otherwise, the default value is False.
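The effect of damping can be sketched with the textbook additive damped trend forecast, in which the trend contribution at step k is multiplied by phi to the power k. This is not PAL's implementation; the function name `damped_trend_forecast` is hypothetical, and the level and trend are assumed to be already estimated.

```python
def damped_trend_forecast(level, trend, phi, h):
    """Textbook h-step-ahead forecast with an additive damped trend."""
    # The trend contribution shrinks geometrically: phi + phi^2 + ... + phi^h.
    damp_sum = sum(phi ** k for k in range(1, h + 1))
    return level + damp_sum * trend


damped = damped_trend_forecast(level=100.0, trend=10.0, phi=0.5, h=2)
# Setting phi to 1.0 here reproduces the undamped linear trend for comparison.
undamped = damped_trend_forecast(level=100.0, trend=10.0, phi=1.0, h=3)
```

Because 0 < phi < 1, the damped forecast flattens out as the horizon grows instead of following a straight line.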

accuracy_measure : {'mse', 'mape'}, optional

The criterion used for the optimization.

Defaults to 'mse'.
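For reference, the two criteria can be sketched with their usual textbook definitions. The helper names are hypothetical, and returning MAPE as a fraction rather than a percentage is an assumption; PAL's exact scaling is not stated here.

```python
def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (actual values must be nonzero)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


actual = [100.0, 200.0]
predicted = [110.0, 180.0]
mse_value = mse(actual, predicted)
mape_value = mape(actual, predicted)
```

MSE penalizes large absolute errors heavily, while MAPE measures errors relative to the size of each observation.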

seasonality_criterion : float, optional

The criterion of the auto-correlation coefficient for accepting seasonality, in the range of (0, 1).

The larger it is, the less likely a time series is to be regarded as seasonal.

Only valid when forecast_model_name is 'TESM' or model_selection is set to True, and seasonal_period is not defined.

Defaults to 0.5.
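One way to picture this criterion is as a threshold on the sample autocorrelation at a candidate seasonal lag. The sketch below rests on that assumption; how PAL actually searches for the period and applies the criterion is not described here, and the function names are hypothetical.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation coefficient of a series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((y - mean) ** 2 for y in series)
    if denom == 0:
        return 0.0  # a constant series has no measurable correlation
    num = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return num / denom


def looks_seasonal(series, period, seasonality_criterion=0.5):
    """Accept seasonality if the autocorrelation at `period` exceeds the criterion."""
    return autocorrelation(series, period) > seasonality_criterion


cyclic = [1, 2, 3, 4] * 6  # a series that repeats every 4 points
accepted = looks_seasonal(cyclic, period=4)
rejected = looks_seasonal(cyclic, period=4, seasonality_criterion=0.9)
```

Raising the criterion from 0.5 to 0.9 turns the same series from accepted to rejected, which matches the statement that a larger value makes seasonality less likely to be detected.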

trend_test_method : {'mk', 'difference-sign'}, optional

'mk': Mann-Kendall test.
'difference-sign': Difference-sign test.

Defaults to 'mk'.

trend_test_alpha : float, optional

Tolerance probability for trend test. The value range is (0, 0.5).

Only valid when model_selection is set to True.

Defaults to 0.05.
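As an illustration of how a trend test and its tolerance probability interact, here is a minimal sketch of the textbook Mann-Kendall test using the normal approximation without tie correction. Treating trend_test_alpha as a two-sided significance level is an assumption, the function names are hypothetical, and PAL's implementation may differ in detail.

```python
import math


def mann_kendall_p_value(series):
    """Two-sided p-value of the Mann-Kendall trend test (no tie correction)."""
    n = len(series)
    # S counts increasing pairs minus decreasing pairs over all i < j.
    s = sum(
        (series[j] > series[i]) - (series[j] < series[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return 2 * (1 - cdf)


def has_trend(series, trend_test_alpha=0.05):
    """Report a trend when the p-value falls below the tolerance probability."""
    return mann_kendall_p_value(series) < trend_test_alpha


rising = has_trend(list(range(1, 11)))
flat = has_trend([1, 2, 1, 2, 1, 2])
```

A smaller trend_test_alpha demands stronger evidence before a trend is reported.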

alpha_min : float, optional

Sets the minimum value of alpha.

Only valid when alpha is not defined.

Defaults to 0.0000000001.

beta_min : float, optional

Sets the minimum value of beta.

Only valid when beta is not defined.

Defaults to 0.0000000001.

gamma_min : float, optional

Sets the minimum value of gamma.

Only valid when gamma is not defined.

Defaults to 0.0000000001.

phi_min : float, optional

Sets the minimum value of phi.

Only valid when phi is not defined.

Defaults to 0.0000000001.

alpha_max : float, optional

Sets the maximum value of alpha.

Only valid when alpha is not defined.

Defaults to 1.0.

beta_max : float, optional

Sets the maximum value of beta.

Only valid when beta is not defined.

Defaults to 1.0.

gamma_max : float, optional

Sets the maximum value of gamma.

Only valid when gamma is not defined.

Defaults to 1.0.

phi_max : float, optional

Sets the maximum value of phi.

Only valid when phi is not defined.

Defaults to 1.0.

prediction_confidence_1 : float, optional

Prediction confidence for interval 1.

Only valid when the upper and lower columns are provided in the result table.

Defaults to 0.8.

prediction_confidence_2 : float, optional

Prediction confidence for interval 2.

Only valid when the upper and lower columns are provided in the result table.

Defaults to 0.95.
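To show how a confidence level translates into interval width, the sketch below builds a symmetric interval around a point forecast under a Gaussian error assumption. How PAL derives PI1 and PI2 is not described here, so the normal assumption, the function name `normal_interval`, and the `std_error` input are all hypothetical.

```python
from statistics import NormalDist


def normal_interval(point_forecast, std_error, confidence):
    """Symmetric interval around a forecast, assuming Gaussian forecast errors."""
    # Two-sided interval: put (1 - confidence) / 2 of the mass in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return point_forecast - z * std_error, point_forecast + z * std_error


pi1_lower, pi1_upper = normal_interval(100.0, 10.0, confidence=0.8)
pi2_lower, pi2_upper = normal_interval(100.0, 10.0, confidence=0.95)
```

With the default confidences of 0.8 and 0.95, the second interval is always the wider of the two.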

level_start : float, optional

The initial value for level component S.

If this value is not provided, it will be calculated as described in TripleExponentialSmoothing.

Note that level_start cannot be zero.

If it is set to zero, 0.0000000001 will be used instead.

trend_start : float, optional

The initial value for trend component B.

If this value is not provided, it will be calculated as described in TripleExponentialSmoothing.

season_start : list of tuple/float, optional

A list of initial values for seasonal component C. If specified, the list must be of the length specified in seasonal_period, i.e. start values must be provided for a whole seasonal period.

The start values can be given as a plain list, where the cycle index of each value is determined by its position in the list, or as a list of tuples that pair each start value with its cycle index.

For example, suppose the seasonal period is 4, with starting values \(x_i, 1 \leq i \leq 4\) indexed by their cycle IDs. Then the four season start values can be specified in a list as \([x_1, x_2, x_3, x_4]\), or equivalently in a list of tuples as \([(1, x_1), (2, x_2), (3, x_3), (4, x_4)]\).

If not provided, start values shall be computed by a default scheme.
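The equivalence of the two input forms can be sketched with a small helper that converts either form into a plain list ordered by cycle index. The helper `normalize_season_start` is hypothetical and is not part of hana_ml; it only illustrates the convention described above.

```python
def normalize_season_start(season_start, seasonal_period):
    """Return the start values as a plain list ordered by cycle index 1..L."""
    if len(season_start) != seasonal_period:
        raise ValueError("start values must cover a whole seasonal period")
    if all(isinstance(item, tuple) for item in season_start):
        # List of (cycle_index, value) tuples: order the values by cycle index.
        by_cycle = dict(season_start)
        return [by_cycle[i] for i in range(1, seasonal_period + 1)]
    # Plain list: the position in the list already defines the cycle index.
    return list(season_start)


plain = normalize_season_start([0.9, 1.1, 1.2, 0.8], seasonal_period=4)
tupled = normalize_season_start([(1, 0.9), (2, 1.1), (3, 1.2), (4, 0.8)], seasonal_period=4)
```

Both calls produce the same list, so either form can be used to describe the same set of start values.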

expost_flag : bool, optional

False: Does not output the expost forecast, and just outputs the forecast values.
True: Outputs the expost forecast and the forecast values.

Defaults to True.

Examples

Input DataFrame df for AutoExponentialSmoothing:

>>> df.collect()
TIMESTAMP       Y
        1     362
        2     385
        3     432
        4     341
        5     382
        ......
       21     627
       22     725
       23     854
       24     661

Create AutoExponentialSmoothing instance:

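>>> from hana_ml.algorithms.pal.tsa.exponential_smoothing import AutoExponentialSmoothing
>>> autoExp = AutoExponentialSmoothing(forecast_model_name='TESM',
...                                    alpha=0.4,
...                                    beta=0.4,
...                                    gamma=0.4,
...                                    seasonal_period=4,
...                                    forecast_num=3,
...                                    seasonal='multiplicative',
...                                    initial_method=1,
...                                    training_ratio=0.75)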

Perform fit_predict on the given data:

>>> autoExp.fit_predict(data=df)

Output:

>>> autoExp.forecast_.collect().set_index('TIMESTAMP').head(5)
TIMESTAMP        VALUE   PI1_LOWER    PI1_UPPER   PI2_LOWER    PI2_UPPER
        1   320.018502         NaN          NaN         NaN          NaN
        2   374.225113         NaN          NaN         NaN          NaN
        3   458.649782         NaN          NaN         NaN          NaN
        4   364.376078         NaN          NaN         NaN          NaN
        5   416.009008         NaN          NaN         NaN          NaN

>>> autoExp.stats_.collect().head(4)
              STAT_NAME         STAT_VALUE
                    MSE   467.811415778471
   NUMBER_OF_ITERATIONS                110
SA_NUMBER_OF_ITERATIONS                100
NM_NUMBER_OF_ITERATIONS                 10

Attributes

forecast_ : DataFrame

Forecast values.

stats_ : DataFrame

Statistics analysis content.

Methods

`build_report`()	Generate time series report.
`fit_predict`(data[, key, endog])	Fit and predict based on the given time series.
`generate_html_report`([filename])	Display function.
`generate_notebook_iframe_report`()	Display function.

fit_predict(data, key=None, endog=None)

Fit and predict based on the given time series.

Parameters

data : DataFrame

Input data. It must contain at least two columns: an ID column and a column of raw data.

key : str, optional

The ID column.

Defaults to the first column of data if the index column of data is not provided. Otherwise, defaults to the index column of data.

endog : str, optional

The column of series to be fitted and predicted.

Defaults to the first non-ID column.

Returns

DataFrame: Forecast values.

build_report(): Generate time series report.

property fit_hdbprocedure: Returns the generated hdbprocedure for fit.

generate_html_report(filename=None): Display function.

generate_notebook_iframe_report(): Display function.

property predict_hdbprocedure: Returns the generated hdbprocedure for predict.

Inherited Methods from PALBase

Besides the methods mentioned above, the AutoExponentialSmoothing class also inherits methods from the PALBase class. Please refer to PAL Base for more details.