intermittent_forecast

hana_ml.algorithms.pal.tsa.intermittent_forecast.intermittent_forecast(data, key=None, endog=None, p=None, q=None, forecast_num=None, optimizer=None, method=None, grid_size=None, optimize_step=None, accuracy_measure=None, ignore_zero=None, expost_flag=None, thread_ratio=None, iter_count=None, random_state=None, penalty=None)

Intermittent Time Series Forecast (ITSF) is a forecast strategy for products with intermittent demand.

Differences from the constant weighting of the Croston method:

  • ITSF applies exponential weights to the estimate, so more recent data receives greater weight.

  • ITSF does not need the initial value of non-zero demands and time interval between non-zero demands.
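The weighting idea in the first bullet can be illustrated with a minimal sketch. The function names and the `decay` factor are hypothetical and chosen only for illustration; the exact weighting scheme used by ITSF is not given here.

```python
def exponential_weights(n, decay=0.5):
    """Return n normalized weights, oldest observation first.

    `decay` is a hypothetical factor in (0, 1): each step back in time
    multiplies the weight by `decay`, so the newest point weighs most.
    """
    raw = [decay ** (n - 1 - i) for i in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_estimate(values, decay=0.5):
    """Exponentially weighted average of `values` (oldest first)."""
    weights = exponential_weights(len(values), decay)
    return sum(v * w for v, w in zip(values, weights))


demand = [4, 2, 6]  # made-up non-zero demand sizes, oldest first
weights = exponential_weights(len(demand))
estimate = weighted_estimate(demand)
```

With a constant weight each of the three observations would contribute 1/3; in this sketch the newest observation contributes 4/7 and the oldest 1/7.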

Parameters:
data : DataFrame

Input data containing the time series to analyze.

key : str, optional

Specifies the ID column of data, which represents the time order.

Required if a single ID column cannot be inferred from the index of data.

If there is a single column name in the index of data, then key defaults to that column; otherwise key is mandatory.

endog : str, optional

Specifies the name of the column for intermittent demand values.

Defaults to the 1st non-key column of data.

p : int, optional

The smoothing parameter for demand, where:

  • -1 : optimizes this parameter automatically.

  • positive integer in [1, n] : forecasts with the manually specified smoothing value.

The specified value cannot exceed the length of time-series for analysis.

Defaults to -1.
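The rule for p above can be expressed as a small check. This is a sketch only: `resolve_p` is a hypothetical helper, not part of hana_ml, and it assumes n is the length of the time series, as the constraint above suggests.

```python
def resolve_p(p, series_length):
    """Classify `p` as 'auto' or 'manual', or raise ValueError.

    Hypothetical helper: -1 requests automatic optimization, and a manual
    value must lie in [1, series_length].
    """
    if p == -1:
        return "auto"
    if isinstance(p, int) and 1 <= p <= series_length:
        return "manual"
    raise ValueError(
        f"p must be -1 or an integer in [1, {series_length}], got {p!r}"
    )
```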

q : int, optional

The smoothing parameter for the time-intervals between intermittent demands, where:

  • -1 : optimizes this parameter automatically.

  • positive integer in [1, P] : forecasts with the manually specified value.

Defaults to -1.

forecast_num : int, optional

Forecast length. When it is set to 1, the algorithm only forecasts one value.

Defaults to 1.

optimizer : {'lbfgsb', 'brute', 'sim_annealing'}, optional

Specifies the optimization algorithm for automatically identifying parameters p and q.

  • 'lbfgsb' : Bounded Limited-memory Broyden-Fletcher-Goldfarb-Shanno (LBFGSB) method with parameters p and q initialized by the default scheme.

  • 'brute' : Brute method, LBFGSB with parameters p and q initialized by grid search.

  • 'sim_annealing' : Simulated annealing method.

Defaults to 'lbfgsb'.

method : str, optional

Specifies the method (or mode) for the output:

  • 'sporadic': Use the sporadic method.

  • 'constant': Use the constant method.

Defaults to 'constant'.

grid_size : int, optional

Specifies the number of steps from the start point to the length of data for grid search.

Only valid when optimizer is set to 'brute'.

Defaults to 20.

optimize_step : float, optional

Specifies the minimum step for each iteration of the LBFGSB method.

Defaults to 0.001.

accuracy_measure : str or a list of str, optional

The metric to quantify how well a model fits input data. Options: 'mse', 'rmse', 'mae', 'mape', 'smape', 'mase'.

Defaults to 'mse'.

Note

Specify a measure name if you want the corresponding measure value to be reflected in the output statistics (the second DataFrame in the returned tuple).

ignore_zero : bool, optional
  • False: Uses zero values in the input dataset when calculating 'mape'.

  • True: Ignores zero values in the input dataset when calculating 'mape'.

Only valid when accuracy_measure is 'mape'.

Defaults to False.
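To make the interaction between accuracy_measure and ignore_zero concrete, here is a minimal sketch using the textbook definitions of four of the listed measures. The `accuracy` helper is hypothetical and does not claim to match PAL's implementation. In particular, returning infinity when a zero actual value is kept in 'mape' is an assumption made only so that the sketch stays well defined.

```python
import math


def accuracy(actual, forecast, measure="mse", ignore_zero=False):
    """Textbook accuracy measures over paired actual/forecast values."""
    pairs = list(zip(actual, forecast))
    if measure == "mse":
        return sum((a - f) ** 2 for a, f in pairs) / len(pairs)
    if measure == "rmse":
        return math.sqrt(sum((a - f) ** 2 for a, f in pairs) / len(pairs))
    if measure == "mae":
        return sum(abs(a - f) for a, f in pairs) / len(pairs)
    if measure == "mape":
        if ignore_zero:
            # Drop periods with zero demand before averaging.
            pairs = [(a, f) for a, f in pairs if a != 0]
        if not pairs or any(a == 0 for a, _ in pairs):
            # Assumption: the percentage error is undefined for a zero actual.
            return math.inf
        # Returned as a fraction, not multiplied by 100.
        return sum(abs((a - f) / a) for a, f in pairs) / len(pairs)
    raise ValueError(f"unsupported measure in this sketch: {measure!r}")


actual = [0, 4, 0, 2]      # made-up intermittent demand
forecast = [1, 3, 1, 2]    # made-up forecast values
mape_skip_zero = accuracy(actual, forecast, "mape", ignore_zero=True)
```

Intermittent demand contains many zeros by definition, which is why the zero handling matters for 'mape' and not for measures such as 'mse' or 'mae'.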

expost_flag : bool, optional
  • False: Does not output the expost forecast, and just outputs the forecast values.

  • True: Outputs the expost forecast and the forecast values.

Defaults to True.

thread_ratio : float, optional

Specifies the ratio of available threads to use, from 0 to 1. A value of 0 uses a single thread, while 1 uses all currently available threads. Values outside this range are ignored, and the function determines the number of threads heuristically.

Defaults to 0.
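The thread_ratio semantics can be sketched as follows. `threads_from_ratio` is a hypothetical helper. The rounding used for intermediate values and the use of `None` to mean "left to the heuristic" are assumptions, since only the behavior at 0, at 1, and outside the range is described above.

```python
import os


def threads_from_ratio(thread_ratio, available=None):
    """Map a ratio in [0, 1] to a thread count (illustrative only)."""
    if available is None:
        available = os.cpu_count() or 1
    if not 0 <= thread_ratio <= 1:
        return None  # out of range: the value is ignored, heuristic decides
    if thread_ratio == 0:
        return 1  # single thread
    # Assumption: intermediate ratios scale linearly and are rounded.
    return max(1, round(thread_ratio * available))
```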

iter_count : int, optional

A positive integer that controls the number of iterations of simulated annealing.

Defaults to 1000.

random_state : int, optional

Specifies the seed for the random number generator. Valid for the simulated annealing method.

Defaults to 1.
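The roles of iter_count and random_state can be seen in a generic simulated annealing loop. This is a minimal sketch of the general technique on a toy one-dimensional cost function, not PAL's optimizer. The cooling schedule, step size, and the `anneal` and `toy_cost` names are all assumptions.

```python
import math
import random


def anneal(cost, start, iter_count=1000, random_state=1):
    """Minimize `cost` with a basic simulated annealing loop."""
    rng = random.Random(random_state)  # the seed makes the run reproducible
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    for i in range(iter_count):
        temperature = 1.0 - i / iter_count  # linear cooling, stays above 0
        candidate = current + rng.uniform(-0.5, 0.5)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best


def toy_cost(x):
    """Made-up cost function with its minimum at x = 3."""
    return (x - 3.0) ** 2


first = anneal(toy_cost, start=0.0)
second = anneal(toy_cost, start=0.0)
```

Because both runs use the same seed, `first` and `second` are identical; changing `random_state` changes the search path, and raising `iter_count` gives the search more steps.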

penalty : float, optional

A penalty is applied to the cost function to avoid over-fitting.

Defaults to 1.0.

Returns:
A tuple of two DataFrames
  • the 1st DataFrame stores forecast values.

  • the 2nd DataFrame stores related statistics.

Examples

>>> from hana_ml.algorithms.pal.tsa.intermittent_forecast import intermittent_forecast
>>> forecasts, stats = intermittent_forecast(data=df, p=3, forecast_num=3,
...                                          optimizer='brute', grid_size=20,
...                                          optimize_step=0.011, expost_flag=False,
...                                          accuracy_measure='mse', ignore_zero=False,
...                                          thread_ratio=0.5)

Output:

>>> forecasts.collect()
>>> stats.collect()