MassiveUnifiedTimeSeries

class hana_ml.algorithms.pal.unified_timeseries.MassiveUnifiedTimeSeries(func, group_params=None, **kwargs)

The Python wrapper for the SAP HANA PAL Massive Unified Time Series function. The following algorithms are available:

  • Additive Model Time Series Analysis (AMTSA)

  • Auto Regressive Integrated Moving Average (ARIMA)

  • Bayesian Structural Time Series (BSTS)

  • Exponential Smoothing (SMOOTH)

Parameters
func : str

The name of a specified time series algorithm.

The following algorithms are supported:

  • 'AMTSA': Additive Model Time Series Analysis

  • 'ARIMA': Auto Regressive Integrated Moving Average

  • 'BSTS': Bayesian Structural Time Series

  • 'SMOOTH': Auto Exponential Smoothing

**kwargs : keyword arguments

Arbitrary keyword arguments. For the accepted key-value pairs, refer to the parameters of the corresponding algorithm:

  • 'AMTSA' : AdditiveModelTimeSeriesAnalysis. AMTSA in UnifiedTimeSeries accepts the following additional parameters:

    • target_type : str, optional

    • start_point : str, optional, specify a start point in type conversion

    • interval : int, optional, specify an interval in type conversion.

    • holiday : str, optional, a holiday to add to the model, given as a JSON string with the elements name, timestamp, and optionally lower_window and upper_window. For example: '{ "name": "New Year", "timestamp": "2025-01-01" }'

  • 'ARIMA' : AutoARIMA

  • 'BSTS' : BSTS

  • 'SMOOTH' : AutoExponentialSmoothing

For the parameter mappings between hana_ml and HANA PAL, refer to the documentation page: Parameter Mappings
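
Because the holiday parameter is a JSON string, building it with json.dumps avoids quoting mistakes. The following is a minimal sketch only: the helper make_holiday is hypothetical and not part of hana_ml, and treating lower_window and upper_window as integers is an assumption.

```python
import json


def make_holiday(name, timestamp, lower_window=None, upper_window=None):
    """Build a holiday JSON string (hypothetical helper, not part of hana_ml)."""
    entry = {"name": name, "timestamp": timestamp}
    # lower_window and upper_window are optional, so emit them only when given.
    if lower_window is not None:
        entry["lower_window"] = lower_window
    if upper_window is not None:
        entry["upper_window"] = upper_window
    return json.dumps(entry)


holiday = make_holiday("New Year", "2025-01-01")
print(holiday)
```

The resulting string could then be passed as holiday=holiday together with func='AMTSA' when constructing the class.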

Attributes
model_ : DataFrame

Model information.

statistics_ : DataFrame

Statistics.

decompose_ : DataFrame

Decomposition values.

error_msg_ : DataFrame

Error messages produced during the fit process.

Methods

fit(data[, group_key, key, endog, exog])

Fit function.

make_future_dataframe([data, key, ...])

Create a new dataframe for time series prediction.

predict(data[, group_key, group_params, ...])

Predict function.

Examples

>>> from hana_ml.algorithms.pal.unified_timeseries import MassiveUnifiedTimeSeries
>>> muts = MassiveUnifiedTimeSeries(func='AMTSA')

Perform fit():

>>> muts.fit(data=df, group_key='group_id', key="ID", endog='value', exog=["ex1", "ex2"])

Attributes after fit:

>>> muts.statistics_.collect()
>>> muts.decompose_.collect()
>>> muts.error_msg_.collect()

Invoke predict():

>>> forecast, decompose, error_msg = muts.predict(data=df_pred, group_key='group_id', key="ID", exog=["ex1", "ex2"])

Output:

>>> forecast.collect()
>>> decompose.collect()
>>> error_msg.collect()

fit(data, group_key=None, key=None, endog=None, exog=None)

Fit function.

Parameters
data : DataFrame

Training data.

group_key : str, optional

The name of the group ID column. Data type can be INT or NVARCHAR/VARCHAR.

Defaults to the first column of data if data has no index columns; otherwise, defaults to the first index column.

key : str, optional

Name of the ID column.

Defaults to the first column of data after the group_key column is excluded, if data has no index column; otherwise, defaults to the second index column of data.

endog : str, optional

The column of time series to be fitted and predicted.

Defaults to the first column of data after eliminating key and group_key columns.

exog : str or list of str, optional

The column(s) of exogenous regressors.

If not specified, all columns except group_key, key and endog are treated as exogenous regressors.
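
The defaulting rules above can be illustrated in plain Python. The function below is a sketch of the documented behavior for data without index columns only; resolve_fit_columns is a hypothetical helper, not part of hana_ml, and it assumes any explicitly named columns are consistent with the defaults that precede them.

```python
def resolve_fit_columns(columns, group_key=None, key=None, endog=None, exog=None):
    """Mimic the documented fit() defaults for data without index columns.

    Hypothetical helper for illustration; it is not part of hana_ml.
    """
    remaining = list(columns)
    # group_key defaults to the first column of data.
    if group_key is None:
        group_key = remaining[0]
    remaining.remove(group_key)
    # key defaults to the first column once group_key is excluded.
    if key is None:
        key = remaining[0]
    remaining.remove(key)
    # endog defaults to the first column once key and group_key are excluded.
    if endog is None:
        endog = remaining[0]
    remaining.remove(endog)
    # exog defaults to every column that is left.
    if exog is None:
        exog = remaining
    elif isinstance(exog, str):
        exog = [exog]
    return group_key, key, endog, list(exog)


resolved = resolve_fit_columns(["GROUP_ID", "ID", "VALUE", "EX1", "EX2"])
print(resolved)
```

With the column order shown, the group ID, ID, and target columns are picked from left to right, and the two remaining columns are treated as exogenous regressors.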

make_future_dataframe(data=None, key=None, group_key=None, periods=1, increment_type='seconds')

Create a new dataframe for time series prediction.

Parameters
data : DataFrame, optional

The training data that contains the index.

Defaults to the data used in the fit().

key : str, optional

The index defined in the training data.

Defaults to the key specified in fit(), the value of data.index, or PAL's default key column position.

group_key : str, optional

Specify the group id column.

This parameter is only valid when massive is True.

Defaults to the group_key specified in fit() or the first column of the DataFrame.

periods : int, optional

The number of rows to create in the prediction DataFrame.

Defaults to 1.

increment_type : {'seconds', 'days', 'months', 'years'}, optional

The increment type of the time series.

Defaults to 'seconds'.
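
To make the roles of periods and increment_type concrete, the sketch below generates future timestamps in plain Python. It is not the hana_ml implementation: future_timestamps is a hypothetical helper, it assumes the new rows start one increment after the last training timestamp, and it covers only 'seconds' and 'days' because month and year steps need calendar-aware arithmetic.

```python
from datetime import datetime, timedelta


def future_timestamps(last, periods=1, increment_type="seconds"):
    """Return `periods` timestamps following `last` (hypothetical helper)."""
    steps = {"seconds": timedelta(seconds=1), "days": timedelta(days=1)}
    if increment_type not in steps:
        raise ValueError("this sketch supports only 'seconds' and 'days'")
    step = steps[increment_type]
    # Assumption: the first new row is one increment after the last known one.
    return [last + step * i for i in range(1, periods + 1)]


future = future_timestamps(datetime(2025, 1, 1), periods=3, increment_type="days")
print(future)
```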

Returns
DataFrame

predict(data, group_key=None, group_params=None, key=None, exog=None, **kwargs)

Predict function.

Parameters
data : DataFrame

Data for prediction.

group_key : str, optional

The name of the group ID column. Data type can be INT or NVARCHAR/VARCHAR.

Defaults to the first column of data if data has no index columns; otherwise, defaults to the first index column.

key : str, optional

Name of the ID column.

Defaults to the first column of data after the group_key column is excluded, if data has no index column; otherwise, defaults to the second index column of data.

exog : str or list of str, optional

The column(s) of exogenous regressors.

If not specified, all columns except key and endog are treated as exogenous regressors.

group_params : dict, optional

The input time series data is divided into groups, and different time series parameters can be applied to each group. This parameter specifies the parameter values of the chosen time series algorithm func for each group, as a dict whose keys are group_key values and whose values are dicts of parameter assignments for that algorithm.
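
The nested dict layout of group_params can be assembled in plain Python. The sketch below is illustrative only: build_group_params is a hypothetical helper, not part of hana_ml, and the group IDs are made up. The key 'interval' is borrowed from the AMTSA options listed in this document purely as an example; which keys are actually accepted depends on the chosen func.

```python
def build_group_params(group_ids, shared=None, overrides=None):
    """Build a {group_id: {param: value}} dict (hypothetical helper)."""
    shared = shared or {}
    overrides = overrides or {}
    # Each group starts from the shared settings, then applies its own overrides.
    return {gid: {**shared, **overrides.get(gid, {})} for gid in group_ids}


group_params = build_group_params(
    group_ids=["Group_A", "Group_B"],          # made-up group_key values
    shared={"interval": 1},                    # illustrative parameter name
    overrides={"Group_B": {"interval": 7}},
)
print(group_params)
```

The outer keys correspond to values found in the group_key column, and each inner dict holds the parameter assignments for that group.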

Inherited Methods from PALBase

Besides the methods described above, the MassiveUnifiedTimeSeries class also inherits methods from the PALBase class. Refer to PAL Base for more details.