MassiveUnifiedTimeSeries
- class hana_ml.algorithms.pal.unified_timeseries.MassiveUnifiedTimeSeries(func, group_params=None, **kwargs)
The Python wrapper for SAP HANA PAL Massive Unified Time Series function. The Massive Unified Time Series algorithms include:
Additive Model Time Series Analysis (AMTSA)
Auto Regressive Integrated Moving Average (ARIMA)
Bayesian Structural Time Series (BSTS)
Exponential Smoothing (SMOOTH)
- Parameters
- func : str
The name of a specified time series algorithm.
The following algorithms are supported:
'AMTSA': Additive Model Time Series Analysis
'ARIMA': Auto Regressive Integrated Moving Average
'BSTS': Bayesian Structural Time Series
'SMOOTH': Auto Exponential Smoothing
- **kwargs : keyword arguments
Arbitrary keyword arguments. Please refer to the corresponding algorithm for the supported parameter key-value pairs.
'AMTSA' : AdditiveModelTimeSeriesAnalysis
AMTSA in UnifiedTimeSeries has some additional parameters, listed below.
target_type : str, optional
start_point : str, optional, specify a start point in type conversion.
interval : int, optional, specify an interval in type conversion.
holiday : str, optional, add holiday to model in a json format, including name, timestamp, (optional) lower_window, and (optional) upper_window elements. For example: '{ "name": "New Year", "timestamp": "2025-01-01" }'
'ARIMA' : AutoARIMA
'BSTS' : BSTS
'SMOOTH' : AutoExponentialSmoothing
For more parameter mappings of hana_ml and HANA PAL, please refer to the doc page: Parameter Mappings
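The holiday string described above for 'AMTSA' can be assembled with the standard json module rather than typed by hand. The following is a minimal sketch: the helper name make_holiday is hypothetical, and the integer type used for lower_window and upper_window is an assumption, since the text above names those elements without stating their types.

```python
import json


def make_holiday(name, timestamp, lower_window=None, upper_window=None):
    """Return a JSON string shaped like the documented `holiday` example.

    `make_holiday` is a hypothetical helper, not part of hana_ml.
    The window values are assumed to be integers (not stated in the docs).
    """
    entry = {"name": name, "timestamp": timestamp}
    # The window elements are optional, so add them only when given.
    if lower_window is not None:
        entry["lower_window"] = lower_window
    if upper_window is not None:
        entry["upper_window"] = upper_window
    return json.dumps(entry)


holiday_json = make_holiday("New Year", "2025-01-01")
print(holiday_json)

# The string would then be passed as a keyword argument, for example:
#   MassiveUnifiedTimeSeries(func='AMTSA', holiday=holiday_json)
```

Building the string with json.dumps avoids quoting mistakes that are easy to make in a hand-written JSON literal.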
- Attributes
- model_ : DataFrame
Model information.
- statistics_ : DataFrame
Statistics.
- decompose_ : DataFrame
Decomposition values.
- error_msg_ : DataFrame
Error message during the fit process.
Methods
fit(data[, group_key, key, endog, exog])  Fit function.
make_future_dataframe([data, key, ...])  Create a new dataframe for time series prediction.
predict(data[, group_key, group_params, ...])  Predict function.
Examples
>>> muts = MassiveUnifiedTimeSeries(func='AMTSA')
Perform fit():
>>> muts.fit(data=df, group_key='group_id', key="ID", endog='value', exog=["ex1", "ex2"])
Attributes after fit:
>>> muts.statistics_.collect()
>>> muts.decompose_.collect()
>>> muts.error_msg_.collect()
Invoke predict():
>>> forecast, decompose, error_msg = muts.predict(data=df_pred, group_key='group_id', key="ID", exog=["ex1", "ex2"])
Output:
>>> forecast.collect()
>>> decompose.collect()
>>> error_msg.collect()
- fit(data, group_key=None, key=None, endog=None, exog=None)
Fit function.
- Parameters
- data : DataFrame
Training data.
- group_key : str, optional
The column of group_key. Data type can be INT or NVARCHAR/VARCHAR.
Defaults to the first column of data if the index columns of data are not provided. Otherwise, defaults to the first index column.
- key : str, optional
Name of ID column.
Defaults to the first column of data if the index column of data is not provided and the group_key column is eliminated. Otherwise, defaults to the second index column of data.
- endog : str, optional
The column of time series to be fitted and predicted.
Defaults to the first column of data after eliminating key and group_key columns.
- exog : str or list of str, optional
The column(s) of exogenous regressors.
If not specified, all columns except group_key, key and endog are treated as exogenous regressors.
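The defaulting rules above can be easier to follow as code. The sketch below mirrors only the documented behaviour for data without index columns; resolve_fit_columns is a hypothetical helper written for illustration and is not how hana_ml implements the lookup.

```python
def resolve_fit_columns(columns, group_key=None, key=None, endog=None, exog=None):
    """Illustrate the documented defaults of fit() for data with no index set.

    Hypothetical helper: it only restates the rules in the parameter list
    and ignores the branch where the data has index columns.
    """
    cols = list(columns)
    if group_key is None:
        group_key = cols[0]                      # first column of data
    remaining = [c for c in cols if c != group_key]
    if key is None:
        key = remaining[0]                       # first column once group_key is removed
    remaining = [c for c in remaining if c != key]
    if endog is None:
        endog = remaining[0]                     # first column once key and group_key are removed
    if exog is None:
        exog = [c for c in remaining if c != endog]  # everything left over
    elif isinstance(exog, str):
        exog = [exog]
    return {"group_key": group_key, "key": key, "endog": endog, "exog": exog}


resolved = resolve_fit_columns(["group_id", "ID", "value", "ex1", "ex2"])
print(resolved)
```

With the column order used in the Examples section, the defaults resolve to the same roles that the example passes explicitly.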
- make_future_dataframe(data=None, key=None, group_key=None, periods=1, increment_type='seconds')
Create a new dataframe for time series prediction.
- Parameters
- data : DataFrame, optional
The training data that contains the index.
Defaults to the data used in the fit().
- key : str, optional
The index defined in the training data.
Defaults to the specified key in fit() or the value in data.index or the PAL's default key column position.
- group_key : str, optional
Specify the group id column.
This parameter is only valid when massive is True.
Defaults to the specified group_key in fit() or the first column of the dataframe.
- periods : int, optional
The number of rows created in the predict dataframe.
Defaults to 1.
- increment_type : {'seconds', 'days', 'months', 'years'}, optional
The increment type of the time series.
Defaults to 'seconds'.
- Returns
- DataFrame
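To make periods and increment_type concrete, the sketch below generates future timestamps locally with the standard library. It is only an analogue of the idea: the real method builds a HANA DataFrame, and this page does not say how the step between rows is chosen, so the step argument, the future_timestamps name, and the month-end clamping are all assumptions made for illustration.

```python
import calendar
from datetime import datetime, timedelta


def _add_months(ts, months):
    """Shift ts by whole calendar months, clamping the day to the month's end."""
    index = ts.year * 12 + (ts.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return ts.replace(year=year, month=month, day=min(ts.day, last_day))


def future_timestamps(last_ts, periods=1, increment_type="seconds", step=1):
    """Return `periods` timestamps after last_ts (hypothetical local analogue).

    `step` is an assumed parameter; the library's own step rule is not
    described in this page.
    """
    result = []
    for i in range(1, periods + 1):
        n = i * step
        if increment_type == "seconds":
            result.append(last_ts + timedelta(seconds=n))
        elif increment_type == "days":
            result.append(last_ts + timedelta(days=n))
        elif increment_type == "months":
            result.append(_add_months(last_ts, n))
        elif increment_type == "years":
            result.append(_add_months(last_ts, 12 * n))
        else:
            raise ValueError("unsupported increment_type: %r" % increment_type)
    return result


future = future_timestamps(datetime(2025, 1, 30), periods=2, increment_type="months")
print(future)
```

In massive mode the same extension would apply per group, which is why group_key is part of the method signature.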
- predict(data, group_key=None, group_params=None, key=None, exog=None, **kwargs)
Predict function.
- Parameters
- data : DataFrame
Predict data.
- group_key : str, optional
The column of group_key. Data type can be INT or NVARCHAR/VARCHAR.
Defaults to the first column of data if the index columns of data are not provided. Otherwise, defaults to the first index column.
- key : str, optional
Name of ID column.
Defaults to the first column of data if the index column of data is not provided and the group_key column is eliminated. Otherwise, defaults to the second index column of data.
- exog : str or list of str, optional
The column(s) of exogenous regressors.
If not specified, all columns except key and endog are treated as exogenous regressors.
- group_params : dict, optional
The input data for time series shall be divided into different groups with different time series parameters applied. This parameter specifies the parameter values of the chosen time series algorithm func for the different groups in a dict format, where the keys correspond to group_key values and each value is a dict of parameter assignments for the time series algorithm.
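The nested shape of group_params is shown in the sketch below. The group IDs are invented for the example, and seasonality_mode is used as an assumed AMTSA parameter name; check the documentation of the chosen algorithm for the names it actually accepts. The check_group_params helper is hypothetical and only verifies the dict-of-dicts layout.

```python
# Outer keys: values found in the group_key column (made-up IDs here).
# Inner dicts: parameter assignments for the chosen algorithm.
# `seasonality_mode` is an assumed parameter name used for illustration.
group_params = {
    "store_A": {"seasonality_mode": "multiplicative"},
    "store_B": {"seasonality_mode": "additive"},
}


def check_group_params(params_by_group):
    """Verify the dict-of-dicts layout and return the group IDs.

    Hypothetical helper, not part of hana_ml.
    """
    if not isinstance(params_by_group, dict):
        raise TypeError("group_params must be a dict")
    for group_id, params in params_by_group.items():
        if not isinstance(params, dict):
            raise TypeError("parameters for group %r must be a dict" % (group_id,))
    return list(params_by_group)


groups = check_group_params(group_params)
print(groups)

# The dict would then be passed through, for example:
#   muts.predict(data=df_pred, group_key='group_id', group_params=group_params)
```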