LTSF
- class hana_ml.algorithms.pal.tsa.ltsf.LTSF(batch_size=None, num_epochs=None, random_seed=None, network_type=None, adjust_learning_rate=None, learning_rate=None, num_levels=None, kernel_size=None, hidden_expansion=None, position_encoding=None, dropout_prob=None)
Long-term time series forecasting (LTSF) is a branch of predictive analysis that focuses on forecasting over long horizons. Traditional algorithms can predict values in the near future, but their accuracy degrades sharply as the forecast horizon grows. This function uses deep learning neural network architectures to achieve state-of-the-art long-term forecasting performance within the PAL family.
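For intuition about the simplest of the network types, the following is a minimal sketch of the NLinear idea as it is described in the LTSF-Linear literature: subtract the last observed value from the input window, apply one linear layer, then add the value back. Whether PAL's 'NLinear' matches this in detail is an assumption; the function name nlinear_forecast and the hand-written weights are hypothetical and are not part of hana_ml.

```python
def nlinear_forecast(history, weights, bias=None):
    """Forecast len(weights) steps from one input window (illustrative only).

    history: list of floats, the input window (length = train_length).
    weights: one row of len(history) floats per forecast step.
    bias:    optional list with one float per forecast step.
    """
    last = history[-1]
    # Normalize the window by subtracting its last observed value.
    centered = [x - last for x in history]
    forecast = []
    for step, row in enumerate(weights):
        if len(row) != len(history):
            raise ValueError("each weight row must match the window length")
        value = sum(w * x for w, x in zip(row, centered))
        if bias is not None:
            value += bias[step]
        # Undo the normalization so the forecast is on the original scale.
        forecast.append(value + last)
    return forecast


# With all-zero weights the sketch simply repeats the last observed value.
naive = nlinear_forecast([1.0, 2.0, 3.0], [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
print(naive)
```

In a trained network the weights are learned from the training windows; the zero weights here only make the normalization step visible.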
- Parameters:
- network_type: str, optional
The type of network:
'NLinear'
'DLinear'
'XLinear'
'SCINet'
'RLinear'
'RMLP'
Defaults to 'NLinear'.
- batch_size: int, optional
The number of pieces of data for training in one iteration.
Defaults to 8.
- num_epochs: int, optional
The number of training epochs.
Defaults to 1.
- random_seed: int, optional
The seed for random number generation. 0 indicates using machine time as seed.
Defaults to 0.
- adjust_learning_rate: bool, optional
Whether to halve the learning rate after every epoch.
False: Do not use.
True: Use.
Defaults to True.
- learning_rate: float, optional
The initial learning rate for Adam optimizer.
Defaults to 0.005.
- num_levels: int, optional
The number of levels in the network architecture. This parameter is valid when network_type is 'SCINet'.
Note that if warm_start = True in fit(), then this parameter is not valid.
Defaults to 2.
- kernel_size: int, optional
Kernel size of Conv1d layer. This parameter is valid when network_type is 'SCINet'.
Note that if warm_start = True in fit(), then this parameter is not valid.
Defaults to 3.
- hidden_expansion: int, optional
Expands the input channel size of Conv1d layer. This parameter is valid when network_type is 'SCINet'.
Note that if warm_start = True in fit(), then this parameter is not valid.
Defaults to 3.
- position_encoding: bool, optional
Position encoding adds extra positional embeddings to the training series.
False: Do not use.
True: Use.
This parameter is valid when network_type is 'SCINet'.
Defaults to True.
- dropout_prob: float, optional
Dropout probability of Dropout layer. This parameter is valid when network_type is 'SCINet'.
Defaults to 0.05.
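To make the effect of adjust_learning_rate concrete, here is a minimal sketch of the resulting schedule. It assumes the rate is halved once after each completed epoch, starting from learning_rate; the helper name halved_learning_rates is hypothetical and is not part of hana_ml.

```python
def halved_learning_rates(initial_lr, num_epochs, adjust=True):
    """Return the learning rate used in each epoch (illustrative only).

    Assumption: when `adjust` is True, the rate is halved after every
    completed epoch; when False, it stays at `initial_lr`.
    """
    rates = []
    lr = initial_lr
    for _ in range(num_epochs):
        rates.append(lr)
        if adjust:
            lr = lr / 2.0
    return rates


schedule = halved_learning_rates(0.005, 4)
print(schedule)
```

With the default learning_rate of 0.005, the rate drops below 0.001 by the fourth epoch under this assumption, which is worth keeping in mind when raising num_epochs.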
Examples
Assume the input DataFrame is df_fit. Create an instance of LTSF:
>>> ltsf = LTSF(batch_size = 8, num_epochs = 2, adjust_learning_rate = True, learning_rate = 0.005, random_seed = 1)
Performing fit():
>>> ltsf.fit(data=df_fit, train_length=32, forecast_length=16, key="TIME_STAMP", endog="TARGET", exog=["FEAT1", "FEAT2", "FEAT3", "FEAT4"])
>>> ltsf.loss_.collect()
    EPOCH          BATCH      LOSS
0       1              0  1.177407
1       1              1  0.925078
...
12      2              5  0.571699
13      2  epoch average  0.618181
With the input DataFrame df_predict, perform predict():
>>> result = ltsf.predict(data=df_predict)
>>> result.collect()
    ID  FORECAST
1    0  52.28396
2    1  57.03466
...
16  15  69.33713
Continuous training is also supported and is controlled by the parameter warm_start. The model used for continued training is the model_ attribute of the LTSF object. You can also use load_model() to load a trained model for continuous training.
>>> ltsf.num_epochs = 2
>>> ltsf.learning_rate = 0.002
>>> ltsf.fit(data=df_fit, key="TIME_STAMP", endog="TARGET", exog=["FEAT1", "FEAT2", "FEAT3", "FEAT4"], warm_start=True)
- Attributes:
- model_: DataFrame
Model content.
- loss_: DataFrame
Training loss information. Each row gives the loss for a batch, identified by its batch ID, or the average loss over an epoch.
- explainer_: DataFrame
The explanations with decomposition of exogenous variables. The attribute only appears when show_explainer=True in the predict() function and network_type is 'XLinear'.
- permutation_importance_: DataFrame
The importance of exogenous variables as determined by permutation importance analysis. The attribute only appears after invoking the get_permutation_importance() function on a trained model, and is structured as follows:
1st column : PAIR, measure name.
2nd column : NAME, exogenous regressor name.
3rd column : VALUE, the importance of the exogenous regressor.
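As background for permutation_importance_, the following sketch shows the generic permutation importance technique: shuffle one exogenous column, re-score the model, and record how much the error grows. It is a plain-Python illustration under stated assumptions, not PAL's time series variant; the names mse, permutation_importance and toy_model are hypothetical.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, y, n_repeats=5, seed=0):
    """Average increase in MSE after shuffling each feature column.

    predict: callable mapping one row (dict of feature values) to a float.
    rows:    list of dicts, one per observation.
    y:       list of target values aligned with rows.
    """
    rng = random.Random(seed)
    baseline = mse(y, [predict(r) for r in rows])
    importance = {}
    for name in rows[0]:
        increases = []
        for _ in range(n_repeats):
            shuffled = [r[name] for r in rows]
            rng.shuffle(shuffled)
            # Replace only the column under test; other columns stay intact.
            permuted = [dict(r, **{name: v}) for r, v in zip(rows, shuffled)]
            increases.append(mse(y, [predict(r) for r in permuted]) - baseline)
        importance[name] = sum(increases) / n_repeats
    return importance


def toy_model(row):
    # Hypothetical model that uses FEAT1 and ignores FEAT2.
    return 2.0 * row["FEAT1"]


rows = [{"FEAT1": float(i), "FEAT2": float(i % 3)} for i in range(10)]
y = [2.0 * r["FEAT1"] for r in rows]
scores = permutation_importance(toy_model, rows, y)
print(scores)
```

A feature the model ignores, such as FEAT2 here, scores zero because shuffling it leaves the predictions unchanged.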
Methods
fit(data[, train_length, forecast_length, ...])
Fit the model to the training dataset.
get_permutation_importance(data[, model, ...])
Please see Permutation Feature Importance for Time Series for details.
predict(data[, key, endog, allow_new_index, ...])
Generates time series forecasts based on the fitted model.
- fit(data, train_length=None, forecast_length=None, key=None, endog=None, exog=None, warm_start=False)
Fit the model to the training dataset.
- Parameters:
- data: DataFrame
Input data.
- train_length: int
Length of the training series input to the network.
Note that if warm_start = True, then this parameter is not valid.
- forecast_length: int
Length of predictions.
The constraint is that train_length + forecast_length <= data.count().
Note that if warm_start = True, then this parameter is not valid.
- key: str, optional
The timestamp column of data. The type of key column should be INTEGER, TIMESTAMP, DATE or SECONDDATE.
Defaults to the first column of data if the index column of data is not provided. Otherwise, defaults to the index column of data.
- endog: str, optional
The endogenous variable, i.e. target time series. The type of endog column could be INTEGER, DOUBLE or DECIMAL(p,s).
Defaults to the first non-key column.
- exog: str or a list of str, optional
An optional array of exogenous variables. The type of exog column could be INTEGER, DOUBLE or DECIMAL(p,s).
Defaults to None. Please set this parameter explicitly if you have exogenous variables.
- warm_start: bool, optional
When set to True, reuses the model_ of the current object to continue training the model. A method called load_model() is provided to load a pretrained model. Otherwise, a new model is trained.
Defaults to False.
- Returns:
- A fitted object of class "LTSF".
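The length constraint on train_length and forecast_length can be checked before calling fit(). The sketch below validates it and, under the assumption that training samples are stride-1 sliding windows (which the text above does not state), counts how many such windows the data allows. The helper name max_training_windows is hypothetical and is not part of hana_ml.

```python
def max_training_windows(n_rows, train_length, forecast_length):
    """Validate the length constraint and count stride-1 sliding windows.

    Each window needs train_length input rows followed by forecast_length
    target rows. The stride-1 windowing is an assumption of this sketch.
    """
    if train_length < 1 or forecast_length < 1:
        raise ValueError("train_length and forecast_length must be positive")
    window = train_length + forecast_length
    if window > n_rows:
        raise ValueError(
            f"train_length + forecast_length = {window} exceeds {n_rows} rows"
        )
    return n_rows - window + 1


# In practice n_rows would come from data.count() on the training DataFrame.
n_windows = max_training_windows(n_rows=100, train_length=32, forecast_length=16)
print(n_windows)
```

Running the check on the client side gives a readable error before any data is sent to the database.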
- predict(data, key=None, endog=None, allow_new_index=True, show_explainer=False, reference_dict=None)
Generates time series forecasts based on the fitted model. The number of rows of input predict data must be equal to the value of train_length during training and the length of predictions is equal to the value of forecast_length.
- Parameters:
- data: DataFrame
Input data for making forecasts.
Formally, data should contain an ID column, the target time series and exogenous features specified in the training phase (i.e. endog and exog in fit() function), but no other columns.
The length of data must be equal to the value of parameter train_length in fit().
- key: str, optional
Name of the ID column.
Mandatory if data is not indexed, or the index of data contains multiple columns.
Defaults to the single index column of data if not provided.
- endog: str, optional
The endogenous variable, i.e. target time series. The type of endog column could be INTEGER, DOUBLE or DECIMAL(p,s).
Defaults to the first non-key column of data.
- allow_new_index: bool, optional
Indicates whether a new index column is allowed in the result.
True: return the result with new integer or timestamp index column.
False: return the result with index column starting from 0.
Defaults to True.
- show_explainer: bool, optional
Indicates whether to generate explanations during prediction.
If True, the contribution of each exog, with its value and percentage, is shown in an attribute called explainer_ of the LTSF instance.
Only valid when network_type is 'XLinear'.
Defaults to False.
- reference_dict: dict, optional
Defines the reference value of an exogenous variable. The type of the reference value needs to be the same as the type of the exogenous variable.
Only valid when show_explainer is True.
Defaults to the average value of the exogenous variable in the training data if not provided.
- Returns:
- DataFrame 1
Forecasted values, structured as follows:
ID: type INTEGER, timestamp.
VALUE: type DOUBLE, forecast value.
- DataFrame 2 (optional)
The explanations with decomposition of exogenous variables. Only valid if show_explainer is True and network_type is 'XLinear'.
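Because predict() expects exactly train_length rows containing only the ID column, endog and the exog columns used in fit(), the input usually has to be trimmed first. The sketch below shows that trimming on plain Python rows; the helper name last_window and the sample column EXTRA are hypothetical, and with a hana_ml DataFrame the same selection would be done on the database side.

```python
def last_window(rows, train_length, key, endog, exog=()):
    """Keep the most recent train_length rows and only the model's columns.

    rows: list of dicts, one per timestamp.
    key, endog, exog: column names as passed to fit().
    """
    if len(rows) < train_length:
        raise ValueError(
            f"need at least {train_length} rows, got {len(rows)}"
        )
    columns = [key, endog, *exog]
    # Sort by the ID column so that "last" means "most recent".
    ordered = sorted(rows, key=lambda row: row[key])
    return [{c: row[c] for c in columns} for row in ordered[-train_length:]]


history = [
    {"TIME_STAMP": t, "TARGET": float(t), "FEAT1": 2 * t, "EXTRA": "unused"}
    for t in range(40)
]
window = last_window(history, 32, "TIME_STAMP", "TARGET", ["FEAT1"])
print(len(window), window[0]["TIME_STAMP"], sorted(window[0]))
```

The unused column is dropped and the window starts at the 9th row, so the result has the shape described for the data parameter.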
- get_permutation_importance(data, model=None, key=None, endog=None, exog=None, repeat_time=None, random_state=None, thread_ratio=None, partition_ratio=None, regressor_top_k=None, accuracy_measure=None, ignore_zero=None)
Please see Permutation Feature Importance for Time Series for details.
Inherited Methods from PALBase
Besides the methods mentioned above, the LTSF class also inherits methods from the PALBase class. Please refer to PAL Base for more details.