LTSF

class hana_ml.algorithms.pal.tsa.ltsf.LTSF(batch_size=None, num_epochs=None, random_seed=None, network_type=None, adjust_learning_rate=None, learning_rate=None, num_levels=None, kernel_size=None, hidden_expansion=None, position_encoding=None, dropout_prob=None)

Long-Term Series Forecasting (LTSF).

Traditional algorithms can predict values in the near future, but their performance deteriorates sharply in long-term series forecasting. With the help of deep learning, this class implements a novel neural network architecture that achieves state-of-the-art long-term forecasting performance within the PAL family.

Parameters:
network_type: str, optional

The type of network:

  • 'NLinear'

  • 'DLinear'

  • 'XLinear'

  • 'SCINet'

Defaults to 'NLinear'.

batch_size: int, optional

Number of samples used for training in one iteration.

Defaults to 8.

num_epochs: int, optional

Number of training epochs.

Defaults to 1.

random_seed: int, optional

Seed for random number generation. 0 indicates using machine time as seed.

Defaults to 0.

adjust_learning_rate: bool, optional

Halves the learning rate after every epoch.

  • False: Do not use.

  • True: Use.

Defaults to True.

learning_rate: float, optional

Initial learning rate for Adam optimizer.

Defaults to 0.005.
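
As a rough illustration of how learning_rate and adjust_learning_rate interact, the following sketch lists the learning rate that would apply in each epoch. The helper learning_rate_schedule is hypothetical and not part of hana_ml, and it assumes that the first epoch uses the initial rate and that halving takes effect from the next epoch onward; the PAL implementation may differ in detail.

```python
def learning_rate_schedule(initial_lr, num_epochs, adjust_learning_rate=True):
    """Return the learning rate assumed to be in effect for each epoch.

    Assumption: epoch 1 uses initial_lr, and each later epoch uses half of
    the previous epoch's rate when adjust_learning_rate is True.
    """
    rates = []
    lr = initial_lr
    for _ in range(num_epochs):
        rates.append(lr)
        if adjust_learning_rate:
            lr = lr / 2.0  # halve after every epoch
    return rates


# Default initial rate with four epochs.
schedule = learning_rate_schedule(0.005, 4)
for epoch, lr in enumerate(schedule, start=1):
    print(epoch, lr)
```

With many epochs the rate shrinks quickly under this scheme, so it may be worth setting adjust_learning_rate=False or starting from a larger learning_rate for long training runs.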

num_levels: int, optional

Number of levels in the network architecture.

This parameter is valid when network_type is 'SCINet'.

Note that if warm_start = True in fit(), then this parameter is not valid.

Defaults to 2.

kernel_size: int, optional

Kernel size of Conv1d layer.

This parameter is valid when network_type is 'SCINet'.

Note that if warm_start = True in fit(), then this parameter is not valid.

Defaults to 3.

hidden_expansion: int, optional

Expands the input channel size of Conv1d layer.

This parameter is valid when network_type is 'SCINet'.

Note that if warm_start = True in fit(), then this parameter is not valid.

Defaults to 3.

position_encoding: bool, optional

Position encoding adds extra positional embeddings to the training series.

  • False: Do not use.

  • True: Use.

This parameter is valid when network_type is 'SCINet'.

Defaults to True.

dropout_prob: float, optional

Dropout probability of Dropout layer.

This parameter is valid when network_type is 'SCINet'.

Defaults to 0.05.
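
Several of the parameters above are only valid when network_type is 'SCINet'. A minimal sketch of one way to keep a configuration consistent with that rule is shown below. The helper ltsf_kwargs and the constant SCINET_ONLY are hypothetical and not part of hana_ml; the parameter names themselves come from the class signature.

```python
# Parameters documented as valid only when network_type is 'SCINet'.
SCINET_ONLY = ("num_levels", "kernel_size", "hidden_expansion",
               "position_encoding", "dropout_prob")


def ltsf_kwargs(network_type="NLinear", **params):
    """Build keyword arguments for LTSF (hypothetical helper).

    SCINet-only settings are dropped when another network type is selected.
    """
    kwargs = {"network_type": network_type}
    for name, value in params.items():
        if name in SCINET_ONLY and network_type != "SCINet":
            continue  # not valid for this network type, so leave it out
        kwargs[name] = value
    return kwargs


scinet = ltsf_kwargs("SCINet", batch_size=8, num_levels=2, kernel_size=3,
                     hidden_expansion=3, position_encoding=True,
                     dropout_prob=0.05)
dlinear = ltsf_kwargs("DLinear", batch_size=8, kernel_size=3)
print(scinet)
print(dlinear)
```

The resulting dictionary could then be unpacked into the constructor, for example LTSF(**scinet).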

Examples

Given an input DataFrame df_fit, create an instance of LTSF:

>>> ltsf = LTSF(batch_size = 8,
                num_epochs = 2,
                adjust_learning_rate = True,
                learning_rate = 0.005,
                random_seed = 1)

Performing fit() on the given dataframe:

>>> ltsf.fit(data=df_fit,
             train_length=32,
             forecast_length=16,
             key="TIME_STAMP",
             endog="TARGET",
             exog=["FEAT1", "FEAT2", "FEAT3", "FEAT4"])
>>> ltsf.loss_.collect()
    EPOCH          BATCH      LOSS
0       1              0  1.177407
1       1              1  0.925078
2       1              2  0.798042
3       1              3  0.712275
4       1              4  0.702966
5       1              5  0.703366
6       1  epoch average  0.836522
7       2              0  0.664331
8       2              1  0.608385
9       2              2  0.614841
10      2              3  0.626234
11      2              4  0.623597
12      2              5  0.571699
13      2  epoch average  0.618181
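
The loss table mixes per-batch rows with one "epoch average" row per epoch. The sketch below separates the two kinds of rows and checks that each epoch average matches the mean of that epoch's batch losses. The rows are copied by hand from the output above into plain tuples; treating BATCH as a string and the helper name split_loss_rows are assumptions made for illustration.

```python
# Rows copied from the ltsf.loss_.collect() output: (EPOCH, BATCH, LOSS).
loss_rows = [
    (1, "0", 1.177407), (1, "1", 0.925078), (1, "2", 0.798042),
    (1, "3", 0.712275), (1, "4", 0.702966), (1, "5", 0.703366),
    (1, "epoch average", 0.836522),
    (2, "0", 0.664331), (2, "1", 0.608385), (2, "2", 0.614841),
    (2, "3", 0.626234), (2, "4", 0.623597), (2, "5", 0.571699),
    (2, "epoch average", 0.618181),
]


def split_loss_rows(rows):
    """Separate per-batch losses from the per-epoch average rows."""
    batch_losses = {}
    epoch_averages = {}
    for epoch, batch, loss in rows:
        if batch == "epoch average":
            epoch_averages[epoch] = loss
        else:
            batch_losses.setdefault(epoch, []).append(loss)
    return batch_losses, epoch_averages


batch_losses, epoch_averages = split_loss_rows(loss_rows)
for epoch, losses in batch_losses.items():
    mean_loss = sum(losses) / len(losses)
    # Compare the recomputed mean with the reported epoch average.
    print(epoch, round(mean_loss, 6), epoch_averages[epoch])
```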

Given an input DataFrame df_predict, perform predict() on it:

>>> result = ltsf.predict(data=df_predict)
>>> result.collect()
   ID  FORECAST
1   0  52.28396
2   1  57.03466
3   2  69.49162
4   3  68.06987
5   4  40.43507
6   5  55.53528
7   6  54.17256
8   7  39.32336
9   8  25.51410
10  9 102.11331
11 10 134.10745
12 11  48.32333
13 12  46.47223
14 13  72.44048
15 14  65.29192
16 15  69.33713

Continuous training is also supported and is controlled by the warm_start parameter. Training continues from the model stored in the model_ attribute of the LTSF object. You can also use load_model() to load a trained model for continuous training.

>>> ltsf.num_epochs    = 2
>>> ltsf.learning_rate = 0.002
>>> ltsf.fit(df_fit,
             key="TIME_STAMP",
             endog="TARGET",
             exog=["FEAT1", "FEAT2", "FEAT3", "FEAT4"],
             warm_start=True)

Attributes:
model_: DataFrame

Trained model content.

loss_: DataFrame

Training loss information: the loss of each batch, identified by its batch ID, and the average loss of each epoch.

Methods

fit(data[, train_length, forecast_length, ...])

Trains an LTSF model with the given parameters.

predict(data[, key, endog, allow_new_index])

Makes time series forecasts based on an LTSF model.

fit(data, train_length=None, forecast_length=None, key=None, endog=None, exog=None, warm_start=False)

Trains an LTSF model with the given parameters.

Parameters:
data: DataFrame

Input data.

train_length: int

Length of the training series fed into the network.

Note that if warm_start = True, then this parameter is not valid.

forecast_length: int

Length of predictions.

Constraint: train_length + forecast_length <= data.count().

Note that if warm_start = True, then this parameter is not valid.
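
The length constraint can be checked before calling fit(). The following is a minimal sketch; the helper check_window_lengths is hypothetical, and the requirement that both lengths be positive is an added assumption rather than something stated in this reference.

```python
def check_window_lengths(train_length, forecast_length, num_rows):
    """Validate train_length + forecast_length <= num_rows.

    Returns the number of rows one training window needs, or raises
    ValueError if the data is too short.
    """
    # Assumption: both lengths must be positive integers.
    if train_length <= 0 or forecast_length <= 0:
        raise ValueError("train_length and forecast_length must be positive")
    needed = train_length + forecast_length
    if needed > num_rows:
        raise ValueError(
            "train_length + forecast_length = %d exceeds the %d rows of data"
            % (needed, num_rows)
        )
    return needed


# Values from the example above, with an illustrative row count of 100.
print(check_window_lengths(32, 16, 100))
```

In practice num_rows would be obtained from the input data, for example with df_fit.count().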

key: str, optional

The timestamp column of data. The type of key column should be INTEGER, TIMESTAMP, DATE or SECONDDATE.

Defaults to the first column of data if the index column of data is not provided. Otherwise, defaults to the index column of data.

endog: str, optional

The endogenous variable, i.e. target time series. The type of endog column could be INTEGER, DOUBLE or DECIMAL(p,s).

Defaults to the first non-key column.

exog: str or a list of str, optional

An optional array of exogenous variables. The type of exog column could be INTEGER, DOUBLE or DECIMAL(p,s).

Defaults to None. Please set this parameter explicitly if you have exogenous variables.
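
The defaulting rules for key, endog and exog can be summarized in a few lines. The sketch below only mirrors the rules as documented here; the function resolve_fit_columns is hypothetical and works on plain column-name lists rather than on a hana_ml DataFrame. Wrapping a single exog string in a list is an illustrative convenience, not documented behavior.

```python
def resolve_fit_columns(columns, index=None, key=None, endog=None, exog=None):
    """Apply the documented defaults for key, endog and exog.

    columns: column names of the input data, in order.
    index:   name of the index column of the data, if one is set.
    """
    if key is None:
        # Index column if provided, otherwise the first column.
        key = index if index is not None else columns[0]
    if endog is None:
        # First non-key column.
        endog = next(col for col in columns if col != key)
    if isinstance(exog, str):
        exog = [exog]  # accept a single name as well as a list
    # exog stays None unless set explicitly.
    return key, endog, exog


cols = ["TIME_STAMP", "TARGET", "FEAT1", "FEAT2"]
print(resolve_fit_columns(cols))
print(resolve_fit_columns(cols, exog=["FEAT1", "FEAT2"]))
```

Because exog is never inferred, any remaining feature columns are ignored unless they are listed explicitly.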

warm_start: bool, optional

When set to True, reuses the model_ of the current object and continues training it; load_model() can be used to load a pretrained model first. When set to False, a new model is trained.

Defaults to False.

Returns:
A fitted object of class "LTSF".

predict(data, key=None, endog=None, allow_new_index=True)

Makes time series forecasts based on an LTSF model. The number of rows of the input data must be equal to the value of train_length used during training, and the length of predictions is equal to the value of forecast_length.

Parameters:
data: DataFrame

Input data for making forecasts.

Formally, data should contain an ID column, the target time series, and the exogenous features specified in the training phase (i.e. endog and exog in fit()), but no other columns.

The length of data must be equal to the value of parameter train_length in fit().
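
Since predict() expects exactly train_length rows, a common preparation step is to cut the most recent window out of a longer series. The sketch below shows the idea on a plain Python list. The helper last_training_window is hypothetical, and it assumes the rows are already sorted by the ID column in ascending order and that forecasting from the latest observations is what is wanted.

```python
def last_training_window(rows, train_length):
    """Return the last train_length rows of a series sorted by ID."""
    if train_length <= 0:
        raise ValueError("train_length must be positive")
    if len(rows) < train_length:
        raise ValueError(
            "need at least %d rows, got %d" % (train_length, len(rows))
        )
    return rows[-train_length:]


# Illustrative series of (ID, TARGET) pairs; the values are made up.
series = [(i, float(i) * 1.5) for i in range(100)]
window = last_training_window(series, 32)
print(len(window), window[0][0], window[-1][0])
```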

key: str, optional

Name of the ID column.

Mandatory if data is not indexed, or the index of data contains multiple columns.

Defaults to the single index column of data if not provided.

endog: str, optional

The endogenous variable, i.e. target time series. The type of endog column could be INTEGER, DOUBLE or DECIMAL(p,s).

Defaults to the first non-key column of data.

allow_new_index: bool, optional

Indicates whether a new index column is allowed in the result.

  • True: return the result with new integer or timestamp index column.

  • False: return the result with index column starting from 0.

Defaults to True.

Returns:
DataFrame

Forecasted values, structured as follows:

  • ID, type INTEGER, timestamp.

  • VALUE, type DOUBLE, forecast value.

property fit_hdbprocedure

Returns the generated hdbprocedure for fit.

property predict_hdbprocedure

Returns the generated hdbprocedure for predict.

Inherited Methods from PALBase

Besides the methods mentioned above, the LTSF class also inherits methods from the PALBase class. Please refer to PAL Base for more details.