LTSF
- class hana_ml.algorithms.pal.tsa.ltsf.LTSF(num_levels=None, kernel_size=None, hidden_expansion=None, batch_size=None, num_epochs=None, random_seed=None, position_encoding=None, adjust_learning_rate=None, learning_rate=None, dropout_prob=None)
Long-Term Series Forecasting (LTSF).
Traditional algorithms can predict values in the near future, but their performance deteriorates sharply in long-term series forecasting. This class uses a deep learning neural network architecture designed to achieve state-of-the-art long-term forecasting performance among the PAL algorithms.
- Parameters
- num_levelsint, optional
Number of levels in the network architecture. Note that if warm_start = True in fit(), this parameter is not valid.
Defaults to 2.
- kernel_sizeint, optional
Kernel size of Conv1d layer. Note that if warm_start = True in fit(), this parameter is not valid.
Defaults to 3.
- hidden_expansionint, optional
Expands the input channel size of Conv1d layer. Note that if warm_start = True in fit(), this parameter is not valid.
Defaults to 3.
- batch_sizeint, optional
Number of training samples processed in one iteration.
Defaults to 8.
- num_epochsint, optional
Number of training epochs.
Defaults to 1.
- random_seedint, optional
Seed for the random number generator. 0 means the machine time is used as the seed.
Defaults to 0.
- position_encoding: bool, optional
Position encoding adds extra positional embeddings to the training series.
False: Do not use. True: Use.
Defaults to True.
- adjust_learning_rate: bool, optional
Halves the learning rate after every epoch.
False: Do not use. True: Use.
Defaults to True.
- learning_ratefloat, optional
Initial learning rate for the Adam optimizer.
Defaults to 0.005.
- dropout_probfloat, optional
Dropout probability of Dropout layer.
Defaults to 0.05.
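As a rough illustration of adjust_learning_rate, the sketch below computes the learning rate used in each epoch, assuming the rate is exactly halved after every completed epoch, starting from learning_rate. The helper name halved_learning_rates is hypothetical and not part of hana_ml; the actual schedule is applied inside PAL.

```python
def halved_learning_rates(learning_rate, num_epochs):
    """Return the learning rate assumed for each epoch (hypothetical helper).

    Assumption: the rate is multiplied by 0.5 after every completed epoch.
    """
    return [learning_rate * 0.5 ** epoch for epoch in range(num_epochs)]


# With the documented default learning_rate = 0.005 and three epochs:
rates = halved_learning_rates(0.005, 3)
print(rates)
```

Under this assumption the rate shrinks quickly, so a large num_epochs with adjust_learning_rate = True may leave later epochs with very small updates.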
Examples
Given the input DataFrame df_fit, create an instance of LTSF:
>>> ltsf = LTSF(num_levels = 2, kernel_size = 3, hidden_expansion = 3, batch_size = 8, num_epochs = 2, position_encoding = True, adjust_learning_rate = True, learning_rate = 0.005, dropout_prob = 0.2, random_seed = 1)
Performing fit() on the given dataframe:
>>> ltsf.fit(data=df_fit, train_length=32, forecast_length=16, key="TIME_STAMP", endog="TARGET", exog=["FEAT1", "FEAT2", "FEAT3", "FEAT4"])
>>> ltsf.loss_.collect()
    EPOCH          BATCH      LOSS
0       1              0  1.177407
1       1              1  0.925078
2       1              2  0.798042
3       1              3  0.712275
4       1              4  0.702966
5       1              5  0.703366
6       1  epoch average  0.836522
7       2              0  0.664331
8       2              1  0.608385
9       2              2  0.614841
10      2              3  0.626234
11      2              4  0.623597
12      2              5  0.571699
13      2  epoch average  0.618181
Given the input DataFrame for prediction, df_predict, perform predict():
>>> result = ltsf.predict(data=df_predict)
>>> result.collect()
    ID   FORECAST
1    0   52.28396
2    1   57.03466
3    2   69.49162
4    3   68.06987
5    4   40.43507
6    5   55.53528
7    6   54.17256
8    7   39.32336
9    8   25.51410
10   9  102.11331
11  10  134.10745
12  11   48.32333
13  12   46.47223
14  13   72.44048
15  14   65.29192
16  15   69.33713
Continuous training is also supported and is controlled by the warm_start parameter. Training continues from the model stored in the model_ attribute of the LTSF object. You can also use load_model() to load a trained model for continuous training.
>>> ltsf.num_epochs = 1
>>> ltsf.dropout_prob = 0.2
>>> ltsf.learning_rate = 0.002
>>> ltsf.fit(df_fit, key="TIME_STAMP", endog="TARGET", exog=["FEAT1", "FEAT2", "FEAT3", "FEAT4"], warm_start=True)
- Attributes
- model_DataFrame
Trained model content.
- loss_DataFrame
Training loss information, reported for each batch ID and as the average batch loss of each epoch.
Methods
fit(data[, train_length, forecast_length, ...]) Train a LTSF model with given parameters.
predict(data[, key, endog, allow_new_index]) Makes time series forecast based on a LTSF model.
- fit(data, train_length=None, forecast_length=None, key=None, endog=None, exog=None, warm_start=False)
Train a LTSF model with given parameters.
- Parameters
- dataDataFrame
Input data.
- train_lengthint
Length of the training series fed into the network. Note that in warm_start mode, this parameter is not valid.
- forecast_lengthint
Length of predictions. Note that in warm_start mode, this parameter is not valid.
- keystr, optional
The timestamp column of data. The type of key column should be INTEGER, TIMESTAMP, DATE or SECONDDATE.
Defaults to the first column of data if the index column of data is not provided. Otherwise, defaults to the index column of data.
- endogstr, optional
The endogenous variable, i.e. target time series. The type of endog column could be INTEGER, DOUBLE or DECIMAL(p,s).
Defaults to the first non-key column.
- exogstr or a list of str, optional
An optional array of exogenous variables. The type of exog column could be INTEGER, DOUBLE or DECIMAL(p,s).
Defaults to None. Please set this parameter explicitly if you have exogenous variables.
- warm_startbool, optional
When set to True, reuse the model_ of the current object to continue training. The load_model method can be used to load another model. Otherwise, a new model is trained.
Defaults to False.
- Returns
- A fitted object of class "LTSF".
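To make the roles of train_length and forecast_length concrete, the following sketch splits a plain Python list into (input, target) pairs with a sliding window of stride 1. This windowing scheme is an assumption for illustration only; this reference does not state how PAL builds its training samples, and make_windows is a hypothetical helper, not a hana_ml function.

```python
def make_windows(series, train_length, forecast_length):
    """Split a series into (input, target) pairs (hypothetical helper).

    Each input holds train_length consecutive values; its target holds the
    forecast_length values that immediately follow. Stride 1 is assumed.
    """
    total = train_length + forecast_length
    windows = []
    for start in range(len(series) - total + 1):
        inputs = series[start:start + train_length]
        targets = series[start + train_length:start + total]
        windows.append((inputs, targets))
    return windows


series = list(range(10))
windows = make_windows(series, train_length=4, forecast_length=2)
first_input, first_target = windows[0]
print(len(windows), first_input, first_target)
```

In this picture, a series shorter than train_length + forecast_length yields no training pairs at all, which is one reason to choose both lengths with the size of the data in mind.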
- predict(data, key=None, endog=None, allow_new_index=True)
Makes time series forecast based on a LTSF model. The number of rows of the input predict data must be equal to the value of train_length used during training, and the length of predictions is equal to the value of forecast_length.
- Parameters
- dataDataFrame
Input data which contains the target time series and exogenous features (optional). Note that the data here requires an endog column as the input of the trained model for prediction.
- keystr, optional
Name of the ID column.
Mandatory if data is not indexed, or the index of data contains multiple columns.
Defaults to the single index column of data if not provided.
- endogstr, optional
The endogenous variable, i.e. target time series. The type of endog column could be INTEGER, DOUBLE or DECIMAL(p,s).
Note that endog represents the input values for a LTSF model to make predictions. The length of predictions is equal to the value of forecast_length set in fit().
Defaults to the first non-key column.
- allow_new_indexbool, optional
Indicates whether a new index column is allowed in the result.
True: return the result with a new integer or timestamp index column.
False: return the result with an index column starting from 0.
Defaults to True.
- Returns
- DataFrame
Forecasted values, structured as follows:
ID, type INTEGER, timestamp.
VALUE, type DOUBLE, forecast value.
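Because predict() expects exactly train_length input rows, a caller may need to trim the available history before building the prediction DataFrame. The sketch below shows that selection on a plain Python list of (timestamp, value) rows. latest_rows is a hypothetical helper, and picking the most recent rows is an assumption about typical usage rather than a requirement stated in this reference.

```python
def latest_rows(rows, train_length):
    """Return the last train_length rows (hypothetical helper).

    Raises ValueError when train_length is not positive or when fewer than
    train_length rows are available, since predict() needs exactly
    train_length input rows.
    """
    if train_length <= 0:
        raise ValueError("train_length must be a positive integer")
    if len(rows) < train_length:
        raise ValueError(
            f"need {train_length} rows for prediction, got {len(rows)}"
        )
    return rows[-train_length:]


# 40 rows of made-up history, ordered by timestamp.
history = [(t, float(t) * 1.5) for t in range(40)]
predict_rows = latest_rows(history, train_length=32)
print(len(predict_rows), predict_rows[0][0], predict_rows[-1][0])
```

With a hana_ml DataFrame the same selection would be done on the database side before calling predict(); the list-based version here only illustrates the row-count constraint.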
- property fit_hdbprocedure
Returns the generated hdbprocedure for fit.
- property predict_hdbprocedure
Returns the generated hdbprocedure for predict.