LSTM
- class hana_ml.algorithms.pal.tsa.lstm.LSTM(learning_rate=None, gru=None, batch_size=None, time_dim=None, hidden_dim=None, num_layers=None, max_iter=None, interval=None, optimizer_type=None, stateful=None, bidirectional=None)
Long short-term memory (LSTM).
- Parameters
- learning_rate : float, optional
Learning rate for gradient descent.
Defaults to 0.01.
- gru : {'gru', 'lstm'}, optional
Type of recurrent unit to use: GRU or LSTM.
Defaults to 'lstm'.
- batch_size : int, optional
Number of training samples processed in one iteration.
Defaults to 32.
- time_dim : int, optional
Number of time steps in each sequence that LSTM/GRU is trained on and then uses for time series prediction.
The value must be smaller than the length of the input time series minus 1.
Defaults to 16.
- hidden_dim : int, optional
Number of hidden neurons in each LSTM/GRU unit.
Defaults to 128.
- num_layers : int, optional
Number of layers in LSTM/GRU unit.
Defaults to 1.
- max_iter : int, optional
Number of batches of data on which LSTM/GRU is trained.
Defaults to 1000.
- interval : int, optional
Outputs the average loss over each block of interval iterations.
Defaults to 100.
- optimizer_type : {'SGD', 'RMSprop', 'Adam', 'Adagrad'}, optional
Optimizer used for training.
Defaults to 'Adam'.
- stateful : bool, optional
If True, enables stateful LSTM/GRU.
Defaults to True.
- bidirectional : bool, optional
If True, uses BiLSTM/BiGRU; otherwise uses LSTM/GRU.
Defaults to False.
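The time_dim constraint stated in the parameter list can be checked on the client side before training. The following is a minimal sketch and not part of hana_ml: the helper name is_valid_time_dim and the lower bound of 1 are assumptions made for illustration; only the upper bound (time_dim smaller than the series length minus 1) comes from the parameter description.

```python
def is_valid_time_dim(series_length: int, time_dim: int = 16) -> bool:
    """Check the documented constraint: time_dim < series_length - 1.

    The lower bound of 1 is an assumption, not taken from the documentation.
    The default of 16 mirrors the documented default of time_dim.
    """
    return 1 <= time_dim < series_length - 1


# A series of 100 points leaves plenty of room for 16 time steps.
print(is_valid_time_dim(100, 16))
# A series of 17 points is too short: 16 is not smaller than 17 - 1.
print(is_valid_time_dim(17, 16))
```

Running such a check before calling fit gives an early, local error instead of a failure inside the database procedure.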
Examples
Input dataframe df:
>>> df.head(3).collect()
   TIMESTAMP  SERIES
0          0    20.7
1          1    17.9
2          2    18.8
Create LSTM model:
>>> from hana_ml.algorithms.pal.tsa.lstm import LSTM
>>> lstm = LSTM(gru='lstm', bidirectional=False, time_dim=16,
...             max_iter=1000, learning_rate=0.01, batch_size=32,
...             hidden_dim=128, num_layers=1, interval=1,
...             stateful=False, optimizer_type='Adam')
Perform fit on the given data:
>>> lstm.fit(df)
Perform predict on the fitted model:
>>> res = lstm.predict(df_predict)
Expected output:
>>> res.head(3).collect()
   ID      VALUE                                        REASON_CODE
0   0  11.673560  [{"attr":"T=0","pct":28.926935203430372,"val":...
1   1  14.057195  [{"attr":"T=3","pct":24.729787064691735,"val":...
2   2  15.119411  [{"attr":"T=2","pct":41.616207151605458,"val":...
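Each REASON_CODE value in the prediction output is a JSON array whose entries carry attr, pct, and val fields. As a rough sketch of how such a string could be inspected after collecting the result, the snippet below uses only the standard json module. The helper top_attributions and the numeric values in the sample string are hypothetical; only the field names are taken from the truncated output in the example. Ranking by pct is an assumption about which field expresses the relative contribution.

```python
import json

# Hypothetical REASON_CODE string. The field names match the sample output,
# but every number here is made up for illustration.
reason_code = (
    '[{"attr":"T=0","pct":30.0,"val":1.5},'
    '{"attr":"T=3","pct":20.0,"val":-1.0}]'
)


def top_attributions(reason_code_json, k=1):
    """Return the k (attr, pct) pairs with the largest pct value."""
    entries = json.loads(reason_code_json)
    ranked = sorted(entries, key=lambda entry: entry["pct"], reverse=True)
    return [(entry["attr"], entry["pct"]) for entry in ranked[:k]]


print(top_attributions(reason_code, k=1))
```

The same function could be applied to the REASON_CODE column of the collected result, one row at a time.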
- Attributes
- loss_ : DataFrame
Loss.
- model_ : DataFrame
Model content.
Methods
build_report(): Generate time series report.
fit(data[, key, endog, exog]): Generates LSTM models with given parameters.
generate_html_report([filename]): Display function.
generate_notebook_iframe_report(): Display function.
predict(data[, top_k_attributions]): Makes time series forecast based on the LSTM model.
- fit(data, key=None, endog=None, exog=None)
Generates LSTM models with given parameters.
- Parameters
- data : DataFrame
Input data, structured as follows.
The 1st column : index/timestamp, type INTEGER.
The 2nd column : time-series value, type INTEGER, DOUBLE, or DECIMAL(p,s).
Other columns : external data (regressors), type INTEGER, DOUBLE, DECIMAL(p,s), VARCHAR or NVARCHAR.
- key : str, optional
The timestamp column of data. The type of key column is INTEGER.
Defaults to the first column of data if the index column of data is not provided. Otherwise, defaults to the index column of data.
- endog : str, optional
The endogenous variable, i.e. the time series. The type of endog column is INTEGER, DOUBLE, or DECIMAL(p, s).
Defaults to the first non-key column of data if not provided.
- exog : str or list of str, optional
The exogenous variables. The type of exog column is INTEGER, DOUBLE, or DECIMAL(p, s).
Defaults to None. Set this parameter explicitly if you have exogenous variables.
- Returns
- A fitted object of class "LSTM".
- predict(data, top_k_attributions=None)
Makes time series forecast based on the LSTM model.
- Parameters
- data : DataFrame
Data for prediction. Every row in data should contain one record for prediction, structured as follows:
First column : Record ID, type INTEGER.
Other columns : Time-series and external data values, arranged in time order.
The number of columns, excluding the first ID column, should equal time_dim * (M-1), where M is the number of columns of the input data in the training phase.
- top_k_attributions : int, optional
Specifies the number of features with highest attributions to output.
Defaults to 10 or 0 depending on the SAP HANA version.
- Returns
- DataFrame
The forecasted values, structured as follows:
ID, type INTEGER, timestamp.
VALUE, type DOUBLE, forecast value.
REASON_CODE, type NCLOB, Sorted SHAP values for test data at each time step.
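The row layout that predict expects (one ID column followed by time_dim * (M-1) value columns) can be sketched in plain Python for the univariate case, where the training data has M = 2 columns (timestamp and series). This is only an illustration under stated assumptions: the helpers expected_value_columns and build_predict_rows are hypothetical and not part of hana_ml, the sliding-window construction is one possible way to produce such rows, and the last two series values are made up. Loading the rows into an SAP HANA table is left out.

```python
def expected_value_columns(time_dim, num_training_columns):
    """Number of non-ID columns required by predict: time_dim * (M - 1)."""
    return time_dim * (num_training_columns - 1)


def build_predict_rows(values, time_dim):
    """Slice a univariate series into rows [ID, v_0, ..., v_{time_dim-1}].

    Each row holds time_dim consecutive values in time order; the ID is
    simply the index of the first value in the window (an assumption).
    """
    if time_dim < 1 or time_dim > len(values):
        raise ValueError("time_dim must be between 1 and len(values)")
    rows = []
    for start in range(len(values) - time_dim + 1):
        window = values[start:start + time_dim]
        rows.append([start] + list(window))
    return rows


# First three values follow the example data; the last two are illustrative.
series = [20.7, 17.9, 18.8, 14.6, 15.8]
rows = build_predict_rows(series, time_dim=3)
for row in rows:
    print(row)
```

With exogenous columns (M greater than 2), each time step would contribute M-1 values, so the rows grow accordingly; how those values are interleaved is not specified here and should be checked against the SAP HANA PAL documentation.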
- build_report()
Generate time series report.
- generate_html_report(filename=None)
Display function.
- generate_notebook_iframe_report()
Display function.
- property fit_hdbprocedure
Returns the generated hdbprocedure for fit.
- property predict_hdbprocedure
Returns the generated hdbprocedure for predict.