GARCH
- class hana_ml.algorithms.pal.tsa.garch.GARCH(p=None, q=None, model_type=None)
Generalized AutoRegressive Conditional Heteroskedasticity (GARCH) is a statistical model used to analyze the variance of the error (innovation or residual) term in a time series. It is typically used in the analysis of financial data, for example to estimate the volatility of returns on stocks and bonds.
GARCH assumes that the variance of the error term is heteroskedastic, meaning that it is not constant over time; in practice, periods of high and low variance tend to cluster. GARCH further assumes that this variance follows an autoregressive moving average (ARMA) pattern; in other words, it is a weighted combination of past squared errors and past variances.
Assuming a time-series model:
\(y_t = \mu_t + \varepsilon_t\)
where \(\mu_t\) is called the mean model (it can be an ARMA model or just a constant value). The main quantity of interest is \(\sigma_t^2 = var(\varepsilon_t|F_{t-1})\), i.e. the conditional variance of \(\varepsilon_t\), where \(F_{t-1}\) stands for the information set known at time \(t-1\).
Then, a GARCH(p, q) model is defined as:
\(\sigma_t^2 = \alpha_0+\sum_{i=1}^p\alpha_i\varepsilon_{t-i}^2+\sum_{j=1}^q\beta_j\sigma_{t-j}^2\),
where \(\alpha_0 > 0\) and \(\alpha_i \geq 0, \beta_j\geq 0, i \in [1, p], j \in [1, q].\)
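The recursion above can be sketched in plain Python. This is only an illustration of the formula, not the PAL implementation: the function name `garch_variance` and the choice of start-up value `init_var` (used for terms that fall before the first observation) are assumptions made for this example.

```python
def garch_variance(eps, alpha0, alphas, betas, init_var=None):
    """Conditional variances sigma_t^2 of a GARCH(p, q) recursion.

    eps      : residuals epsilon_t (mean model already removed)
    alpha0   : constant term, must be > 0
    alphas   : [alpha_1, ..., alpha_p], weights on past squared residuals
    betas    : [beta_1, ..., beta_q], weights on past conditional variances
    init_var : stand-in for terms before the start of the sample
               (assumption: defaults to the sample mean of eps^2)
    """
    alphas, betas = list(alphas), list(betas)
    if alpha0 <= 0 or min(alphas + betas, default=0.0) < 0:
        raise ValueError("need alpha0 > 0 and all alpha_i, beta_j >= 0")
    if init_var is None:
        init_var = sum(e * e for e in eps) / len(eps)

    sigma2 = []
    for t in range(len(eps)):
        var_t = alpha0
        # sum_{i=1..p} alpha_i * eps_{t-i}^2
        for i, a in enumerate(alphas, start=1):
            var_t += a * (eps[t - i] ** 2 if t - i >= 0 else init_var)
        # sum_{j=1..q} beta_j * sigma_{t-j}^2
        for j, b in enumerate(betas, start=1):
            var_t += b * (sigma2[t - j] if t - j >= 0 else init_var)
        sigma2.append(var_t)
    return sigma2


# Made-up residuals and coefficients, GARCH(1, 1).
variances = garch_variance([1.0, -2.0, 0.5], 0.1, [0.2], [0.7], init_var=1.0)
# variances is approximately [1.0, 1.0, 1.6]
```

The large residual at the second step (-2.0) raises the variance at the third step, which is the clustering behaviour described above.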
In this procedure, it is assumed that \(\mu_t\) has already been subtracted from \(y_t\), so the input time series consists of \(\varepsilon_t\) only.
Another assumption is that \(\varepsilon_t | F_{t-1} \sim N(0,\sigma_t^2)\), so the model parameters can be estimated by maximum likelihood estimation (MLE).
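Under the normality assumption, maximum likelihood estimation amounts to choosing the coefficients that minimise the Gaussian negative log-likelihood of the residuals given their conditional variances. The sketch below only evaluates that objective for given inputs; the function name is an assumption for illustration, and the optimiser PAL actually uses is not described in this document.

```python
import math


def gaussian_neg_loglik(eps, sigma2):
    """Negative log-likelihood of residuals eps, where eps[t] is assumed
    to be N(0, sigma2[t]) conditional on the past."""
    if len(eps) != len(sigma2):
        raise ValueError("eps and sigma2 must have the same length")
    total = 0.0
    for e, s2 in zip(eps, sigma2):
        if s2 <= 0:
            raise ValueError("conditional variances must be positive")
        # -log N(e; 0, s2) = 0.5 * (log(2*pi) + log(s2) + e^2 / s2)
        total += 0.5 * (math.log(2.0 * math.pi) + math.log(s2) + e * e / s2)
    return total
```

An estimator would evaluate this objective repeatedly, each time recomputing the conditional variances from a candidate set of coefficients, and keep the coefficients with the smallest value.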
- Parameters
- p : int, optional
Specifies the number of lagged error terms in GARCH model.
Valid only when model_type is not "igarch". Defaults to 1.
- q : int, optional
Specifies the number of lagged variance terms in GARCH model.
Valid only when model_type is not "igarch". Defaults to 1.
- model_type : str, optional
Specifies the variant of GARCH model.
'garch' : the regular GARCH model.
'igarch' : the integrated GARCH model.
'tgarch' : the threshold GARCH model.
'egarch' : the exponential GARCH model.
Defaults to 'garch'.
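The defaults and constraints listed above can be summarised in a small helper. `resolve_garch_args` is a hypothetical function written for this illustration and is not part of hana_ml; the check that p and q are non-negative integers is likewise an assumption, since the valid range is not stated here.

```python
VALID_MODEL_TYPES = ("garch", "igarch", "tgarch", "egarch")


def resolve_garch_args(p=None, q=None, model_type=None):
    """Hypothetical helper: apply the documented defaults to p, q, model_type."""
    model_type = "garch" if model_type is None else model_type
    if model_type not in VALID_MODEL_TYPES:
        raise ValueError(f"model_type must be one of {VALID_MODEL_TYPES}")

    if model_type == "igarch":
        # p and q are documented as valid only when model_type is not 'igarch',
        # so they are dropped here rather than passed on.
        return {"model_type": model_type}

    p = 1 if p is None else p  # documented default
    q = 1 if q is None else q  # documented default
    for name, value in (("p", p), ("q", q)):
        # Assumption: orders are non-negative integers.
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer")
    return {"p": p, "q": q, "model_type": model_type}
```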
Examples
Input data for GARCH modeling
>>> data.collect()
    TIME  VAR1  VAR2 VAR3
0      1     2  0.17    A
1      2     2  0.19    A
2      3     2  0.28    A
3      4     2  0.35    A
4      5     2  1.04    A
5      6     2  1.12    A
6      7     2  1.99    A
7      8     2  0.73    A
8      9     2  0.50    A
9     10     2  0.32    A
10    11     2  0.40    A
11    12     2  0.38    A
12    13     2  0.33    A
13    14     2  0.39    A
14    15     2  0.98    A
15    16     2  0.70    A
16    17     2  0.89    A
17    18     2  1.21    A
18    19     2  1.32    A
19    20     2  1.10    A
Set up the hyper-parameters and train the GARCH model on the input data:
>>> from hana_ml.algorithms.pal.tsa.garch import GARCH
>>> gh = GARCH(p=1, q=1)
>>> gh.fit(data=data, key='TIME', endog='VAR2')
>>> gh.model_.collect()
   ROW_INDEX                                      MODEL_CONTENT
0          0  {"garch":{"factors":[0.13309395260165602,1.060...
Predicting future volatility of the given time-series data:
>>> pred_res, _ = gh.predict(horizon=5)
>>> pred_res.collect()
   STEP  VARIANCE RESIDUAL
0     1  1.415806     None
1     2  1.633979     None
2     3  1.865262     None
3     4  2.110445     None
4     5  2.370360     None
- Attributes
- model_ : DataFrame
DataFrame for storing the fitted GARCH model, structured as follows:
1st column : ROW_INDEX, type INTEGER
2nd column : MODEL_CONTENT, type NCLOB
Set to None if GARCH model is not fitted.
- variance_ : DataFrame
DataFrame for storing the variance information of the training data, structured as follows:
1st column : Same name and type as the index(timestamp) column in the training data.
2nd column : VARIANCE, type DOUBLE, representing the conditional variance of residual term.
3rd column : RESIDUAL, type DOUBLE, representing the residual value.
Set to None if GARCH model is not fitted.
- stats_ : DataFrame
DataFrame for storing the related statistics from fitting the GARCH model, structured as follows:
1st column : STAT_NAME, type NVARCHAR(1000)
2nd column : STAT_VALUE, type NVARCHAR(1000)
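One common use of the variance information is to compute standardized residuals, i.e. each residual divided by its conditional standard deviation. The sketch below works on plain tuples in the same column order as variance_ (index, VARIANCE, RESIDUAL); the function name and the sample values are made up for illustration and do not come from a fitted model.

```python
import math


def standardized_residuals(rows):
    """Return (index, residual / sqrt(variance)) for each input row.

    rows: iterable of (index, variance, residual) tuples, in the same
    column order as variance_. Rows with missing or non-positive values
    yield None instead of a number.
    """
    result = []
    for index, variance, residual in rows:
        if variance is None or residual is None or variance <= 0:
            result.append((index, None))
        else:
            result.append((index, residual / math.sqrt(variance)))
    return result


# Illustrative values only.
rows = [(1, 0.25, 0.5), (2, 4.0, -1.0), (3, 1.0, None)]
print(standardized_residuals(rows))  # [(1, 1.0), (2, -0.5), (3, None)]
```

Assuming collect() returns a pandas DataFrame, as the outputs in the Examples section suggest, such tuples could be obtained with something like `gh.variance_.collect().itertuples(index=False, name=None)`.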
Methods
fit(data[, key, endog, thread_ratio]) : Fits a GARCH model to the input time-series data.
predict([horizon]) : Predicts the variance of error terms in a time series based on the trained GARCH model.
- fit(data, key=None, endog=None, thread_ratio=None)
Fits a GARCH model to the input time-series data.
- Parameters
- data : DataFrame
Input data for fitting a GARCH model.
data should contain at least 2 columns, described as follows:
An index column of INTEGER or TIMESTAMP/DATE/SECONDDATE type, representing the time order (i.e. timestamp).
A numerical column representing the values of the time series.
- key : str, optional
Specifies the name of the index column in data.
Mandatory if data is not indexed, or is indexed by multiple columns.
Defaults to the single index column of data if not provided.
- endog : str, optional
Specifies the name of the column holding the values of the time series in data.
Cannot be the key column.
Defaults to the last non-key column in data.
- thread_ratio : float, optional
Specifies the ratio of available threads used for fitting the GARCH model.
0: single thread
0~1: percentage
Others: heuristically determined
Defaults to -1.
- Returns
- A fitted object of class "GARCH".
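The defaulting rules for key and endog can be sketched as follows. `resolve_columns` is a hypothetical helper written for this illustration, not a hana_ml function, and `index_columns` is an assumed stand-in for whatever index information the DataFrame carries.

```python
def resolve_columns(columns, key=None, endog=None, index_columns=()):
    """Hypothetical helper: pick the key and endog columns the way the
    parameter descriptions of fit() say they are chosen."""
    if key is None:
        # key is mandatory unless data is indexed by exactly one column.
        if len(index_columns) != 1:
            raise ValueError("key is required when data has no single index column")
        key = index_columns[0]
    if key not in columns:
        raise ValueError(f"unknown key column: {key}")

    if endog is None:
        # Default: the last non-key column.
        non_key = [c for c in columns if c != key]
        if not non_key:
            raise ValueError("data needs at least one non-key column")
        endog = non_key[-1]
    if endog == key:
        raise ValueError("endog cannot be the key column")
    if endog not in columns:
        raise ValueError(f"unknown endog column: {endog}")
    return key, endog


cols = ["TIME", "VAR1", "VAR2", "VAR3"]
print(resolve_columns(cols, key="TIME"))                # ('TIME', 'VAR3')
print(resolve_columns(cols, key="TIME", endog="VAR2"))  # ('TIME', 'VAR2')
```

With the columns from the Examples section, the default would select VAR3, which holds string values there. Passing endog='VAR2' explicitly, as that example does, selects the numeric series instead.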
- predict(horizon=None)
This function predicts variance of error terms in time series based on trained GARCH model.
- Parameters
- data : DataFrame
Time-series data for predicting the variance of error terms. It should contain at least 2 columns, described as follows:
An index column of INTEGER/TIMESTAMP type, representing the time order (i.e. timestamp).
A numerical column representing the values of the time series.
- horizon : int, optional
Specifies the number of steps to be forecasted.
Defaults to 1.
- Returns
- Two DataFrames, with the 1st one storing the variance information and the 2nd one storing related statistics.
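For the regular GARCH(1, 1) case, a multi-step variance forecast follows directly from the model definition: the first step uses the last observed residual and variance, and each later step replaces the unknown squared residual with its expected value, the previous variance forecast. The sketch below illustrates that textbook recursion with made-up coefficients; the function name is an assumption, and it is not a description of how PAL computes its predictions.

```python
def forecast_variance(alpha0, alpha1, beta1, last_eps, last_var, horizon=1):
    """Variance forecasts for steps 1..horizon of a GARCH(1, 1) model.

    last_eps : last observed residual epsilon_T
    last_var : last conditional variance sigma_T^2
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    # Step 1: plug the last observed values into the GARCH(1, 1) equation.
    forecasts = [alpha0 + alpha1 * last_eps ** 2 + beta1 * last_var]
    # Steps 2..horizon: E[eps^2] equals the previous variance forecast.
    for _ in range(horizon - 1):
        forecasts.append(alpha0 + (alpha1 + beta1) * forecasts[-1])
    return forecasts


# Made-up coefficients and last observations.
steps = forecast_variance(0.1, 0.2, 0.7, last_eps=2.0, last_var=1.0, horizon=3)
# steps is approximately [1.6, 1.54, 1.486]
```

When alpha1 + beta1 is below 1, as in this example, the forecasts decay towards the long-run variance alpha0 / (1 - alpha1 - beta1); when the sum exceeds 1 they grow with each step.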
- property fit_hdbprocedure
Returns the generated hdbprocedure for fit.
- property predict_hdbprocedure
Returns the generated hdbprocedure for predict.