OnlineBCPD
- class hana_ml.algorithms.pal.tsa.changepoint.OnlineBCPD(alpha=None, beta=None, kappa=None, mu=None, lamb=None, threshold=None, delay=None, prune=None)
Online Bayesian Change-point detection (OnlineBCPD).
- Parameters
- alpha : float, optional
Parameter of the t-distribution.
Defaults to 0.1.
- beta : float, optional
Parameter of the t-distribution.
Defaults to 0.01.
- kappa : float, optional
Parameter of the t-distribution.
Defaults to 1.0.
- mu : float, optional
Parameter of the t-distribution.
Defaults to 0.0.
- lamb : float, optional
Parameter of the constant hazard function.
Defaults to 250.0.
- threshold : float, optional
Threshold for deciding whether a time step is a change point:
0: Return the change-point probability for every time step.
Between 0 and 1: Return only the time steps whose probability is above the threshold.
Defaults to 0.0.
- delay : int, optional
Number of incoming time steps used to determine whether the current time step is a change point.
Defaults to 3.
- prune : bool, optional
Whether to reduce the size of the model table after every run:
False: Do not prune.
True: Prune.
Defaults to False.
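The parameter names follow the usual Bayesian online change-point detection setup, in which alpha, beta, kappa, and mu are Normal-Gamma prior hyperparameters with a Student t predictive distribution, and lamb sets a constant hazard rate of 1/lamb. The sketch below assumes that textbook formulation. It is not taken from the PAL implementation, and the function names are hypothetical.

```python
def constant_hazard(lamb):
    """Constant hazard: prior probability of a change point at any step."""
    return 1.0 / lamb


def normal_gamma_update(alpha, beta, kappa, mu, x):
    """Textbook Normal-Gamma posterior update after observing one value x.

    Assumption: this is the standard conjugate update, shown only to
    illustrate the role of the hyperparameters.
    """
    new_alpha = alpha + 0.5
    new_beta = beta + kappa * (x - mu) ** 2 / (2.0 * (kappa + 1.0))
    new_kappa = kappa + 1.0
    new_mu = (kappa * mu + x) / (kappa + 1.0)
    return new_alpha, new_beta, new_kappa, new_mu


# Start from the documented defaults and absorb one observation.
prior = (0.1, 0.01, 1.0, 0.0)  # alpha, beta, kappa, mu
posterior = normal_gamma_update(*prior, x=9.926943)
hazard = constant_hazard(250.0)  # documented default for lamb
```

Under this formulation alpha grows by 0.5 and kappa by 1 per observation, and the default lamb of 250.0 gives a hazard of 0.004.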
Examples
Input Data:
>>> df.collect()
   ID        VAL
0   0   9.926943
1   1   9.262971
2   2   9.715766
3   3   9.944334
4   4   9.577682
5   5  10.036977
6   6   9.513112
7   7  10.233246
8   8  10.159134
9   9   9.759518
.......
Create an OnlineBCPD instance:
>>> obcpd = OnlineBCPD(alpha=0.1, beta=0.01, kappa=1.0, mu=0.0, delay=5, threshold=0.5, prune=True)
Invoke fit_predict():
>>> model, cp = obcpd.fit_predict(data=df, model=None)
Output:
>>> print(model.head(5).collect())
   ID  ALPHA        BETA  KAPPA         MU          PROB
0   0    0.1    0.010000    1.0   0.000000  4.000000e-03
1   1    0.6   71.013179    2.0   8.426338  6.478577e-05
2   2    1.1   86.966340    3.0  10.732357  7.634862e-06
3   3    1.6  100.514641    4.0  12.235038  1.540977e-06
4   4    2.1  107.197565    5.0  13.052529  3.733699e-07
>>> print(cp.collect())
   ID  POSITION  PROBABILITY
0   0        58     0.989308
1   1       249     0.991023
2   2       402     0.994154
3   3       539     0.981004
4   4       668     0.994708
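The example sets threshold=0.5, so the change-point result lists only positions whose probability exceeds 0.5. The library-free sketch below illustrates the documented threshold semantics. The helper `apply_threshold` and the sample probabilities are hypothetical, and the actual filtering is performed inside PAL.

```python
def apply_threshold(probabilities, threshold=0.0):
    """Mimic the documented threshold semantics on a list of probabilities.

    threshold == 0: keep every time step with its probability.
    threshold  > 0: keep only steps whose probability is above the threshold.
    """
    if threshold == 0.0:
        return list(enumerate(probabilities))
    return [(i, p) for i, p in enumerate(probabilities) if p > threshold]


# Hypothetical per-step change-point probabilities.
sample = [0.004, 0.02, 0.989308, 0.10, 0.991023]

all_steps = apply_threshold(sample)               # every step is returned
detected = apply_threshold(sample, threshold=0.5)  # only steps above 0.5
```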
- Attributes
fit_hdbprocedure
Returns the generated hdbprocedure for fit.
predict_hdbprocedure
Returns the generated hdbprocedure for predict.
Methods
fit_predict(data[, key, endog, model])
Detects change-points of the input data.
get_stats()
Gets the statistics.
- property fit_hdbprocedure
Returns the generated hdbprocedure for fit.
- property predict_hdbprocedure
Returns the generated hdbprocedure for predict.
- fit_predict(data, key=None, endog=None, model=None)
Detects change-points of the input data.
- Parameters
- data : DataFrame
Input time-series data for change-point detection.
- key : str, optional
Name of the time-stamp column of the input time-series data.
If key is not provided, it defaults to the index column of data when that index is a single column; otherwise it defaults to the first column of data.
- endog : str, optional
Name of the value column of the input time-series data. Defaults to the first non-key column.
- model : DataFrame, optional
The model for change-point detection.
Defaults to self.model_ (the default value of self.model_ is None).
- Returns
- DataFrame 1
Model.
- DataFrame 2
The detected change points.
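Because fit_predict returns the model together with the change points and also accepts a model argument, a streaming workflow can presumably feed the model from one batch into the next call. The sketch below assumes that usage. `detect_in_batches` is a hypothetical helper, `detector` stands for an OnlineBCPD instance, and `batches` stands for an iterable of DataFrames prepared by the caller.

```python
def detect_in_batches(detector, batches, key=None, endog=None):
    """Run change-point detection batch by batch, carrying the model forward.

    Assumption: passing the model returned by one fit_predict() call into
    the next call continues detection from the previous state.
    """
    model = None  # first call starts without a prior model
    results = []
    for batch in batches:
        model, change_points = detector.fit_predict(
            data=batch, key=key, endog=endog, model=model
        )
        results.append(change_points)
    return model, results
```

The helper returns the final model and one change-point result per batch, so the model can be stored and reused when more data arrives.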
- get_stats()
Gets the statistics.
- Returns
- DataFrame
Statistics.
Inherited Methods from PALBase
Besides the methods listed above, the OnlineBCPD class also inherits methods from the PALBase class. Refer to PAL Base for more details.