OnlineBCPD

class hana_ml.algorithms.pal.tsa.changepoint.OnlineBCPD(alpha=None, beta=None, kappa=None, mu=None, lamb=None, threshold=None, delay=None, prune=None)

Online Bayesian Change-point detection (OnlineBCPD).

Parameters
alpha : float, optional

Parameter of t-distribution.

Defaults to 0.1.

beta : float, optional

Parameter of t-distribution.

Defaults to 0.01.

kappa : float, optional

Parameter of t-distribution.

Defaults to 1.0.

mu : float, optional

Parameter of t-distribution.

Defaults to 0.0.

lamb : float, optional

Parameter of constant hazard function.

Defaults to 250.0.

threshold : float, optional

Threshold to determine a change point:

  • 0: Return the probability of change point for every time step.

  • Between 0 and 1: Return only the time steps whose change-point probability is above the threshold.

Defaults to 0.0.

delay : int, optional

Number of subsequent time steps to observe before deciding whether the current time step is a change point.

Defaults to 3.

prune : bool, optional

Whether to reduce the size of the model table after every run:

  • False: Do not prune.

  • True: Prune.

Defaults to False.
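
The parameter descriptions above do not spell out how alpha, beta, kappa, mu and lamb enter the computation. The sketch below assumes the standard Bayesian online change-point formulation, which this page does not confirm: a Normal-Gamma prior on the mean and precision of each segment (giving a Student-t predictive distribution) and a constant hazard of 1/lamb. The function names are hypothetical and are not part of hana_ml. The example model output in the Examples section fits this pattern: ALPHA grows by 0.5 and KAPPA by 1 per row, and the first PROB value, 4.0e-03, equals 1/250.

```python
import math


def constant_hazard(lamb):
    """Assumed prior probability of a change point at any single time step."""
    return 1.0 / lamb


def normal_gamma_update(alpha, beta, kappa, mu, x):
    """Assumed conjugate update of the hyperparameters after observing x."""
    new_alpha = alpha + 0.5
    new_beta = beta + kappa * (x - mu) ** 2 / (2.0 * (kappa + 1.0))
    new_kappa = kappa + 1.0
    new_mu = (kappa * mu + x) / (kappa + 1.0)
    return new_alpha, new_beta, new_kappa, new_mu


def predictive_t_params(alpha, beta, kappa, mu):
    """Degrees of freedom, location and scale of the Student-t predictive."""
    df = 2.0 * alpha
    scale = math.sqrt(beta * (kappa + 1.0) / (alpha * kappa))
    return df, mu, scale


# Values taken from the documented defaults.
print(constant_hazard(250.0))
print(predictive_t_params(0.1, 0.01, 1.0, 0.0))
```

Under these assumptions, a larger lamb makes change points less likely a priori, while alpha, beta, kappa and mu set the shape of the predictive distribution before any data is seen.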

Examples

Input Data:

>>> df.collect()
   ID        VAL
0   0   9.926943
1   1   9.262971
2   2   9.715766
3   3   9.944334
4   4   9.577682
5   5  10.036977
6   6   9.513112
7   7  10.233246
8   8  10.159134
9   9   9.759518
.......

Create an OnlineBCPD instance:

>>> from hana_ml.algorithms.pal.tsa.changepoint import OnlineBCPD
>>> obcpd = OnlineBCPD(alpha=0.1,
...                    beta=0.01,
...                    kappa=1.0,
...                    mu=0.0,
...                    delay=5,
...                    threshold=0.5,
...                    prune=True)

Invoke fit_predict():

>>> model, cp = obcpd.fit_predict(data=df, model=None)

Output:

>>> print(model.head(5).collect())
   ID  ALPHA        BETA  KAPPA         MU          PROB
0   0    0.1    0.010000    1.0   0.000000  4.000000e-03
1   1    0.6   71.013179    2.0   8.426338  6.478577e-05
2   2    1.1   86.966340    3.0  10.732357  7.634862e-06
3   3    1.6  100.514641    4.0  12.235038  1.540977e-06
4   4    2.1  107.197565    5.0  13.052529  3.733699e-07
>>> print(cp.collect())
   ID  POSITION  PROBABILITY
0   0        58     0.989308
1   1       249     0.991023
2   2       402     0.994154
3   3       539     0.981004
4   4       668     0.994708

Attributes
fit_hdbprocedure

Returns the generated hdbprocedure for fit.

predict_hdbprocedure

Returns the generated hdbprocedure for predict.

Methods

fit_predict(data[, key, endog, model])

Detects change-points of the input data.

get_stats()

Gets the statistics.

property fit_hdbprocedure

Returns the generated hdbprocedure for fit.

property predict_hdbprocedure

Returns the generated hdbprocedure for predict.

fit_predict(data, key=None, endog=None, model=None)

Detects change-points of the input data.

Parameters
data : DataFrame

Input time-series data for change-point detection.

key : str, optional

Name of the column that holds the time stamps of the input time-series data.

If key is not provided, it defaults to the index column of data when that index is a single column, and to the first column of data otherwise.

endog : str, optional

Name of the column that holds the values of the input time-series data.

Defaults to the first non-key column.

model : DataFrame, optional

The model for change point detection.

Defaults to self.model_ (the default value of self.model_ is None).

Returns
DataFrame 1

The model table (columns ID, ALPHA, BETA, KAPPA, MU and PROB in the example above).

DataFrame 2

The detected change points (columns ID, POSITION and PROBABILITY in the example above).
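
Because fit_predict both accepts and returns a model, data that arrives in batches can be processed by carrying the model forward from one call to the next. The helper below is a minimal sketch of that pattern. detect_in_batches is a hypothetical name, not part of hana_ml, and the sketch assumes that the returned model DataFrame is a valid value for the model argument of the following call.

```python
def detect_in_batches(detector, batches, key=None, endog=None):
    """Call fit_predict once per batch, passing the previous model along."""
    model = None  # the first call starts without an existing model
    change_points = []
    for batch in batches:
        # Each call returns the updated model and that batch's change points.
        model, cp = detector.fit_predict(data=batch, key=key,
                                         endog=endog, model=model)
        change_points.append(cp)
    return model, change_points
```

Here detector would be an OnlineBCPD instance and batches an iterable of hana_ml DataFrames. Passing the model explicitly keeps the data flow visible, although the documented default (self.model_) suggests the instance also keeps the latest model itself.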

get_stats()

Gets the statistics.

Returns
DataFrame

Statistics.

Inherited Methods from PALBase

In addition to the methods above, the OnlineBCPD class inherits methods from the PALBase class. Refer to PAL Base for details.