hanaml.CPD is an R wrapper
for the SAP HANA PAL change-point detection algorithm.
hanaml.CPD(
data,
key,
features = NULL,
cost = NULL,
penalty = NULL,
custom.pen = NULL,
solver = NULL,
lambda = NULL,
min.size = NULL,
min.sep = NULL,
max.k = NULL,
dispersion = NULL,
lambda.range = NULL,
max.iter = NULL,
range.penalty = NULL
)
Arguments
| data |
DataFrame
Input data for change-point detection.
|
| key |
character, optional
Column name for time-stamp of the input time-series data.
|
| features |
character, optional
Column name(s) for the value(s) of the input time-series data.
|
| cost |
c("normal.mse", "normal.rbf", "normal.mhlb", "normal.mv", "linear", "gamma", "poisson",
"exponential", "normal.m", "negbinomial"), optional
The cost function for change-point detection.
Defaults to 'normal_mse'.
|
| penalty |
c("aic", "bic", "mbic", "oracle", "custom"), optional
The penalty function for change-point detection.
Defaults to
(1) "aic" if solver is "pruneddp", "pelt" or "opt",
(2) "custom" if solver is "adppelt".
|
| custom.pen |
numeric, optional
Specifies the value of the customized penalty.
Valid when penalty is "custom" or solver is "adppelt".
|
| solver |
c("pelt", "opt", "adppelt", "pruneddp"), optional
Method for finding the change-points for the given data, cost and penalty.
Each solver supports different cost and penalty functions.
1 For cost functions, "pelt", "opt" and "adppelt" support the following eight:
"normal_mse", "normal_rbf", "normal_mhlb", "normal_mv",
"linear", "gamma", "poisson", "exponential";
while "pruneddp" supports the following four cost functions:
"poisson", "exponential", "normal_m", "negbinomial".
2 For penalty functions, "pruneddp" supports all penalties,
"pelt" and "opt" support the following three:
"aic", "bic", "custom", while "adppelt" only supports the "custom" penalty.
Defaults to "pelt". |
| lambda |
double, optional
Assigned weight of the penalty w.r.t. the cost function, i.e. the penalization factor.
It can be seen as a trade-off between the speed and the accuracy of the detection algorithm.
A small value (usually less than 0.1) will dramatically improve efficiency.
Defaults to 0.02, and valid only when solver is "pelt" or "adppelt".
|
| min.size |
integer, optional
The minimal length from the very beginning of the series within which a change would not happen.
Defaults to 2, valid only when solver is "opt", "pelt" or "adppelt".
|
| min.sep |
integer, optional
The minimal length of separation between consecutive change-points.
Defaults to 1, valid only when solver is "opt", "pelt" or "adppelt".
|
| max.k |
integer, optional
The maximum number of change-points to be detected. If the given value is less
than 1, this number would be determined automatically from the input data.
Defaults to 0, valid only when solver is "pruneddp".
|
| dispersion |
double, optional
Dispersion coefficient for the Gamma and negative binomial distributions.
Defaults to 1.0, valid only when cost is "gamma" or "negbinomial".
|
| lambda.range |
two numerical values, optional (deprecated)
Specifies the range for the customized penalty.
Valid when solver is "adppelt" and custom.pen is not specified.
Deprecated; please use range.penalty instead. |
| max.iter |
integer, optional
Maximum number of iterations for searching the best penalty.
Valid only when solver is "adppelt".
Defaults to 40.
|
| range.penalty |
list/vector of two numerical values, optional
Specifies the range for the customized penalty.
Valid when solver is "adppelt" and custom.pen is not specified.
Defaults to c(0.01, 100). |
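The compatibility rules listed under the solver argument can be checked before a call is sent to the database. The following is a minimal sketch in base R. The helper check.cpd.combo is hypothetical and not part of hanaml. It assumes the dotted spelling of the cost names (e.g. "normal.mse") shown in the cost argument list, and it treats "adppelt" as accepting only the "custom" penalty.

```r
# Hypothetical helper (not part of hanaml): reports which parts of a
# solver/cost/penalty combination fall outside the documented rules.
# Assumption: cost names use the dotted spelling, e.g. "normal.mse".
check.cpd.combo <- function(solver = "pelt", cost = NULL, penalty = NULL) {
  solvers <- c("pelt", "opt", "adppelt", "pruneddp")
  if (!solver %in% solvers) stop("unknown solver: ", solver)

  # Costs documented for "pruneddp" versus the other three solvers
  dp.costs    <- c("poisson", "exponential", "normal.m", "negbinomial")
  other.costs <- c("normal.mse", "normal.rbf", "normal.mhlb", "normal.mv",
                   "linear", "gamma", "poisson", "exponential")
  ok.costs <- if (solver == "pruneddp") dp.costs else other.costs

  # Penalties documented per solver; the last entry is the switch default
  ok.pens <- switch(solver,
    pruneddp = c("aic", "bic", "mbic", "oracle", "custom"),
    adppelt  = "custom",
    c("aic", "bic", "custom"))  # "pelt" and "opt"

  problems <- character(0)
  if (!is.null(cost) && !cost %in% ok.costs) {
    problems <- c(problems,
                  sprintf("cost '%s' is not listed for solver '%s'", cost, solver))
  }
  if (!is.null(penalty) && !penalty %in% ok.pens) {
    problems <- c(problems,
                  sprintf("penalty '%s' is not listed for solver '%s'", penalty, solver))
  }
  problems  # empty character vector when the combination looks valid
}

# Usage: an empty result means no documented rule is violated
check.cpd.combo(solver = "pelt", cost = "normal.mse", penalty = "aic")
check.cpd.combo(solver = "pruneddp", cost = "linear", penalty = "mbic")
```

Returning the list of problems instead of stopping at the first one makes it easier to report every mismatch at once. A caller that prefers a hard failure can wrap the result in stop().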
Value
Returns a list of two DataFrames.
Details
Change-point detection (CPD) methods aim to detect multiple abrupt changes, such as changes in mean,
variance or distribution, in observed time-series data.
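To make the roles of a cost function and a penalty concrete, here is a small base R sketch of a textbook optimal-partitioning search. It is an illustration only and not the PAL implementation. The function toy.cpd and its arguments pen and min.len are hypothetical names, and the cost used is the sum of squared deviations from the segment mean. Each extra segment costs pen, so a larger penalty yields fewer change-points.

```r
# Toy optimal-partitioning search (illustration only, not PAL's code).
# Segment cost = sum of squared deviations from the segment mean.
toy.cpd <- function(y, pen, min.len = 1) {
  n  <- length(y)
  s1 <- c(0, cumsum(y))     # prefix sums of y
  s2 <- c(0, cumsum(y^2))   # prefix sums of y^2

  # Cost of the segment y[(a + 1):b], computed from the prefix sums
  seg.cost <- function(a, b) {
    m <- b - a
    (s2[b + 1] - s2[a + 1]) - (s1[b + 1] - s1[a + 1])^2 / m
  }

  best <- c(-pen, rep(Inf, n))  # best[t + 1]: optimal penalized cost of y[1:t]
  prev <- integer(n + 1)        # prev[t + 1]: end of the segment before the last one
  for (t in seq_len(n)) {
    for (a in 0:(t - 1)) {
      if (t - a < min.len) next          # last segment would be too short
      if (!is.finite(best[a + 1])) next  # prefix y[1:a] has no valid split
      val <- best[a + 1] + seg.cost(a, t) + pen
      if (val < best[t + 1]) {
        best[t + 1] <- val
        prev[t + 1] <- a
      }
    }
  }

  # Walk back through the stored split points
  cps <- integer(0)
  t <- n
  while (t > 0) {
    a <- prev[t + 1]
    if (a > 0) cps <- c(a, cps)
    t <- a
  }
  cps  # index of the last observation before each detected change
}

# Usage: a series whose mean shifts from 0 to 5 after the 20th observation
y <- c(rep(0, 20), rep(5, 20))
toy.cpd(y, pen = 1)     # a small penalty allows the split
toy.cpd(y, pen = 1000)  # a large penalty suppresses it
```

The double loop examines every candidate start for the last segment, so the run time grows quadratically with the series length. Pruning strategies exist to avoid that, but they are left out here to keep the recursion readable.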
Examples
Call the function:
> res <- hanaml.CPD(data = df,
                    solver = "pelt",
                    cost = "normal.mse",
                    penalty = "aic",
                    lambda = 0.02)