Time series classification.

hanaml.ClassificationTS(
  data = NULL,
  label = NULL,
  key = NULL,
  classification.method = "LogisticRegression",
  transform.method = "MiniRocket",
  num.features = NULL,
  data.dim = NULL,
  random.seed = NULL
)

Arguments

data

DataFrame
Input data. With transform.method "MiniRocket" (univariate time series), each row represents one time series. With transform.method "MultiRocket" (multivariate time series), a fixed number of consecutive rows forms one time series; that number is set by the data.dim parameter of hanaml.ClassificationTS().

label

DataFrame, optional
Labels of the time series. Mandatory when classification.method is "LogisticRegression" and transform.method is "MiniRocket" or "MultiRocket".

key

character, optional
Name of the column in data that represents the order of the time series.
Defaults to the first column of data.

classification.method

character, optional
Classification method applied to the transformed features. "LogisticRegression" is the only option named on this page.
Defaults to "LogisticRegression".

transform.method

character, optional
The options are "MiniRocket" and "MultiRocket".
Defaults to "MiniRocket".

num.features

integer, optional
Number of transformed features for each time series.
Defaults to 9996 when transform.method is "MiniRocket" and to 49728 when transform.method is "MultiRocket".

data.dim

integer, optional
Dimensionality of the multivariate time series.
A value of 1 indicates a univariate time series; any larger value indicates a multivariate one. Cannot be smaller than 1.
Defaults to 1.

random.seed

integer, optional
Seed for random number generation. 0 means the machine time is used as the seed.
Defaults to 0.
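
The defaults and constraints listed above can be summarized in a small base R sketch. The helper resolve_tsc_args() is hypothetical and is not part of hanaml; it only restates the documented rules and does not reflect how the library itself processes its arguments.

```r
# Hypothetical helper (NOT part of hanaml): applies the defaults and
# constraints documented in the Arguments section.
resolve_tsc_args <- function(transform.method = "MiniRocket",
                             num.features = NULL,
                             data.dim = NULL) {
  # Only the two documented transform methods are accepted.
  transform.method <- match.arg(transform.method,
                                c("MiniRocket", "MultiRocket"))

  # data.dim defaults to 1 (univariate) and cannot be smaller than 1.
  if (is.null(data.dim)) data.dim <- 1L
  stopifnot(data.dim >= 1)

  # num.features defaults depend on the transform method.
  if (is.null(num.features)) {
    num.features <- if (transform.method == "MiniRocket") 9996L else 49728L
  }

  list(transform.method = transform.method,
       num.features = num.features,
       data.dim = data.dim)
}

uni   <- resolve_tsc_args()
multi <- resolve_tsc_args("MultiRocket", data.dim = 8)
```

Here uni holds the MiniRocket defaults and multi holds the MultiRocket defaults with data.dim set to 8; an unknown transform.method or a data.dim below 1 raises an error.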

Value

A "ClassificationTS" object with the following attributes:

  • model : DataFrame
    Trained model content.

  • statistics : DataFrame
    Names and values of statistics.

Examples

Example 1: Univariate time series fitted and transformed by MiniRocket
Input DataFrame df:


> df$Collect()
    RECORD_ID  VAL_1  VAL_2  VAL_3  VAL_4  VAL_5  VAL_6  ...  VAL_10  VAL_11  VAL_12  VAL_13  VAL_14  VAL_15  VAL_16
0           0  1.598  1.599  1.571  1.550  1.507  1.434  ...   1.117   1.024   0.926   0.828   0.739   0.643   0.556
1           1  1.701  1.671  1.619  1.547  1.475  1.391  ...   1.070   0.985   0.899   0.816   0.733   0.658   0.581
2           2  1.722  1.695  1.657  1.606  1.512  1.414  ...   1.015   0.920   0.828   0.740   0.658   0.586   0.501
3           3  1.726  1.660  1.573  1.496  1.409  1.332  ...   0.987   0.901   0.815   0.730   0.644   0.558   0.484
4           4  1.779  1.761  1.703  1.611  1.492  1.369  ...   0.900   0.786   0.679   0.580   0.502   0.415   0.333
5           5  1.800  1.743  1.686  1.633  1.532  1.423  ...   0.979   0.872   0.767   0.664   0.561   0.453   0.355
6           6  1.749  1.727  1.659  1.560  1.457  1.355  ...   0.961   0.864   0.771   0.682   0.595   0.513   0.427
7           7  1.348  1.237  1.129  1.022  0.939  0.847  ...   0.474   0.388   0.306   0.218   0.133   0.061   0.009
8           8  1.696  1.634  1.596  1.507  1.414  1.323  ...   1.048   0.966   0.890   0.805   0.719   0.632   0.553
9           9  1.723  1.713  1.665  1.587  1.495  1.404  ...   1.041   0.955   0.870   0.787   0.706   0.622   0.547
10         10  1.614  1.574  1.557  1.521  1.460  1.406  ...   1.045   0.957   0.862   0.771   0.681   0.587   0.497
11         11  1.652  1.665  1.656  1.623  1.571  1.499  ...   1.155   1.058   0.973   0.877   0.797   0.704   0.609

The label DataFrame label.df for the time series:


> label.df$Collect()
      DATA_ID LABEL
  0         0     A
  1         1     B
  2         2     C
  3         3     A
  4         4     B
  5         5     C
  6         6     A
  7         7     B
  8         8     C
  9         9     B
  10       10     C
  11       11     A

Invoke hanaml.ClassificationTS:


> tsc <- hanaml.ClassificationTS(data = df,
                                 label = label.df,
                                 classification.method = "LogisticRegression",
                                 transform.method = "MiniRocket",
                                 random.seed = 1)

Output:


> tsc$model$Collect()
> tsc$statistics$Collect()

Make a prediction:


> result <- predict(tsc, data=df)
> result$Collect()

Example 2: Multivariate time series (with dimensionality 8) fitted and transformed by MultiRocket
Input DataFrame df:


> df$Collect()
      RECORD_ID  VAL_1  VAL_2  VAL_3  VAL_4  VAL_5  VAL_6  ...  VAL_10  VAL_11  VAL_12  VAL_13  VAL_14  VAL_15  VAL_16
  0           0  1.645  1.646  1.621  1.585  1.540  1.470  ...   1.161   1.070   0.980   0.893   0.798   0.705   0.620
  1           1  1.704  1.705  1.706  1.680  1.632  1.560  ...   1.186   1.090   0.994   0.895   0.799   0.702   0.605
  2           2  1.699  1.666  1.621  1.538  1.454  1.357  ...   0.979   0.885   0.793   0.706   0.623   0.541   0.460
  3           3  1.709  1.663  1.580  1.497  1.413  1.330  ...   0.997   0.913   0.831   0.748   0.665   0.582   0.509
  4           4  1.687  1.688  1.674  1.619  1.531  1.439  ...   1.069   0.977   0.900   0.810   0.722   0.644   0.557
  ......
  27         27  1.697  1.665  1.590  1.508  1.424  1.341  ...   1.009   0.926   0.844   0.760   0.678   0.595   0.513
  28         28  1.406  1.320  1.234  1.148  1.063  0.978  ...   0.642   0.558   0.477   0.396   0.314   0.234   0.153
  29         29  1.592  1.593  1.571  1.551  1.527  1.475  ...   1.160   1.058   0.956   0.859   0.763   0.668   0.574
  30         30  1.688  1.648  1.570  1.490  1.408  1.327  ...   1.011   0.930   0.849   0.768   0.687   0.606   0.524
  31         31  1.708  1.663  1.595  1.504  1.411  1.318  ...   0.951   0.861   0.794   0.704   0.614   0.529   0.446

The label DataFrame label.df for the time series:


> label.df$Collect()
     DATA_ID LABEL
  0        0     A
  1        1     B
  2        2     C
  3        3     A
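
With data.dim = 8, the 32 input rows form 4 time series of 8 consecutive rows each, which matches the 4 labels. A minimal base R sketch of that grouping follows. The helper series_index() is hypothetical and not part of hanaml; the zero-based series numbering and the requirement that the row count be a multiple of data.dim are assumptions made for illustration.

```r
# Hypothetical helper (NOT part of hanaml): map each row of a stacked
# multivariate table to the time series it belongs to.
series_index <- function(n.rows, data.dim) {
  # Assumption: the row count is a whole multiple of data.dim.
  stopifnot(data.dim >= 1, n.rows %% data.dim == 0)
  # Rows 1..data.dim form series 0, the next data.dim rows form series 1, ...
  (seq_len(n.rows) - 1) %/% data.dim
}

idx <- series_index(n.rows = 32, data.dim = 8)
n.series <- length(unique(idx))   # 4 time series
rows.per.series <- table(idx)     # 8 rows in each series
```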

Invoke hanaml.ClassificationTS:


> tscm <- hanaml.ClassificationTS(data = df,
                                  label = label.df,
                                  classification.method = "LogisticRegression",
                                  transform.method = "MultiRocket",
                                  data.dim = 8,
                                  random.seed = 1)

Output:


> tscm$model$Collect()
> tscm$statistics$Collect()

Make a prediction:


> result <- predict(tscm, data = df)
> result$Collect()

See also