TimeSeriesClassification

class hana_ml.algorithms.pal.tsa.classification.TimeSeriesClassification(classification_method='LogisticRegression', transform_method='MiniRocket', **kwargs)

Time series classification.

Parameters:

classification_method : str, optional

The only option is "LogisticRegression".

Defaults to "LogisticRegression".

transform_method : str, optional

The options are "MiniRocket" and "MultiRocket".

Defaults to "MiniRocket".

**kwargs : keyword arguments

Arbitrary keyword arguments. Refer to the corresponding algorithm for the supported key-value pairs.

For "MiniRocket"/"MultiRocket":

num_features : int, optional

Number of transformed features for each time series.

Defaults to 9996 when transform_method is "MiniRocket" and 49728 when transform_method is "MultiRocket".

data_dim : int, optional

Dimensionality of the multivariate time series.

A value of 1 means a univariate time series; larger values mean a multivariate one. Cannot be smaller than 1.

Defaults to 1.

random_seed : int, optional

Seed for random number generation. 0 indicates that the machine time is used as the seed.

Defaults to 0.
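The documented defaults for these keyword arguments can be summarized in a small sketch. The helper name rocket_kwargs and the DEFAULT_NUM_FEATURES table are hypothetical and not part of hana_ml; they only restate the defaults and the data_dim constraint listed above.

```python
# Hypothetical helper, not part of hana_ml: collects the documented defaults
# for the MiniRocket/MultiRocket keyword arguments.
DEFAULT_NUM_FEATURES = {"MiniRocket": 9996, "MultiRocket": 49728}


def rocket_kwargs(transform_method="MiniRocket", **overrides):
    """Return the keyword arguments with the documented defaults filled in."""
    if transform_method not in DEFAULT_NUM_FEATURES:
        raise ValueError(f"unsupported transform_method: {transform_method!r}")
    kwargs = {
        "num_features": DEFAULT_NUM_FEATURES[transform_method],
        "data_dim": 1,     # 1 means a univariate time series
        "random_seed": 0,  # 0 means the machine time is used as the seed
    }
    unknown = set(overrides) - set(kwargs)
    if unknown:
        raise ValueError(f"unknown keyword arguments: {sorted(unknown)}")
    kwargs.update(overrides)
    if kwargs["data_dim"] < 1:
        raise ValueError("data_dim cannot be smaller than 1")
    return kwargs


print(rocket_kwargs("MultiRocket", data_dim=8, random_seed=1))
```

Given the constructor signature shown at the top of this page, the resulting dictionary could then be unpacked into the call, for example TimeSeriesClassification(transform_method="MultiRocket", **kwargs).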

Examples

Example 1: Univariate time series fitted and transformed by MiniRocket

The input DataFrame is df:

>>> df.collect()
    RECORD_ID  VAL_1  VAL_2  VAL_3  VAL_4  VAL_5  VAL_6  ...  VAL_10  VAL_11  VAL_12  VAL_13  VAL_14  VAL_15  VAL_16
         0  1.598  1.599  1.571  1.550  1.507  1.434  ...   1.117   1.024   0.926   0.828   0.739   0.643   0.556
         1  1.701  1.671  1.619  1.547  1.475  1.391  ...   1.070   0.985   0.899   0.816   0.733   0.658   0.581
         2  1.722  1.695  1.657  1.606  1.512  1.414  ...   1.015   0.920   0.828   0.740   0.658   0.586   0.501
         3  1.726  1.660  1.573  1.496  1.409  1.332  ...   0.987   0.901   0.815   0.730   0.644   0.558   0.484
         4  1.779  1.761  1.703  1.611  1.492  1.369  ...   0.900   0.786   0.679   0.580   0.502   0.415   0.333
         5  1.800  1.743  1.686  1.633  1.532  1.423  ...   0.979   0.872   0.767   0.664   0.561   0.453   0.355
         6  1.749  1.727  1.659  1.560  1.457  1.355  ...   0.961   0.864   0.771   0.682   0.595   0.513   0.427
         7  1.348  1.237  1.129  1.022  0.939  0.847  ...   0.474   0.388   0.306   0.218   0.133   0.061   0.009
         8  1.696  1.634  1.596  1.507  1.414  1.323  ...   1.048   0.966   0.890   0.805   0.719   0.632   0.553
         9  1.723  1.713  1.665  1.587  1.495  1.404  ...   1.041   0.955   0.870   0.787   0.706   0.622   0.547
       10  1.614  1.574  1.557  1.521  1.460  1.406  ...   1.045   0.957   0.862   0.771   0.681   0.587   0.497
       11  1.652  1.665  1.656  1.623  1.571  1.499  ...   1.155   1.058   0.973   0.877   0.797   0.704   0.609

The DataFrame holding the labels of the time series:

>>> label_df.collect()
    DATA_ID LABEL
       0     A
       1     B
       2     C
       3     A
       4     B
       5     C
       6     A
       7     B
       8     C
       9     B
     10     C
     11     A

Create an instance of TimeSeriesClassification:

>>> from hana_ml.algorithms.pal.tsa.classification import TimeSeriesClassification
>>> tsc = TimeSeriesClassification(classification_method="LogisticRegression",
...                                transform_method="MiniRocket",
...                                random_seed=1)

Perform fit() on the given DataFrame:

>>> tsc.fit(data=df, label=label_df)

Output:

>>> tsc.model_.collect()
      ID                                      MODEL_CONTENT
   -1                                         MiniRocket
    0  {"SERIES_LENGTH":16,"NUM_CHANNELS":1,"BIAS_SIZ...
    1  3005121315,1.685720499622002,2.819106917236017...
    2  00610192183,1.4931236298379538,-4.462113103585...
    3  9374860881,-6.2434692203217339,0.6595998500205...
..   ...                                                ...
116  0.0,-0.17856812090682684,0.05522285367663122,0...
117  0800557345022,0.3662488788249087,0.0,-0.062115...
118  41608,0.0,0.0,0.0,0.0,0.06350326307975242,0.77...
119  53600703,0.0,-0.7431206589182244,0.72227213245...
120  90682684,0.05522285367663122,0.0,0.0,0.0,0.0,0...
>>> tsc.statistics_.collect()
                   STAT_NAME   STAT_VALUE
MINIROCKET_TRANSFORM_TIME       0.010s
            TRAINING_TIME       0.043s
        TRAINING_ACCURACY            1
             TRAINING_OBJ  6.45594e-14
            TRAINING_ITER           56

Make a prediction:

>>> result = tsc.predict(data=df)
>>> result.collect()
    ID CLASS  PROBABILITY
  0     A          1.0
  1     B          1.0
  2     C          1.0
  3     A          1.0
  4     B          1.0
  5     C          1.0
  6     A          1.0
  7     B          1.0
  8     C          1.0
  9     B          1.0
10     C          1.0
11     A          1.0
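As a rough cross-check of the TRAINING_ACCURACY value above, the predicted classes can be compared with the labels by ID. The sketch below copies the values from the two tables in this example into plain dictionaries; the accuracy helper is hypothetical and not part of hana_ml. With a live connection, the same values would presumably come from label_df.collect() and result.collect().

```python
# Values copied from label_df (DATA_ID -> LABEL) and from the predict()
# result (ID -> CLASS) shown in Example 1.
labels = {0: "A", 1: "B", 2: "C", 3: "A", 4: "B", 5: "C",
          6: "A", 7: "B", 8: "C", 9: "B", 10: "C", 11: "A"}
predicted = {0: "A", 1: "B", 2: "C", 3: "A", 4: "B", 5: "C",
             6: "A", 7: "B", 8: "C", 9: "B", 10: "C", 11: "A"}


def accuracy(truth, predictions):
    """Fraction of predicted series whose class matches the known label."""
    if not predictions:
        raise ValueError("no predictions to score")
    hits = sum(1 for series_id, cls in predictions.items()
               if truth.get(series_id) == cls)
    return hits / len(predictions)


print(accuracy(labels, predicted))  # 1.0, matching TRAINING_ACCURACY
```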

Example 2: Multivariate time series (with dimensionality 8) fitted and transformed by MultiRocket

The input DataFrame is df:

>>> df.collect()
    RECORD_ID  VAL_1  VAL_2  VAL_3  VAL_4  VAL_5  VAL_6  ...  VAL_10  VAL_11  VAL_12  VAL_13  VAL_14  VAL_15  VAL_16
         0  1.645  1.646  1.621  1.585  1.540  1.470  ...   1.161   1.070   0.980   0.893   0.798   0.705   0.620
         1  1.704  1.705  1.706  1.680  1.632  1.560  ...   1.186   1.090   0.994   0.895   0.799   0.702   0.605
         2  1.699  1.666  1.621  1.538  1.454  1.357  ...   0.979   0.885   0.793   0.706   0.623   0.541   0.460
         3  1.709  1.663  1.580  1.497  1.413  1.330  ...   0.997   0.913   0.831   0.748   0.665   0.582   0.509
         4  1.687  1.688  1.674  1.619  1.531  1.439  ...   1.069   0.977   0.900   0.810   0.722   0.644   0.557
......
       27  1.697  1.665  1.590  1.508  1.424  1.341  ...   1.009   0.926   0.844   0.760   0.678   0.595   0.513
       28  1.406  1.320  1.234  1.148  1.063  0.978  ...   0.642   0.558   0.477   0.396   0.314   0.234   0.153
       29  1.592  1.593  1.571  1.551  1.527  1.475  ...   1.160   1.058   0.956   0.859   0.763   0.668   0.574
       30  1.688  1.648  1.570  1.490  1.408  1.327  ...   1.011   0.930   0.849   0.768   0.687   0.606   0.524
       31  1.708  1.663  1.595  1.504  1.411  1.318  ...   0.951   0.861   0.794   0.704   0.614   0.529   0.446

The DataFrame holding the labels of the time series:

>>> label_df.collect()
   DATA_ID LABEL
0        0     A
1        1     B
2        2     C
3        3     A

Create an instance of TimeSeriesClassification:

>>> tscm = TimeSeriesClassification(classification_method="LogisticRegression",
...                                 transform_method="MultiRocket",
...                                 data_dim=8,
...                                 random_seed=1)

Perform fit() on the given DataFrame:

>>> tscm.fit(data=df, label=label_df)

Output:

>>> tscm.model_.collect()
      ID                                      MODEL_CONTENT
   -1                                        MultiRocket
    0  {"SERIES_LENGTH":16,"NUM_CHANNELS":8,"BIAS_SIZ...
    1  HANNELS":[6]},{"ID":77,"CHANNELS":[1,4,7,6,5]}...
    2  340878522815215,7.959895076819708,5.8147048859...
    3  944001223,-18.05915327183857,9.197905923784694...
..   ...                                                ...
532  .007475253911975343,0.0,-0.0004262211051778646...
533  4830497,-0.000050048412881394739,0.0,0.0000110...
534  -0.00244554242820342,0.0037744973091916455,0.0...
535  97,0.0,-0.00013759337618238812,0.0000837962572...
536  99,-0.000006703700582543153,0.0,-0.00132556403...

>>> tscm.statistics_.collect()
                    STAT_NAME   STAT_VALUE
MULTIROCKET_TRANSFORM_TIME       0.005s
             TRAINING_TIME       0.147s
         TRAINING_ACCURACY            1
              TRAINING_OBJ  7.96585e-14
             TRAINING_ITER           48

Make a prediction:

>>> result = tscm.predict(data=df)
>>> result.collect()
   ID CLASS  PROBABILITY
0   0     A          1.0
1   1     B          1.0
2   2     C          1.0
3   3     A          1.0

Attributes:

model_ (DataFrame): Trained model content.
statistics_ (DataFrame): Names and values of statistics.
forecast_ (DataFrame): Forecast values.

Methods

`fit`(data[, label, key, thread_ratio])	Trains a time series classification model with given time series and labels.
`predict`(data[, key, thread_ratio])	Predicts the classes of given time series.

fit(data, label=None, key=None, thread_ratio=None)

Trains a time series classification model with given time series and labels.

Parameters:

data : DataFrame

Input data. When transform_method="MiniRocket", for univariate time series, each row represents one time series. When transform_method="MultiRocket", for multivariate time series, a fixed number of consecutive rows forms one time series; that number is set by the data_dim parameter when initializing a TimeSeriesClassification instance.

label : DataFrame, optional

The labels of the time series. If classification_method is "LogisticRegression" and transform_method is "MiniRocket" or "MultiRocket", label is mandatory.

key : str, optional

The ID column.

Defaults to the first column of data if the index column of data is not provided. Otherwise, defaults to the index column of data.

thread_ratio : float, optional

Controls the proportion of available threads to use.

0: single thread.
0~1: percentage.
Others: heuristically determined.

Defaults to 1.0.

property fit_hdbprocedure: Returns the generated hdbprocedure for fit.
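The row layout that fit() expects for multivariate data can be sketched in plain Python. The helpers flatten_series and series_index are hypothetical and not part of hana_ml. The sketch assumes that record IDs are consecutive integers starting at 0, as in Example 2, and that the channels of one series appear in a fixed order.

```python
def flatten_series(series, data_dim):
    """Lay out multivariate series as rows: data_dim consecutive rows per series."""
    rows = []
    for channels in series:
        if len(channels) != data_dim:
            raise ValueError("every series needs exactly data_dim channels")
        for channel in channels:
            # One row per channel: RECORD_ID followed by VAL_1 .. VAL_n
            rows.append([len(rows)] + list(channel))
    return rows


def series_index(record_id, data_dim):
    """Index of the series a row belongs to (assumes IDs start at 0)."""
    return record_id // data_dim


# Toy data: 4 series, 8 channels each, 16 time steps per channel
series = [[[float(s + c + t) for t in range(16)] for c in range(8)]
          for s in range(4)]
rows = flatten_series(series, data_dim=8)
print(len(rows), len(rows[0]))  # 32 17
print(series_index(27, 8))      # 3
```

This matches the shape of Example 2, where 32 rows with data_dim=8 correspond to 4 labelled time series.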

predict(data, key=None, thread_ratio=None)

Predicts the classes of given time series.

Parameters:

data : DataFrame

Input data.

For univariate time series, each row represents one time series, while for multivariate time series, a fixed number of consecutive rows forms one time series, and that number is designated by the parameter data_dim when initializing a TimeSeriesClassification instance.

key : str, optional

The ID column.

Defaults to the first column of data if the index column of data is not provided. Otherwise, defaults to the index column of data.

thread_ratio : float, optional

Controls the proportion of available threads to use.

0: single thread.
0~1: percentage.
Others: heuristically determined.

Defaults to 1.0.

Returns:

DataFrame: Prediction.

property predict_hdbprocedure: Returns the generated hdbprocedure for predict.
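The thread_ratio values documented for fit() and predict() can be read as the following sketch. The function describe_thread_ratio is hypothetical and only restates the three documented cases; treating 1.0 as part of the percentage range is an assumption based on the documented default.

```python
def describe_thread_ratio(thread_ratio):
    """Describe how a thread_ratio value is interpreted, per the list above."""
    if thread_ratio == 0:
        return "single thread"
    if 0 < thread_ratio <= 1:  # assumption: the upper bound 1 is inclusive
        return f"{thread_ratio:.0%} of available threads"
    return "heuristically determined"


for value in (0, 0.5, 1.0, -1):
    print(value, "->", describe_thread_ratio(value))
```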

Inherited Methods from PALBase

Besides the methods described above, the TimeSeriesClassification class also inherits methods from the PALBase class. Refer to PAL Base for more details.