fast_dtw
- hana_ml.algorithms.pal.tsa.fast_dtw.fast_dtw(data, radius, thread_ratio=None, distance_method=None, minkowski_power=None, save_alignment=None)
DTW stands for Dynamic Time Warping, a method for calculating the distance or similarity between two time series. Fast DTW is a modified version of DTW that accelerates the computation when the time series are very large. It recursively reduces the size of the time series, calculates the DTW path on the reduced version, and then refines that path on the original series. It may lose some accuracy relative to the exact DTW distance in exchange for faster computation.
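The coarsen, solve, refine recursion described above can be sketched in plain Python. This is a minimal illustration for 1-D series with absolute difference as the point distance. It is not the PAL implementation, and the helper names (`dtw`, `halve`, `expand_window`, `fast_dtw_sketch`) are hypothetical.

```python
from collections import defaultdict


def dtw(x, y, window=None):
    """DTW distance and warping path, restricted to the cells in `window`.

    `window` is a list of 0-based (i, j) cells in row-major order;
    None means the full grid, i.e. exact DTW.
    """
    n, m = len(x), len(y)
    if window is None:
        window = [(i, j) for i in range(n) for j in range(m)]
    # cost[(i, j)] = (cumulative distance, previous i, previous j), 1-based cells
    cost = defaultdict(lambda: (float("inf"), 0, 0))
    cost[0, 0] = (0.0, 0, 0)
    for i, j in window:
        d = abs(x[i] - y[j])
        i1, j1 = i + 1, j + 1
        cost[i1, j1] = min(
            (cost[i, j][0] + d, i, j),      # diagonal step
            (cost[i, j1][0] + d, i, j1),    # step from the previous row
            (cost[i1, j][0] + d, i1, j),    # step from the previous column
            key=lambda c: c[0],
        )
    # Walk the stored predecessors back from the last cell to recover the path.
    path = []
    i, j = n, m
    while (i, j) != (0, 0):
        path.append((i - 1, j - 1))
        i, j = cost[i, j][1], cost[i, j][2]
    path.reverse()
    return cost[n, m][0], path


def halve(x):
    """Coarsen a series by averaging adjacent pairs (a trailing odd item is dropped)."""
    return [(x[k] + x[k + 1]) / 2 for k in range(0, len(x) - 1, 2)]


def expand_window(path, n, m, radius):
    """Project a coarse path onto the finer grid and widen it by `radius` coarse cells."""
    coarse = set()
    for i, j in path:
        for di in range(-radius, radius + 1):
            for dj in range(-radius, radius + 1):
                coarse.add((i + di, j + dj))
    fine = set()
    for i, j in coarse:
        for a in (2 * i, 2 * i + 1):
            for b in (2 * j, 2 * j + 1):
                if 0 <= a < n and 0 <= b < m:
                    fine.add((a, b))
    return sorted(fine)  # row-major order, as dtw() expects


def fast_dtw_sketch(x, y, radius=1):
    """Approximate DTW: solve a half-size problem, then refine near its path."""
    if radius < 1:
        raise ValueError("radius must be positive")
    if len(x) < radius + 2 or len(y) < radius + 2:
        return dtw(x, y)  # small enough to solve exactly
    _, coarse_path = fast_dtw_sketch(halve(x), halve(y), radius)
    return dtw(x, y, expand_window(coarse_path, len(x), len(y), radius))
```

Because the refined search only visits cells near the projected coarse path, the distance it returns can never be smaller than the exact DTW distance. A larger `radius` widens the search band, which is the accuracy versus runtime trade-off the `radius` parameter controls.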
- Parameters
- data : DataFrame
Input data, expected to be structured as follows:
ID for multiple time series
Timestamps
Attributes of time series
- radius : int
Parameter of the fast DTW algorithm that balances accuracy against runtime. A larger radius gives a more accurate result but runs slower. Must be positive.
- thread_ratio : float, optional
Controls the proportion of available threads to use.
0: single thread.
Greater than 0 and up to 1: that fraction of the available threads.
Others: the number of threads is determined heuristically.
Defaults to -1.
- distance_method : {'manhattan', 'euclidean', 'minkowski', 'chebyshev', 'cosine'}, optional
Specifies the method to compute the distance between two points.
'manhattan': Manhattan distance
'euclidean': Euclidean distance
'minkowski': Minkowski distance
'chebyshev': Chebyshev distance
'cosine': Cosine distance
Defaults to 'euclidean'.
- minkowski_power : double, optional
Specifies the power of the Minkowski distance method.
Only valid when distance_method is 'minkowski'.
Defaults to 3.
- save_alignment : bool, optional
Specifies whether to output alignment information. If True, the alignment table is output.
Defaults to False.
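For reference, the five point-to-point distances listed under distance_method can be written out with their textbook definitions. This is a sketch only: the function names are hypothetical, and treating cosine distance as one minus cosine similarity is an assumption about the convention PAL uses.

```python
import math


def manhattan(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(p - q) for p, q in zip(a, b))


def euclidean(a, b):
    # Square root of the sum of squared coordinate differences.
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def minkowski(a, b, power=3):
    # Generalizes Manhattan (power=1) and Euclidean (power=2).
    return sum(abs(p - q) ** power for p, q in zip(a, b)) ** (1.0 / power)


def chebyshev(a, b):
    # Largest absolute coordinate difference.
    return max(abs(p - q) for p, q in zip(a, b))


def cosine(a, b):
    # Assumed convention: 1 - cosine similarity. Undefined for zero vectors.
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return 1.0 - dot / (norm_a * norm_b)
```

Each function takes two points of equal dimension, that is, the attribute values of two time series at one timestamp each. The `power` argument plays the role of minkowski_power.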
- Returns
- DataFrame
- Result of fast DTW, structured as follows:
LEFT_<ID column name of input table>: ID of one time series.
RIGHT_<ID column name of input table>: ID of the other time series.
DISTANCE: DTW distance between the two time series.
- Alignment table, structured as follows:
LEFT_<ID column name of input table>: ID of one time series.
RIGHT_<ID column name of input table>: ID of the other time series.
LEFT_INDEX: Index of the timestamp in the time series whose ID is in the 1st column.
RIGHT_INDEX: Index of the timestamp in the time series whose ID is in the 2nd column.
- Statistics for time series, structured as follows:
STAT_NAME: Statistics name.
STAT_VALUE: Statistics value.
Examples
>>> from hana_ml.algorithms.pal.tsa.fast_dtw import fast_dtw
>>> result, align, stats = fast_dtw(data, radius=5)
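To make the expected layout of `data` concrete, the following sketch builds a few rows in the ID, timestamp, attribute order described under Parameters and groups them into one series per ID. The column order shown, the IDs and the values are hypothetical, and the sketch runs locally without a HANA connection; it only illustrates how the rows map to the time series that are compared.

```python
from collections import defaultdict

# Hypothetical rows in (ID, TIMESTAMP, VALUE) order; values are illustrative only.
rows = [
    ("A", 1, 0.0), ("A", 2, 1.0), ("A", 3, 2.0),
    ("B", 1, 0.0), ("B", 2, 0.0), ("B", 3, 1.0), ("B", 4, 2.0),
]


def group_series(rows):
    """Collect attribute values per ID, ordered by timestamp."""
    series = defaultdict(list)
    for series_id, timestamp, value in sorted(rows, key=lambda r: (r[0], r[1])):
        series[series_id].append(value)
    return dict(series)


series = group_series(rows)
```

The two series here have different lengths, which DTW handles by design: the warping path may map one timestamp in a series to several timestamps in the other.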