fast_dtw
- hana_ml.algorithms.pal.tsa.fast_dtw.fast_dtw(data, radius, thread_ratio=None, distance_method=None, minkowski_power=None, save_alignment=None)
Dynamic time warping (DTW) calculates the distance or similarity between two time series. DTW stretches or compresses one or both time series so that they match each other as closely as possible, and it provides the optimal match between the two sequences subject to certain constraints and rules. Fast DTW is a modified version of DTW that accelerates the computation when the time series are very long. It recursively reduces the size of the time series, calculates the DTW path on the reduced version, and then refines that path on the original series. In exchange for the speedup, it may lose some accuracy relative to the exact DTW distance.
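To make the notions of DTW distance and warping path concrete, here is a minimal pure-Python sketch of plain (exact) DTW for two scalar sequences. It is an illustration only: it is not the PAL implementation, it does not perform the recursive coarsening and refinement of Fast DTW, and the helper name `dtw` is hypothetical. It assumes the absolute difference as the point distance and reports 0-based index pairs.

```python
def dtw(a, b):
    """Plain DTW between two 1-D sequences (illustrative helper, not PAL).

    Returns (distance, path), where path is a list of 0-based (i, j)
    index pairs matching a[i] with b[j].
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # assumed point distance
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both

    # Backtrack from the end of both series to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return cost[n][m], path


distance, path = dtw([1, 2, 3], [1, 2, 2, 3])
print(distance)  # 0.0
print(path)      # [(0, 0), (1, 1), (1, 2), (2, 3)]
```

In this example the second series repeats the value 2, so the path matches index 1 of the first series with indices 1 and 2 of the second, and the distance is zero. Exact DTW fills the full n-by-m cost table, which is the cost that Fast DTW aims to reduce by restricting the search around a path found on a coarser version of the series.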
- Parameters:
- data : DataFrame
Input data, expected to be structured as follows:
ID for multiple time series.
Timestamps.
Attributes of time series.
- radius : int
Balances the accuracy and run time of DTW. A larger value gives higher accuracy at the cost of a longer run time. Must be positive.
- thread_ratio : float, optional
Adjusts the percentage of available threads to use, from 0 to 1. A value of 0 indicates the use of a single thread, while 1 implies the use of all currently available threads. Values outside this range are ignored, and the function determines the number of threads heuristically.
Defaults to -1.
- distance_method : {'manhattan', 'euclidean', 'minkowski', 'chebyshev', 'cosine'}, optional
Specifies the method to compute the distance between two points.
'manhattan': Manhattan distance
'euclidean': Euclidean distance
'minkowski': Minkowski distance
'chebyshev': Chebyshev distance
'cosine': Cosine distance
Defaults to 'euclidean'.
- minkowski_power : double, optional
Specifies the power of the Minkowski distance method.
Only valid when distance_method is 'minkowski'.
Defaults to 3.
- save_alignment : bool, optional
Specifies whether to output alignment information. If True, the alignment table is output.
Defaults to False.
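The distance_method options listed above correspond to standard point-to-point distances. The sketch below shows their textbook definitions in pure Python. The helper name `point_distance` and its `power` argument are hypothetical, and the cosine variant assumes the common definition of 1 minus cosine similarity; PAL's internal definitions may differ in detail.

```python
import math


def point_distance(x, y, method="euclidean", power=3.0):
    """Textbook distance between two equal-length points (illustrative only)."""
    diffs = [abs(p - q) for p, q in zip(x, y)]
    if method == "manhattan":
        return float(sum(diffs))
    if method == "euclidean":
        return math.sqrt(sum(d * d for d in diffs))
    if method == "minkowski":
        # 'power' plays the role described for minkowski_power
        return sum(d ** power for d in diffs) ** (1.0 / power)
    if method == "chebyshev":
        return float(max(diffs))
    if method == "cosine":
        # Assumed definition: 1 - cosine similarity (undefined for zero vectors)
        dot = sum(p * q for p, q in zip(x, y))
        norm = math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in y))
        return 1.0 - dot / norm
    raise ValueError("unknown method: " + repr(method))


print(point_distance((0, 0), (3, 4), "manhattan"))  # 7.0
print(point_distance((0, 0), (3, 4), "euclidean"))  # 5.0
print(point_distance((0, 0), (3, 4), "chebyshev"))  # 4.0
```

With a power of 1 the Minkowski distance reduces to the Manhattan distance, and with a power of 2 it reduces to the Euclidean distance.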
- Returns:
- DataFrames
- DataFrame 1: Result, structured as follows:
LEFT_<ID column name of input table>: ID of one time series.
RIGHT_<ID column name of input table>: ID of the other time series.
DISTANCE: DTW distance between the two time series.
- DataFrame 2: Alignment table, structured as follows:
LEFT_<ID column name of input table>: ID of one time series.
RIGHT_<ID column name of input table>: ID of the other time series.
LEFT_INDEX: Index of the timestamp in the time series whose ID is in the 1st column.
RIGHT_INDEX: Index of the timestamp in the time series whose ID is in the 2nd column.
- DataFrame 3: Statistics.
Examples
>>> from hana_ml.algorithms.pal.tsa.fast_dtw import fast_dtw
>>> result, align, stats = fast_dtw(data=df, radius=5)