accuracy_measure
- hana_ml.algorithms.pal.tsa.accuracy_measure.accuracy_measure(data, evaluation_metric=None, ignore_zero=None, alpha1=None, alpha2=None, massive=False, group_params=None)
Evaluates the forecast accuracy using measures such as:
'mpe': mean percentage error (MPE)
'mse': mean square error (MSE)
'rmse': root mean square error (RMSE)
'et': error total (ET)
'mad': mean absolute deviation (MAD)
'mase': out-of-sample mean absolute scaled error (MASE)
'wmape': weighted mean absolute percentage error (WMAPE)
'smape': symmetric mean absolute percentage error (SMAPE)
'mape': mean absolute percentage error (MAPE)
'spec': stock-keeping-oriented prediction error costs (SPEC)
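As a rough illustration of what several of these measures compute, the sketch below uses common textbook definitions in plain Python. It is not PAL's implementation: the helper name `accuracy_metrics`, the sign convention (error = actual minus forecast), and the sample values are assumptions made for this example, and PAL's exact formulas may differ.

```python
import math


def accuracy_metrics(actual, forecast):
    """Textbook versions of a few measures; error = actual - forecast (assumed sign)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equally long")
    n = len(actual)
    errors = [a - f for a, f in zip(actual, forecast)]
    mse = sum(e * e for e in errors) / n
    return {
        "et": sum(errors),                       # error total
        "mad": sum(abs(e) for e in errors) / n,  # mean absolute deviation
        "mse": mse,
        "rmse": math.sqrt(mse),
        # The percentage measures divide by the actual value,
        # so this sketch assumes there are no zero actuals.
        "mpe": sum(e / a for e, a in zip(errors, actual)) / n,
        "mape": sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        "wmape": sum(abs(e) for e in errors) / sum(abs(a) for a in actual),
    }


# Made-up values, for illustration only.
metrics = accuracy_metrics([100.0, 200.0], [110.0, 190.0])
```

Here the two errors cancel in `et` but not in `mad` or `mse`, which is the usual reason to report more than one measure.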
- Parameters:
- data : DataFrame
Input data.
In single mode:
If data contains 2 columns:
1st column : actual data.
2nd column : forecasted data.
If data contains 3 columns:
1st column : ID.
2nd column : actual data.
3rd column : forecasted data.
In massive mode (when massive is True):
If data contains 3 columns:
1st column : Group ID.
2nd column : actual data.
3rd column : forecasted data.
If data contains 4 columns:
1st column : Group ID.
2nd column : ID.
3rd column : actual data.
4th column : forecasted data.
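Because columns are interpreted by position, a 3-column input means different things in single and massive mode. The sketch below encodes the layouts listed above as a lookup. `column_roles` is a hypothetical helper written for this illustration; it is not part of hana_ml.

```python
def column_roles(n_columns, massive=False):
    """Map a column count to the positional roles listed above."""
    values = ["actual", "forecast"]
    single = {2: values, 3: ["id"] + values}
    grouped = {3: ["group_id"] + values, 4: ["group_id", "id"] + values}
    layouts = grouped if massive else single
    if n_columns not in layouts:
        mode = "massive" if massive else "single"
        raise ValueError(
            f"{n_columns} columns is not a documented layout in {mode} mode"
        )
    return layouts[n_columns]


# The same column count resolves differently depending on the mode.
single_roles = column_roles(3)
massive_roles = column_roles(3, massive=True)
```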
- evaluation_metric : str or a list of str
Specifies the accuracy measure name(s), with valid options listed as follows:
'mpe': mean percentage error (MPE)
'mse': mean square error (MSE)
'rmse': root mean square error (RMSE)
'et': error total (ET)
'mad': mean absolute deviation (MAD)
'mase': out-of-sample mean absolute scaled error (MASE)
'wmape': weighted mean absolute percentage error (WMAPE)
'smape': symmetric mean absolute percentage error (SMAPE)
'mape': mean absolute percentage error (MAPE)
'spec': stock-keeping-oriented prediction error costs (SPEC)
Note
In single mode, if evaluation_metric is 'spec' or contains 'spec' as one of its elements, then data must have 3 columns (i.e. contain an ID column). In massive mode, similarly, data must have 4 columns (i.e. contain a Group ID column and an ID column).
- ignore_zero : bool, optional
Specifies whether or not to ignore zero values in data when calculating MPE or MAPE. Valid only when 'mpe' or 'mape' is specified/included in evaluation_metric.
Defaults to False, i.e. use the zero values in data when calculating MPE or MAPE.
- alpha1 : float, optional
Specifies the unit opportunity cost parameter in the SPEC measure; should be no less than 0. Valid only when 'spec' is specified/included in evaluation_metric.
Defaults to 0.5.
- alpha2 : float, optional
Specifies the unit stock-keeping cost parameter in the SPEC measure; should be no less than 0. Valid only when 'spec' is specified/included in evaluation_metric.
Defaults to 0.5.
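To make the roles of alpha1 and alpha2 concrete, the sketch below implements SPEC as it is usually written in the forecasting literature: for each period, unmet demand is charged at alpha1 and early stock at alpha2, weighted by how long the gap persists. Treat the formula as an assumption for illustration; this page does not state PAL's exact definition, and the function name `spec` is local to this example.

```python
from itertools import accumulate


def spec(actual, forecast, alpha1=0.5, alpha2=0.5):
    """SPEC per the commonly published formula (an assumption, not PAL's code)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and equally long")
    if alpha1 < 0 or alpha2 < 0:
        raise ValueError("alpha1 and alpha2 should be no less than 0")
    n = len(actual)
    cum_actual = list(accumulate(actual))
    cum_forecast = list(accumulate(forecast))
    total = 0.0
    for t in range(n):
        for i in range(t + 1):
            # Demand up to period i that the forecasts up to period t did not cover.
            opportunity = min(actual[i], cum_actual[i] - cum_forecast[t]) * alpha1
            # Forecast up to period i that the demand up to period t did not consume.
            stock = min(forecast[i], cum_forecast[i] - cum_actual[t]) * alpha2
            total += max(0.0, opportunity, stock) * (t - i + 1)
    return total / n


# Made-up series: the forecast arrives one period early, then one period late.
early = spec([0.0, 1.0], [1.0, 0.0])
late_opportunity_only = spec([1.0, 0.0], [0.0, 1.0], alpha1=1.0, alpha2=0.0)
```

Under this formula the result depends on the order of the rows, which is consistent with the requirement that data carry an ID column when 'spec' is requested.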
- massive : bool, optional
Specifies whether or not to use massive mode.
True : massive mode.
False : single mode.
In massive mode, parameters can be set either through group_params or through the original parameters. Original parameters apply to all groups. However, if any parameter is defined for a group in group_params, none of the original parameter settings apply to that group.
For example, if alpha1 and evaluation_metric are set in group_params for Group_1, then alpha2 and evaluation_metric passed as original parameters are not applicable to Group_1.
Defaults to False.
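The precedence rule described above can be modeled in a few lines of plain Python. This is only a reading of the rule as worded here, not hana_ml's internal logic; `effective_params` is a hypothetical helper and the parameter values are made up.

```python
def effective_params(group_id, original, group_params=None):
    """Return the settings a group would see under the rule described above."""
    group_params = group_params or {}
    if group_params.get(group_id):
        # Any group-level entry switches off all original settings for that group.
        return dict(group_params[group_id])
    return dict(original)


original = {"evaluation_metric": "mse", "alpha2": 0.6}
group_params = {"Group_1": {"evaluation_metric": "spec", "alpha1": 0.7}}

group_1 = effective_params("Group_1", original, group_params)
group_2 = effective_params("Group_2", original, group_params)
```

In this model Group_1 does not inherit `alpha2=0.6`, so it would fall back to the documented default for alpha2, while Group_2 uses the original parameters unchanged.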
- group_params : dict, optional
If massive mode is activated (massive is True), input data for accuracy_measure is divided into different groups, and different parameters can be applied to each group. This parameter specifies the parameter values of the different groups as a dict, where the keys are group IDs and each value is a dict of parameter value assignments for that group.
Valid only when massive is True. Defaults to None.
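A call in massive mode might look like the sketch below. The import path and keyword names come from the signature at the top of this page; the group ID strings, the parameter values, and the wrapper `measure_per_group` are assumptions for illustration. Running the function requires a HANA connection and a DataFrame whose first column holds the group IDs.

```python
# Keys are group IDs; each value holds that group's own parameter assignments.
GROUP_PARAMS = {
    "Group_1": {"evaluation_metric": "spec", "alpha1": 0.7},
    "Group_2": {"evaluation_metric": ["mse", "rmse"]},
}


def measure_per_group(df):
    """Run accuracy_measure in massive mode on a hana_ml DataFrame."""
    # Imported here so the dict above can be inspected without hana_ml installed.
    from hana_ml.algorithms.pal.tsa.accuracy_measure import accuracy_measure

    # In massive mode the Returns section lists two DataFrames:
    # the measurement result and the error messages.
    result, error_msg = accuracy_measure(
        data=df,
        massive=True,
        evaluation_metric="mape",  # applies only to groups absent from GROUP_PARAMS
        group_params=GROUP_PARAMS,
    )
    return result, error_msg
```

Because Group_1 requests 'spec', the input in this sketch would need the 4-column layout (Group ID, ID, actual, forecast).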
- Returns:
- DataFrame 1
Result of the forecast accuracy measurement, structured as follows:
STAT_NAME: Name of accuracy measures.
STAT_VALUE: Value of accuracy measures.
- DataFrame 2 (optional)
Error message. Only valid if massive is True.
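Since the result is a long table of STAT_NAME and STAT_VALUE pairs, it is often convenient to turn it into a lookup by measure name. The sketch below works on plain (name, value) rows so it runs without a database; the four rows are the ones visible in the example output on this page, and `stats_to_dict` is a hypothetical helper.

```python
def stats_to_dict(rows):
    """Turn (STAT_NAME, STAT_VALUE) rows into a dict keyed by lower-case name."""
    return {name.lower(): float(value) for name, value in rows}


# Rows as they appear in the example output on this page.
rows = [("ET", 412.0), ("MAD", 83.5), ("SMAPE", 0.040876), ("WMAPE", 0.037316)]
stats = stats_to_dict(rows)

# With a collected single-mode result (a pandas DataFrame), the rows could
# be produced by: res.collect().itertuples(index=False)
```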
Examples
Input data df:
>>> df.collect()
    ACTUAL  FORECAST
0   1130.0    1270.0
1   2410.0    2340.0
...
10  2345.0    2340.0
11  2650.0    2560.0
Perform accuracy measurement:
>>> res = accuracy_measure(data=df, evaluation_metric=['mse', 'rmse', 'mpe', 'et', 'mad', 'mase', 'wmape', 'smape', 'mape'])
>>> res.collect()
  STAT_NAME  STAT_VALUE
0        ET  412.000000
1       MAD   83.500000
...
7     SMAPE    0.040876
8     WMAPE    0.037316