accuracy_score

hana_ml.algorithms.pal.metrics.accuracy_score(data, label_true, label_pred)

Compute the mean accuracy score for classification results, that is, the proportion of correctly predicted labels among the total number of cases examined.
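
The definition above can be sketched in plain Python. This is only an illustration of the arithmetic, not how hana_ml computes the score; the helper name `accuracy` and the label lists are hypothetical.

```python
def accuracy(actual, predicted):
    """Proportion of positions where the predicted label equals the true label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one label pair is required")
    # Count the positions where prediction and ground truth agree.
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Hypothetical labels, not taken from any real classifier.
actual = [1, 0, 1, 1]
predicted = [1, 0, 0, 1]
score = accuracy(actual, predicted)  # 3 of the 4 labels match
```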

Parameters:

data (DataFrame): DataFrame of true and predicted labels.
label_true (str): Name of the column containing ground truth labels.
label_pred (str): Name of the column containing predicted labels, as returned by a classifier.

Returns:

float: Accuracy classification score. A lower accuracy indicates that the classifier predicted fewer of the labels in the input correctly.

Examples

Actual and predicted labels, stored in df, for a hypothetical classification:

>>> df.collect()
   ACTUAL  PREDICTED
0       1          0
1       0          0
2       0          0
3       1          1
4       1          1

Accuracy score for these predictions:

>>> accuracy_score(data=df,
                   label_true='ACTUAL',
                   label_pred='PREDICTED')
0.8

Compare that to the null accuracy, computed from df_dummy (the accuracy that could be achieved by always predicting the most frequent class):

>>> df_dummy.collect()
   ACTUAL  PREDICTED
0       1          1
1       0          1
2       0          1
3       1          1
4       1          1
>>> accuracy_score(data=df_dummy,
                   label_true='ACTUAL',
                   label_pred='PREDICTED')
0.6
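
As a rough cross-check, the null accuracy can also be derived from the true labels alone, since it equals the share of the most frequent class. The sketch below uses plain Python rather than hana_ml; the helper name `null_accuracy` is hypothetical, and the list copies the ACTUAL column of df_dummy.

```python
from collections import Counter


def null_accuracy(actual):
    """Accuracy of a predictor that always outputs the most frequent label."""
    if not actual:
        raise ValueError("at least one label is required")
    # most_common(1) returns [(label, count)] for the most frequent label.
    majority_count = Counter(actual).most_common(1)[0][1]
    return majority_count / len(actual)


actual = [1, 0, 0, 1, 1]  # ACTUAL column of df_dummy
baseline = null_accuracy(actual)  # 3 of the 5 labels belong to class 1
```

A classifier is only informative to the extent that its accuracy exceeds this baseline.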

A perfect predictor, df_perfect:

>>> df_perfect.collect()
   ACTUAL  PREDICTED
0       1          1
1       0          0
2       0          0
3       1          1
4       1          1
>>> accuracy_score(data=df_perfect,
                   label_true='ACTUAL',
                   label_pred='PREDICTED')
1.0