accuracy_score

hana_ml.algorithms.pal.metrics.accuracy_score(data, label_true, label_pred)

Compute the mean accuracy score for classification results, that is, the proportion of correctly predicted labels among the total number of cases examined.

Parameters
data : DataFrame

DataFrame of true and predicted labels.

label_true : str

Name of the column containing ground truth labels.

label_pred : str

Name of the column containing predicted labels, as returned by a classifier.

Returns
float

Accuracy classification score, a value between 0.0 and 1.0. A lower value indicates that the classifier predicted fewer of the labels in the input correctly.

Examples

Actual and predicted labels, held in the DataFrame df, for a hypothetical classification:

>>> df.collect()
   ACTUAL  PREDICTED
0       1          0
1       0          0
2       0          0
3       1          1
4       1          1

Accuracy score for these predictions:

>>> accuracy_score(data=df, label_true='ACTUAL', label_pred='PREDICTED')
0.8
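
The figure above can be checked by hand: four of the five predictions match the true label. The following is a minimal sketch in plain Python, not the PAL implementation. The helper name manual_accuracy is hypothetical, and Python lists stand in for the ACTUAL and PREDICTED columns of df.

```python
def manual_accuracy(actual, predicted):
    """Return the fraction of positions where predicted equals actual."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one label is required")
    # Count the rows where the predicted label matches the true label.
    matches = sum(1 for a, p in zip(actual, predicted) if a == p)
    return matches / len(actual)


# Values copied from the ACTUAL and PREDICTED columns of df.
actual = [1, 0, 0, 1, 1]
predicted = [0, 0, 0, 1, 1]

score = manual_accuracy(actual, predicted)
print(score)  # 0.8
```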

Compare that with the null accuracy, computed here on df_dummy (the accuracy achieved by always predicting the most frequent class):

>>> df_dummy.collect()
   ACTUAL  PREDICTED
0       1          1
1       0          1
2       0          1
3       1          1
4       1          1
>>> accuracy_score(data=df_dummy, label_true='ACTUAL', label_pred='PREDICTED')
0.6
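
The null accuracy depends only on the true labels, so it can also be derived without building a dummy prediction column. This is a rough sketch under that assumption; the helper name null_accuracy is hypothetical and is not part of hana_ml, and a Python list stands in for the ACTUAL column.

```python
from collections import Counter


def null_accuracy(actual):
    """Return the accuracy of always predicting the most frequent true label."""
    if not actual:
        raise ValueError("at least one label is required")
    # most_common(1) yields a single (label, count) pair for the majority class.
    _, majority_count = Counter(actual).most_common(1)[0]
    return majority_count / len(actual)


# Values copied from the ACTUAL column of df_dummy.
actual = [1, 0, 0, 1, 1]

baseline = null_accuracy(actual)
print(baseline)  # 0.6
```

A classifier is only informative if its accuracy exceeds this baseline; here 0.8 is above 0.6.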

A perfect predictor, df_perfect:

>>> df_perfect.collect()
   ACTUAL  PREDICTED
0       1          1
1       0          0
2       0          0
3       1          1
4       1          1
>>> accuracy_score(data=df_perfect, label_true='ACTUAL', label_pred='PREDICTED')
1.0