accuracy_score

hana_ml.algorithms.pal.metrics.accuracy_score(data, label_true, label_pred)

Compute the mean accuracy score for classification results, that is, the proportion of correctly predicted labels among all cases examined.

Parameters:

data (DataFrame): DataFrame of true and predicted labels.
label_true (str): Name of the column containing ground truth labels.
label_pred (str): Name of the column containing predicted labels, as returned by a classifier.

Returns:

float: Accuracy classification score. A lower score means the classifier predicted fewer of the input labels correctly.

Examples

A DataFrame df of actual and predicted labels for a hypothetical classification:

>>> df.collect()
   ACTUAL  PREDICTED
0    1       0
1    0       0
2    0       0
3    1       1
4    1       1

Accuracy score for these predictions:

>>> accuracy_score(data=df, label_true='ACTUAL', label_pred='PREDICTED')
0.8
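
The score above can be checked by hand: four of the five predictions match the ground truth, and 4 / 5 = 0.8. A minimal plain-Python sketch of that calculation follows. The helper manual_accuracy is hypothetical and not part of hana_ml; it works on in-memory lists rather than a HANA DataFrame and is only meant to illustrate the definition.

```python
def manual_accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label.

    Hypothetical helper for illustration; not part of hana_ml.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    # Count the rows where prediction and ground truth agree.
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


# Same labels as the ACTUAL and PREDICTED columns of df above.
actual = [1, 0, 0, 1, 1]
predicted = [0, 0, 0, 1, 1]
print(manual_accuracy(actual, predicted))  # 0.8
```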

Compare that to the null accuracy, computed here on df_dummy (the accuracy achieved by always predicting the most frequent class):

>>> df_dummy.collect()
   ACTUAL  PREDICTED
0    1       1
1    0       1
2    0       1
3    1       1
4    1       1
>>> accuracy_score(data=df_dummy, label_true='ACTUAL', label_pred='PREDICTED')
0.6
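
The null accuracy can also be derived from the true labels alone, since it equals the share of the most frequent class. The sketch below assumes the labels are available as a plain Python list; null_accuracy is a hypothetical helper, not a hana_ml function.

```python
from collections import Counter


def null_accuracy(y_true):
    """Return (most frequent label, its share of all labels).

    Hypothetical helper for illustration; not part of hana_ml.
    """
    if not y_true:
        raise ValueError("at least one label is required")
    # most_common(1) yields a single (label, count) pair for the top class.
    label, count = Counter(y_true).most_common(1)[0]
    return label, count / len(y_true)


# Same labels as the ACTUAL column of df_dummy above.
label, score = null_accuracy([1, 0, 0, 1, 1])
print(label, score)  # 1 0.6
```

A classifier is only useful on this data if its accuracy exceeds this baseline, which the first example (0.8) does.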

A perfect predictor, df_perfect:

>>> df_perfect.collect()
   ACTUAL  PREDICTED
0    1       1
1    0       0
2    0       0
3    1       1
4    1       1
>>> accuracy_score(data=df_perfect, label_true='ACTUAL', label_pred='PREDICTED')
1.0