multiclass_auc

hana_ml.algorithms.pal.metrics.multiclass_auc(data_original, data_predict)

Computes area under curve (AUC) to evaluate the performance of multi-class classification algorithms.

Parameters
data_originalDataFrame

True class data, structured as follows:

  • Data point ID column.

  • True class of the data point.

data_predictDataFrame

Predicted class data, structured as follows:

  • Data point ID column.

  • Possible class.

  • Classifier-computed probability that the data point belongs to that particular class.

For each data point ID, there should be one row for each possible class.

Returns
float

The area under the receiver operating characteristic curve.

DataFrame

False positive rate and true positive rate (ROC), structured as follows:

  • ID column, type INTEGER.

  • FPR, type DOUBLE, representing false positive rate.

  • TPR, type DOUBLE, representing true positive rate.

Examples

Input DataFrame df:

>>> df_original.collect()
   ID  ORIGINAL
0   1         1
1   2         1
2   3         1
3   4         2
4   5         2
5   6         2
6   7         3
7   8         3
8   9         3
9  10         3
>>> df_predict.collect()
    ID  PREDICT  PROB
0    1        1  0.90
1    1        2  0.05
2    1        3  0.05
3    2        1  0.80
4    2        2  0.05
5    2        3  0.15
6    3        1  0.80
7    3        2  0.10
8    3        3  0.10
9    4        1  0.10
10   4        2  0.80
11   4        3  0.10
12   5        1  0.20
13   5        2  0.70
14   5        3  0.10
15   6        1  0.05
16   6        2  0.90
17   6        3  0.05
18   7        1  0.10
19   7        2  0.10
20   7        3  0.80
21   8        1  0.00
22   8        2  0.00
23   8        3  1.00
24   9        1  0.20
25   9        2  0.10
26   9        3  0.70
27  10        1  0.20
28  10        2  0.20
29  10        3  0.60

Compute Area Under Curve:

>>> auc, roc = multiclass_auc(data_original=df_original, data_predict=df_predict)

Output:

>>> print(auc)
1.0
>>> roc.collect()
    ID   FPR  TPR
0    0  1.00  1.0
1    1  0.90  1.0
2    2  0.65  1.0
3    3  0.25  1.0
4    4  0.20  1.0
5    5  0.00  1.0
6    6  0.00  0.9
7    7  0.00  0.7
8    8  0.00  0.3
9    9  0.00  0.1
10  10  0.00  0.0