hanaml.AUC.Rd
hanaml.AUC is an R wrapper for SAP HANA PAL AUC.
hanaml.AUC(data, key = NULL, positive.label = NULL, output.threshold = NULL)
data, DataFrame
DataFrame containing the data, structured as follows:
ID:
column with index.
True Class:
true class label of the data point.
Classifier:
computed probability that the
data point belongs to the positive class.
key, character
Name of the ID column.
positive.label, character, optional
If the original label is not 0 or 1, specifies the
label value which will be mapped to 1.
output.threshold, logical, optional
Specifies whether or not to output threshold values in the roc table.
Defaults to FALSE.
Returns an "AUC" object with the following values:
auc, double
The area under the receiver operating characteristic curve.
roc, DataFrame
False positive rate and true positive rate,
structured as follows:
ID, type INTEGER
column with index
FPR, type DOUBLE
representing false positive rate.
TPR, type DOUBLE
representing true positive rate.
THRESHOLD, type DOUBLE
representing the corresponding
threshold value, available only when output.threshold
is set to TRUE.
Area under the curve (AUC) is a standard measure for evaluating the performance of classification algorithms. It is defined for binary classifiers, but it can also be extended to multi-class problems.
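For orientation, one common way to write the binary AUC is the pairwise (rank-based) form below. This is the textbook formulation, not a statement about how PAL computes the value internally. The symbols are assumptions introduced here: P and N are the sets of positive and negative data points, and s(x) is the classifier score of data point x.

```latex
\mathrm{AUC}
  = \frac{1}{|P|\,|N|}
    \sum_{p \in P} \sum_{n \in N}
    \left( \mathbf{1}\left[ s(p) > s(n) \right]
         + \tfrac{1}{2}\,\mathbf{1}\left[ s(p) = s(n) \right] \right)
```

Read this way, AUC is the fraction of positive/negative pairs in which the positive data point receives the higher score, with ties counted as one half.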
Input DataFrame data:
> data$Collect()
ID ORIGINAL PREDICT
1 1 0 0.07
2 2 0 0.01
3 3 0 0.85
4 4 0 0.30
5 5 0 0.50
6 6 1 0.50
7 7 1 0.20
8 8 1 0.80
9 9 1 0.20
10 10 1 0.95
Compute Area Under Curve:
> auc <- hanaml.AUC(data = data)
Output:
> auc$auc
0.66
> auc$roc$Collect()
ID FPR TPR
1 0 1.0 1.0
2 1 0.8 1.0
3 2 0.6 1.0
4 3 0.6 0.6
5 4 0.4 0.6
6 5 0.2 0.4
7 6 0.2 0.2
8 7 0.0 0.2
9 8 0.0 0.0
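The reported value can be cross-checked locally in base R, without a HANA connection. The sketch below is not part of hanaml: the variable names (`original`, `score`, `auc_pairs`, `auc_trapezoid`) are hypothetical, and the data are copied by hand from the tables above. It computes the AUC once from pairwise score comparisons and once as the trapezoidal area under the printed ROC points.

```r
# Local copy of the example input (plain R vectors, not a HANA DataFrame).
original <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1)
score    <- c(0.07, 0.01, 0.85, 0.30, 0.50, 0.50, 0.20, 0.80, 0.20, 0.95)

pos <- score[original == 1]
neg <- score[original == 0]

# Pairwise form: share of (positive, negative) pairs where the positive
# scores higher, counting ties as one half.
wins <- sum(outer(pos, neg, ">"))
ties <- sum(outer(pos, neg, "=="))
auc_pairs <- (wins + 0.5 * ties) / (length(pos) * length(neg))

# Trapezoidal area under the ROC points copied from auc$roc above.
fpr <- c(1.0, 0.8, 0.6, 0.6, 0.4, 0.2, 0.2, 0.0, 0.0)
tpr <- c(1.0, 1.0, 1.0, 0.6, 0.6, 0.4, 0.2, 0.2, 0.0)
widths  <- abs(diff(fpr))
heights <- (head(tpr, -1) + tail(tpr, -1)) / 2
auc_trapezoid <- sum(widths * heights)

print(auc_pairs)
print(auc_trapezoid)
```

Both values agree with the 0.66 returned in auc$auc for this data set.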