hanaml.confusion.matrix

Compute a confusion matrix to evaluate the accuracy of a classification.
hanaml.confusion.matrix(data, key, label.true = NULL, label.pred = NULL, beta = NULL)
| Argument | Description |
|---|---|
| data | DataFrame containing the data, including the original (true) and predicted labels. |
| key | Name of the ID column of data. |
| label.true | Name of the column holding the original (true) class labels. |
| label.pred | Name of the column holding the predicted class labels. |
| beta | Optional. Beta value for the F-beta score reported in the classification report; defaults to 1, i.e. the F1 score. |
Returns a list of two DataFrames:
DataFrame 1
Confusion matrix, structured as follows:
Original label: with the same name and data type as the label.true column in data.
Predicted label: with the same name and data type as the label.pred column in data.
Count: type INTEGER, the number of data points with the corresponding combination of predicted and original label.
The DataFrame is sorted by (original label, predicted label) in descending order. NOTE: The original-label column and the predicted-label column must have the same data type.
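As an illustration only (this is a client-side sketch, not the SAP HANA implementation), the Count column of DataFrame 1 amounts to tallying each (original, predicted) label pair. The pairs below are taken from the worked example further down this page:

```python
from collections import Counter

# (original, predicted) label pairs from the worked example on this page
pairs = [(1, 1), (1, 1), (1, 1), (1, 2), (1, 1),
         (2, 2), (2, 1), (2, 2), (2, 2), (2, 2)]

# Tally each (original, predicted) combination, mirroring DataFrame 1
counts = Counter(pairs)
for (orig, pred), n in sorted(counts.items()):
    print(orig, pred, n)
```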
DataFrame 2
Classification report, structured as follows:
Class: type NVARCHAR(100), class name
Recall: type DOUBLE, the recall of each class
Precision: type DOUBLE, the precision of each class
F_MEASURE: type DOUBLE, the F_measure of each class
SUPPORT: type INTEGER, the support, i.e. the number of samples in each class
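For illustration, the report columns follow the standard per-class definitions of recall, precision, and the F-beta score. This is a hedged sketch of those formulas applied to the example data on this page, assuming beta defaults to 1 when not supplied; it is not the HANA implementation:

```python
# (original, predicted) label pairs from the worked example on this page
pairs = [(1, 1), (1, 1), (1, 1), (1, 2), (1, 1),
         (2, 2), (2, 1), (2, 2), (2, 2), (2, 2)]
beta = 1.0  # assumed default when beta = NULL, i.e. the plain F1 score

report = {}
for cls in sorted({o for o, _ in pairs}):
    tp = sum(1 for o, p in pairs if o == cls and p == cls)   # true positives
    support = sum(1 for o, _ in pairs if o == cls)           # true members of cls
    predicted = sum(1 for _, p in pairs if p == cls)         # predicted as cls
    recall = tp / support
    precision = tp / predicted
    # F-beta score: weighted harmonic mean of precision and recall
    f_measure = ((1 + beta**2) * precision * recall
                 / (beta**2 * precision + recall))
    report[cls] = (recall, precision, f_measure, support)

print(report)
```

With the example data, both classes come out with recall, precision, and F-measure of 0.8 on a support of 5, matching the report shown in the example output.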
Assume a DataFrame df from which to calculate the confusion matrix:
> df$Collect()
ID ORIGINAL PREDICT
1 1 1 1
2 2 1 1
3 3 1 1
4 4 1 2
5 5 1 1
6 6 2 2
7 7 2 1
8 8 2 2
9 9 2 2
10 10 2 2
Calculate the confusion matrix:
> res <- hanaml.confusion.matrix(data = df,
                                 key = "ID", label.true = "ORIGINAL",
                                 label.pred = "PREDICT")
> cm <- res[[1]]
> cr <- res[[2]]
Output:
> cm$Collect()
  ORIGINAL PREDICT COUNT
1        1       1     4
2        1       2     1
3        2       1     1
4        2       2     4
> cr$Collect()
  CLASS RECALL PRECISION F_MEASURE SUPPORT
1     1    0.8       0.8       0.8       5
2     2    0.8       0.8       0.8       5