Precision, Recall, Specificity, Prevalence, Kappa, F1-score check with R


   Classification and regression models use different methods to assess their accuracy. In the previous post, we learned how to verify regression model accuracy and its related metrics. In this post, we'll learn how to check classification model accuracy and its related metrics in R.
   The basic idea of a classification accuracy check is to measure the misclassification rate in a model's predictions. There are several metrics for evaluating a classifier's performance. In this article, we'll learn how to calculate the accuracy metrics below in R.
  • Accuracy
  • Precision
  • Recall (sensitivity) 
  • Specificity
  • Prevalence
  • Kappa
  • F1-score
    First, we prepare the actual labels and the results predicted by the model to check the model's performance. We can use the example below to compute the accuracy metrics.

actual = c("unknown", "target", "unknown","unknown","target","target","unknown",
           "target","target","target", "target")

 
predicted = c("unknown", "target", "target","unknown","target",
            "unknown", "unknown", "target", "target","target","unknown" )

As you may have noticed, there are 3 incorrect predictions.
Next, we'll create a cross-table, called a confusion matrix, based on the above data.

The confusion matrix is a table whose columns contain the actual classes and whose rows contain the predicted classes; it describes the classifier's performance against the known test data.

                    Actual target (positive)   Actual unknown (negative)
Predicted target    5 (TP)                     1 (FP)
Predicted unknown   2 (FN)                     3 (TN)
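The cross-table can also be produced directly in base R with the table() function; a quick sketch (the vectors are repeated here so the snippet runs on its own):

```r
# actual labels and model predictions from the example above
actual = c("unknown", "target", "unknown", "unknown", "target", "target", "unknown",
           "target", "target", "target", "target")
predicted = c("unknown", "target", "target", "unknown", "target",
              "unknown", "unknown", "target", "target", "target", "unknown")

# count the misclassified cases
sum(actual != predicted)    # 3

# cross-table: rows are the predicted classes, columns are the actual classes
table(Predicted = predicted, Actual = actual)
```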
                  

True positive (TP) means that the label is 'target' and it is correctly predicted as 'target'.
True negative (TN) means that the label is 'unknown' and it is correctly predicted as 'unknown'.
False positive (FP) means that the label is 'unknown' but it is wrongly predicted as 'target'.
False negative (FN) means that the label is 'target' but it is wrongly predicted as 'unknown'.

We'll get the values from the matrix:

tp = 5
tn = 3
fp = 1
fn = 2
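Instead of typing the counts by hand, they can be pulled out of a table() cross-tab by name; a small self-contained sketch:

```r
actual = c("unknown", "target", "unknown", "unknown", "target", "target", "unknown",
           "target", "target", "target", "target")
predicted = c("unknown", "target", "target", "unknown", "target",
              "unknown", "unknown", "target", "target", "target", "unknown")

cm = table(Predicted = predicted, Actual = actual)

# index the named cells: rows are predicted, columns are actual
tp = cm["target",  "target"]     # 5
fp = cm["target",  "unknown"]    # 1
fn = cm["unknown", "target"]     # 2
tn = cm["unknown", "unknown"]    # 3
```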

Now we can check the metrics and evaluate the model performance in R.


Accuracy

Accuracy represents the ratio of correct predictions. The sum of true positives and true negatives is divided by the total number of events.

accuracy = function(tp, tn, fp, fn)
{
  correct = tp+tn
  total = tp+tn+fp+fn
  return(correct/total)
}
 
accuracy(tp, tn, fp, fn)
[1] 0.7272727


Precision

Precision identifies how accurately the model predicted the positive class. The number of true positive events is divided by the sum of true positive and false positive events.

precision = function(tp, fp)
{
  return(tp/(tp+fp))
}
 
precision(tp, fp)
[1] 0.8333333


Recall or Sensitivity 

Recall (sensitivity) measures the proportion of actual positive classes the model correctly predicted. The number of true positive events is divided by the sum of true positive and false negative events.

recall = function(tp, fn)
{
  return(tp/(tp+fn))
}
 
recall(tp, fn)
[1] 0.7142857


F1-Score

F1-score is the harmonic mean of precision and recall. A value of 1 indicates the best performance and 0 the worst.

f1_score = function(tp, fp, fn)
{
   p = precision(tp, fp)
   r = recall(tp, fn)
   return(2*p*r/(p+r))
}
 
f1_score(tp, fp, fn)
[1] 0.7692308
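Because F1 is the harmonic mean, it can also be written in one step from the raw counts, which makes a handy cross-check of the function above:

```r
# F1 = 2*TP / (2*TP + FP + FN), using this example's counts
tp = 5; fp = 1; fn = 2
f1 = 2*tp / (2*tp + fp + fn)
f1    # 0.7692308
```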


Specificity or True Negative Rate

Specificity (true negative rate) measures the proportion of actual negatives that are correctly identified.

specificity = function(tn,fp)
{
  return(tn/(tn+fp))
}
 
specificity(tn,fp)
[1] 0.75


Prevalence

Prevalence represents how often positive events occur in the data. The sum of true positive and false negative events is divided by the total number of events.

prevalence = function(tp, tn, fp, fn)
{
  t = tp + fn
  total = tp + tn + fp + fn
  return(t/total)
}
 
prevalence(tp, tn, fp, fn)
[1] 0.6363636


Kappa
 
Kappa (Cohen’s Kappa) measures how well the model's predictions agree with the actual labels beyond chance agreement; the higher the Kappa value, the better the model. First, we'll count the results by category. The actual data contains 7 'target' and 4 'unknown' labels, and the predicted data contains 6 'target' and 5 'unknown' labels.

length(actual[actual=="target"])
[1] 7
length(predicted[predicted=="target"])
[1] 6 
 
total=tp+tn+fp+fn
observed_acc=(tp+tn)/total
expected_acc=((6*7/total)+(4*5/total))/total
 
Kappa = (observed_acc-expected_acc)/(1-expected_acc)
print(Kappa)
[1] 0.440678
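The same calculation can be wrapped into a reusable function that works on any square confusion matrix, taking the expected agreement from the row and column totals; a sketch (the 2x2 matrix below holds this example's counts, rows = predicted, columns = actual):

```r
# Cohen's Kappa from a confusion matrix
cohens_kappa = function(cm)
{
  total = sum(cm)
  observed_acc = sum(diag(cm)) / total
  # chance agreement from the marginal (row and column) totals
  expected_acc = sum(rowSums(cm) * colSums(cm)) / total^2
  return((observed_acc - expected_acc) / (1 - expected_acc))
}

cm = matrix(c(5, 1,
              2, 3), nrow = 2, byrow = TRUE)
cohens_kappa(cm)    # 0.440678
```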




Using confusionMatrix() 

We can get all of these metrics with one command in R. We load the 'caret' package and run the confusionMatrix() command; its first argument is the predicted data and its second is the actual (reference) data.

library(caret)

confusionMatrix(as.factor(predicted), as.factor(actual))
Confusion Matrix and Statistics

          Reference
Prediction target unknown
   target       5       1
   unknown      2       3
                                          
               Accuracy : 0.7273          
                 95% CI : (0.3903, 0.9398)
    No Information Rate : 0.6364          
    P-Value [Acc > NIR] : 0.3883          
                                          
                  Kappa : 0.4407          
 Mcnemar's Test P-Value : 1.0000          
                                          
            Sensitivity : 0.7143          
            Specificity : 0.7500          
         Pos Pred Value : 0.8333          
         Neg Pred Value : 0.6000          
             Prevalence : 0.6364          
         Detection Rate : 0.4545          
   Detection Prevalence : 0.5455          
      Balanced Accuracy : 0.7321          
                                          
       'Positive' Class : target 

The result shows the model's accuracy results.

   In this post, we have briefly covered some of the accuracy metrics used to evaluate classification models. Thank you for reading!


Source code listing

library(caret)
actual = c("unknown", "target", "unknown","unknown","target","target","unknown",
           "target","target","target", "target")
predicted = c("unknown", "target", "target","unknown","target",
            "unknown", "unknown", "target", "target","target","unknown" )
 

tp = 5
tn = 3
fp = 1
fn = 2

accuracy = function(tp, tn, fp, fn)
{
  correct = tp+tn
  total = tp+tn+fp+fn
  return(correct/total)
}
precision = function(tp, fp)
{
  return(tp/(tp+fp))
}
 
recall = function(tp, fn)
{
  return(tp/(tp+fn))
} 

f1_score = function(tp, fp, fn)
{
   p = precision(tp, fp)
   r = recall(tp, fn)
   return(2*p*r/(p+r))
}

specificity = function(tn,fp)
{
  return(tn/(tn+fp))
} 

prevalence = function(tp, tn, fp, fn)
{
  t = tp + fn
  total = tp + tn + fp + fn
  return(t/total)
}
 
length(actual[actual=="target"])
length(predicted[predicted=="target"])

total=tp+tn+fp+fn
observed_acc=(tp+tn)/total
expected_acc=((6*7/total)+(4*5/total))/total
 
Kappa = (observed_acc-expected_acc)/(1-expected_acc)
print(Kappa)


cm = confusionMatrix(as.factor(predicted), as.factor(actual))
print(cm) 
