recall

Plain English Explanation

A metric for classification models that answers the following question:

When ground truth was the positive class, what percentage of predictions did the model correctly identify as the positive class?

Here is the formula:

\[\text{Recall} = \frac{\text{true positives}} {\text{true positives} + \text{false negatives}} \]

where:

- true positive means the model correctly predicted the positive class.
- false negative means that the model mistakenly predicted the negative class.

For instance, suppose your model made 200 predictions on examples for which ground truth was the positive class. Of these 200 predictions:

- 180 were true positives.
- 20 were false negatives.

In this case:

\[\text{Recall} = \frac{180} {180 + 20} = 0.9 \]
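The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a library API: the function name `recall` and its parameter names are chosen here for the example, and raising an error when there are no positive examples is just one reasonable way to handle the undefined case.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Return the fraction of actual positives that the model labeled positive."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        # Recall is undefined when ground truth contains no positive examples.
        # Raising is an assumption of this sketch; other conventions exist.
        raise ValueError("recall is undefined when there are no actual positives")
    return true_positives / actual_positives


# The worked example: 180 true positives and 20 false negatives.
print(recall(true_positives=180, false_negatives=20))  # 0.9
```

Note that true negatives and false positives do not appear anywhere in the calculation; recall only looks at the examples whose ground truth was the positive class.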

Recall is particularly useful for determining the predictive power of classification models in which the positive class is rare. For example, consider a class-imbalanced dataset in which the positive class for a certain disease occurs in only 10 patients out of a million. Suppose your model makes five million predictions that yield the following outcomes:

- 30 True Positives
- 20 False Negatives
- 4,999,000 True Negatives
- 950 False Positives

The recall of this model is therefore:

\[\text{Recall} = \frac{\text{true positives}} {\text{true positives} + \text{false negatives}} = \frac{30} {30 + 20} = 0.6 \]

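By contrast, the accuracy of this model is:

\[\text{Accuracy} = \frac{\text{true positives} + \text{true negatives}} {\text{total predictions}} = \frac{30 + 4{,}999{,}000} {5{,}000{,}000} \approx 0.9998 \]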
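The gap between the two metrics on this example can be checked with a short sketch. The helper names `recall` and `accuracy` are illustrative, and the "always negative" model at the end is a hypothetical baseline added here for comparison, not part of the original example: it never predicts the disease, so all 50 positive cases become false negatives.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that were predicted positive."""
    return tp / (tp + fn)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Outcome counts from the disease example (five million predictions in total).
tp, fn, tn, fp = 30, 20, 4_999_000, 950

model_recall = recall(tp, fn)
model_accuracy = accuracy(tp, tn, fp, fn)
print(f"recall   = {model_recall:.2%}")    # recall   = 60.00%
print(f"accuracy = {model_accuracy:.4%}")  # accuracy = 99.9806%

# Hypothetical baseline (an assumption of this sketch): a model that always
# predicts the negative class on the same five million examples.
baseline_recall = recall(tp=0, fn=50)
baseline_accuracy = accuracy(tp=0, tn=4_999_950, fp=0, fn=50)
print(f"baseline recall   = {baseline_recall:.2%}")
print(f"baseline accuracy = {baseline_accuracy:.4%}")
```

Under these assumptions the baseline finds none of the positive cases, yet its accuracy is slightly higher than the real model's, which is why accuracy alone says little on a dataset this imbalanced.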

How is it used?

That accuracy of roughly 99.98% looks impressive but is essentially meaningless, because the model still missed 20 of the 50 actual positive cases. Recall is a much more useful metric for class-imbalanced datasets than accuracy.

---

See Classification: Accuracy, recall, precision and related metrics for more information.