What is a classification threshold?
In binary classification, a number between 0 and 1 that converts the raw output of a logistic regression model into a prediction of either the positive class or the negative class.
Classification threshold explained in plain English
The classification threshold is a value that a human chooses, not a value chosen by model training. A logistic regression model outputs a raw value between 0 and 1. Then:
- If this raw value is greater than the classification threshold, then the positive class is predicted.
- If this raw value is less than the classification threshold, then the negative class is predicted.

For example, suppose the classification threshold is 0.8. If the raw value is 0.9, then the model predicts the positive class. If the raw value is 0.7, then the model predicts the negative class.

The choice of classification threshold strongly influences the number of false positives and false negatives.
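The thresholding rule and its effect on errors can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `predict` and `count_errors` and the sample raw values and labels are made up for this example. The entry does not say what happens when the raw value exactly equals the threshold, so the sketch assumes such a tie is predicted as the negative class.

```python
def predict(raw_value, threshold):
    """Map a raw model output in [0, 1] to a class label."""
    # Assumption: a raw value exactly equal to the threshold counts as negative.
    return "positive" if raw_value > threshold else "negative"


def count_errors(raw_values, labels, threshold):
    """Return (false_positives, false_negatives) at the given threshold."""
    false_positives = 0
    false_negatives = 0
    for raw_value, label in zip(raw_values, labels):
        predicted = predict(raw_value, threshold)
        if predicted == "positive" and label == "negative":
            false_positives += 1
        elif predicted == "negative" and label == "positive":
            false_negatives += 1
    return false_positives, false_negatives


# Hypothetical raw outputs and true labels, invented for illustration only.
raw_values = [0.95, 0.85, 0.7, 0.6, 0.3]
labels = ["positive", "negative", "positive", "negative", "negative"]

print(predict(0.9, 0.8))  # positive
print(predict(0.7, 0.8))  # negative

# Raising the threshold trades false positives for false negatives.
for threshold in (0.5, 0.8, 0.9):
    print(threshold, count_errors(raw_values, labels, threshold))
# 0.5 (2, 0)
# 0.8 (1, 1)
# 0.9 (0, 1)
```

On this toy data, a low threshold produces only false positives and a high threshold produces only false negatives, which is the trade-off a person weighs when choosing the value.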
As models or datasets evolve, engineers sometimes also change the classification threshold. When the classification threshold changes, positive class predictions can suddenly become negative class predictions and vice versa.

For example, consider a binary classification disease prediction model. Suppose that when the system runs in the first year:
- The raw value for a particular patient is 0.95.
- The classification threshold is 0.94.

Therefore, the system diagnoses the positive class. (The patient gasps, "Oh no! I'm sick!")

A year later, perhaps the values now look as follows:
- The raw value for the same patient remains at 0.95.
- The classification threshold changes to 0.97.

Therefore, the system now reclassifies that patient as the negative class. ("Happy day! I'm not sick.")

Same patient. Different diagnosis.

---
See Thresholds and the confusion matrix in Machine Learning Crash Course for more information.
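The patient scenario can be reproduced with the numbers given in the text. This is a small sketch: the function name `diagnose` and the dictionary keys are hypothetical, and a raw value exactly equal to the threshold is assumed to count as negative.

```python
def diagnose(raw_value, threshold):
    """Return the predicted class for one patient at a given threshold."""
    # Assumption: a raw value exactly equal to the threshold counts as negative.
    return "positive" if raw_value > threshold else "negative"


# The raw value stays the same in both years; only the threshold changes.
patient_raw_value = 0.95
thresholds_by_year = {"year 1": 0.94, "year 2": 0.97}

diagnoses = {
    year: diagnose(patient_raw_value, threshold)
    for year, threshold in thresholds_by_year.items()
}
print(diagnoses)  # {'year 1': 'positive', 'year 2': 'negative'}
```

The model output never changes here, so the flipped diagnosis comes entirely from the human-chosen threshold.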
Example
Practitioners refer to the classification threshold when building, training, or evaluating machine learning systems. The term appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.
People also read
- AUC
A number between 0.
- Backpropagation
The process that tells a neural network which internal settings caused an error and how to adjust them, working backwards through layers.
- Bayesian neural network
A probabilistic neural network that accounts for uncertainty in weights and outputs.
- Bayesian optimization
A probabilistic regression model technique for optimizing computationally expensive objective functions by instead optimizing a surrogate that quantifies the uncertainty using a Bayesian learning technique.
- configuration
The process of assigning the initial property values used to train a model, including: hyperparameters such as: - learning rate - iterations - optimizer - loss function In machine learning projects, c
- confusion matrix
An NxN table that summarizes the number of correct and incorrect predictions that a classification model made.
- cross-entropy
A generalization of Log Loss to multi-class classification problems.
- discriminative model
A model that predicts labels from a set of one or more features.
- embedding layer
A special hidden layer that trains on a high-dimensional categorical feature to gradually learn a lower dimension embedding vector.
- encoder
In general, any ML system that converts from a raw, sparse, or external representation into a more processed, denser, or more internal representation.