What is label leakage?
A model design flaw in which a feature is a proxy for the label.
Label leakage explained in plain English
Consider a binary classification model that predicts whether or not a prospective customer will purchase a particular product. Suppose that one of the features for the model is a Boolean named `SpokeToCustomerAgent`. Further suppose that a customer agent is only assigned after the prospective customer has actually purchased the product. During training, the model will quickly learn the association between `SpokeToCustomerAgent` and the label. Because the feature is only populated after a purchase, it carries no such signal when the model scores prospective customers who have not yet bought, so the strong results seen in training will not hold up in production. See Monitoring pipelines in Machine Learning Crash Course for more information.
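The `SpokeToCustomerAgent` scenario can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a real training pipeline: the 30% purchase rate, the 90% agent-assignment rate, the `purchased` label name, and the one-rule "model" that simply copies the leaky feature are all hypothetical choices made for the example.

```python
import random


def make_training_rows(n, seed=0):
    """Build synthetic rows in which the feature is recorded only after a purchase."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        purchased = rng.random() < 0.3  # assumed purchase rate
        # Assumption: an agent is assigned only after a purchase (to ~90% of buyers),
        # so the feature can never be True for a non-buyer.
        spoke = purchased and rng.random() < 0.9
        rows.append({"SpokeToCustomerAgent": spoke, "purchased": purchased})
    return rows


def predict(row):
    """Stand-in for a trained model that has latched onto the leaky feature."""
    return row["SpokeToCustomerAgent"]


def recall(rows):
    """Fraction of actual buyers that the model flags as buyers."""
    buyers = [r for r in rows if r["purchased"]]
    return sum(predict(r) for r in buyers) / len(buyers)


train = make_training_rows(1000)

# At serving time nobody has purchased yet, so no agent has been assigned:
# the feature is False for every prospective customer, including future buyers.
serving = [{**r, "SpokeToCustomerAgent": False} for r in train]

train_recall = recall(train)
serving_recall = recall(serving)

print(f"recall on training data: {train_recall:.2f}")
print(f"recall at serving time:  {serving_recall:.2f}")
```

In the training data the feature almost perfectly identifies buyers, but at serving time it is never set, so the same rule finds none of them. A common remedy is to drop any feature whose value is only known after the label is determined.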
Example
Practitioners refer to label leakage when building, training, or evaluating machine learning systems. It appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.
People also read
- A/B testing
A statistical way of comparing two (or more) techniques—the A and the B.
- ablation
A technique for evaluating the importance of a feature or component by temporarily removing it from a model.
- accuracy
The number of correct classification predictions divided by the total number of predictions.
- act
A stage in the agentic loop in which the agent executes the action chosen during the reason stage.
- action
In reinforcement learning, the mechanism by which the agent transitions between states of the environment.
- action space
The set of resources an agent can use to perform a task.
- activation function
A function that enables neural networks to learn nonlinear (complex) relationships between features and the label.
- active learning
A training approach in which the algorithm chooses some of the data it learns from.
- adaptation
Synonym for tuning or fine-tuning.
- agent
An AI system that can perceive its environment, make decisions, and take actions to achieve goals autonomously.