What is a distribution?
The frequency and range of different values for a given feature or label.
Distribution explained in plain English
A distribution captures how likely a particular value is. The following image shows histograms of two different distributions:
- On the left, a power law distribution of wealth versus the number of people possessing that wealth.
- On the right, a normal distribution of height versus the number of people possessing that height.
Understanding each feature's and label's distribution can help you determine how to normalize values and detect outliers.
The phrase "out of distribution" refers to a value that doesn't appear in the dataset or is very rare. For example, an image of the planet Saturn would be considered out of distribution for a dataset consisting of cat images.
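The ideas above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the data is synthetic, the parameters (a mean height of 170 cm, a Pareto shape of 1.5) are assumptions chosen for the example, and the helper names `histogram`, `z_score`, and `is_out_of_distribution` are hypothetical. The out-of-distribution check uses a simple z-score threshold, which is one of several possible rules and only suits roughly normal data.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Synthetic data (assumed parameters, not taken from any real dataset):
# heights in cm from a normal distribution, wealth from a Pareto (power law) one.
heights = [random.gauss(170.0, 10.0) for _ in range(10_000)]
wealth = [random.paretovariate(1.5) for _ in range(10_000)]


def histogram(values, low, high, bins):
    """Count how many values fall into each of `bins` equal-width buckets."""
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        if low <= v < high:
            index = min(int((v - low) / width), bins - 1)
            counts[index] += 1
    return counts


def z_score(value, values):
    """Distance of `value` from the mean, measured in standard deviations."""
    return (value - statistics.fmean(values)) / statistics.pstdev(values)


def is_out_of_distribution(value, values, threshold=4.0):
    """Flag a value far from the bulk of the data (threshold is an assumption)."""
    return abs(z_score(value, values)) > threshold


# Histograms: heights pile up around the mean, wealth piles up at the low end.
height_counts = histogram(heights, 130.0, 210.0, 8)
wealth_counts = histogram(wealth, 1.0, 9.0, 8)

# Knowing the shape guides normalization: z-scores suit the normal feature,
# while a log transform compresses the long tail of the power law feature.
normalized_heights = [z_score(h, heights) for h in heights[:5]]
log_wealth = [math.log(w) for w in wealth]

print("height histogram:", height_counts)
print("wealth histogram:", wealth_counts)
print("250 cm out of distribution?", is_out_of_distribution(250.0, heights))
print("172 cm out of distribution?", is_out_of_distribution(172.0, heights))
```

The height histogram should peak in its middle buckets, while the wealth histogram should be dominated by its first bucket, mirroring the two shapes described above. A height of 250 cm lies many standard deviations from the mean, so the threshold rule flags it, whereas 172 cm is well inside the bulk of the data.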
Example
Practitioners refer to distribution when building, training, or evaluating machine learning systems. It appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.
People also read
- A/B testing
A statistical way of comparing two (or more) techniques—the A and the B.
- ablation
A technique for evaluating the importance of a feature or component by temporarily removing it from a model.
- accuracy
The number of correct classification predictions divided by the total number of predictions.
- activation function
A function that enables neural networks to learn nonlinear (complex) relationships between features and the label.
- active learning
A training approach in which the algorithm chooses some of the data it learns from.
- adaptation
Synonym for tuning or fine-tuning.
- agglomerative clustering
See hierarchical clustering.
- anomaly detection
The process of identifying outliers.
- area under the PR curve
See PR AUC (Area under the PR Curve).
- area under the ROC curve
See AUC (Area under the ROC curve).