AIExplainer
Machine Learning Intermediate

distribution

The frequency and range of different values for a given feature or label.

A distribution captures how likely a particular value is. The following image shows histograms of two different distributions:

- On the left, a power law distribution of wealth versus the number of people possessing that wealth.
- On the right, a normal distribution of height versus the number of people possessing that height.

Understanding each feature and label's distribution can help you determine how to normalize values and detect outliers.

The phrase out of distribution refers to a value that doesn't appear in the dataset or is very rare. For example, an image of the planet Saturn would be considered out of distribution for a dataset consisting of cat images.
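The two histogram shapes described above, and the use of a distribution for normalizing values and flagging rare ones, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the synthetic height and wealth data, the helper names (histogram, z_scores, is_out_of_distribution), and the threshold of 4 standard deviations are all hypothetical choices made for this example. Z-score checks are only one simple approach, and they suit roughly normal features better than heavy-tailed ones.

```python
import random
import statistics


def histogram(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins or 1.0  # avoid a zero width if all values match
    counts = [0] * num_bins
    for v in values:
        # Clamp the index so the maximum value lands in the last bin.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts


def z_scores(values):
    """Normalize values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def is_out_of_distribution(value, values, threshold=4.0):
    """Flag a value lying more than `threshold` standard deviations from the mean.

    The threshold is an assumption for this sketch, not a standard constant.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return abs(value - mean) / std > threshold


rng = random.Random(0)  # fixed seed so the synthetic data is reproducible

# Synthetic stand-ins for the two examples: heights are roughly normal,
# wealth follows a heavy-tailed (Pareto, i.e. power law) distribution.
heights = [rng.gauss(170.0, 10.0) for _ in range(10_000)]
wealth = [rng.paretovariate(1.5) for _ in range(10_000)]

height_hist = histogram(heights, 10)  # counts peak near the middle bins
wealth_hist = histogram(wealth, 10)   # counts pile up in the first bin

normalized_heights = z_scores(heights)

print("height histogram:", height_hist)
print("wealth histogram:", wealth_hist)
print("300 cm flagged:", is_out_of_distribution(300.0, heights))
```

For a heavy-tailed feature such as the wealth example, a log transform is often applied before any z-score normalization, because a few extreme values otherwise dominate the mean and standard deviation.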

Practitioners refer to the distribution of a feature or label when building, training, or evaluating machine learning systems. The term appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.