What is an outliers?
Values distant from most other values.
outliers explained in plain English
Values distant from most other values. In machine learning, any of the following are outliers: - Input data whose values are more than roughly 3 standard deviations from the mean. - Weights with high absolute values. - Predicted values relatively far away from the actual values. For example, suppose that`widget-price` is a feature of a certain model. Assume that the mean`widget-price` is 7 Euros with a standard deviation of 1 Euro. Examples containing a`widget-price` of 12 Euros or 2 Euros would therefore be considered outliers because each of those prices is five standard deviations from the mean. Outliers are often caused by typos or other input mistakes. In other cases, outliers aren't mistakes; after all, values five standard deviations away from the mean are rare but hardly impossible. Outliers often cause problems in model training. Clipping is one way of managing outliers. See Working with numerical data in Machine Learning Crash Course for more information.
Example
Practitioners refer to outliers when building, training, or evaluating machine learning systems. It appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.
People also read
- A/B testing
A statistical way of comparing two (or more) techniques—the A and the B.
- ablation
A technique for evaluating the importance of a feature or component by temporarily removing it from a model.
- accuracy
The number of correct classification predictions divided by the total number of predictions.
- activation function
A function that enables neural networks to learn nonlinear (complex) relationships between features and the label.
- active learning
A training approach in which the algorithm chooses some of the data it learns from.
- adaptation
Synonym for tuning or fine-tuning.
- agglomerative clustering
See hierarchical clustering.
- anomaly detection
The process of identifying outliers.
- area under the PR curve
See PR AUC (Area under the PR Curve).
- area under the ROC curve
See AUC (Area under the ROC curve).