What is training loss?
A metric representing a model's loss during a particular training iteration.
training loss explained in plain English
A metric representing a model's loss during a particular training iteration. For example, suppose the loss function is Mean Squared Error. Perhaps the training loss (the Mean Squared Error) for the 10th iteration is 2.2, and the training loss for the 100th iteration is 1.9.
A loss curve plots training loss versus the number of iterations. A loss curve provides the following hints about training:
- A downward slope implies that the model is improving.
- An upward slope implies that the model is getting worse.
- A flat slope implies that the model has reached convergence.
For example, the following somewhat idealized loss curve shows:
- A steep downward slope during the initial iterations, which implies rapid model improvement.
- A gradually flattening (but still downward) slope until close to the end of training, which implies continued model improvement at a somewhat slower pace than during the initial iterations.
- A flat slope towards the end of training, which suggests convergence.
Although training loss is important, see also generalization.
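The relationship between iterations, training loss, and the slope of the loss curve can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the toy data, the learning rate, the iteration count, and the helper names `mse`, `train`, and `describe_slope` are all assumptions made for the example, and the tolerance used to call a slope "flat" is arbitrary.

```python
# Toy data lying on the line y = 2x + 1 (hypothetical values for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def mse(w, b):
    """Mean Squared Error of the model y = w*x + b on the toy data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(iterations=200, learning_rate=0.05):
    """Run plain gradient descent and record the training loss at each iteration."""
    w, b = 0.0, 0.0
    n = len(xs)
    losses = []
    for _ in range(iterations):
        # Training loss for this iteration, measured before the update.
        losses.append(mse(w, b))
        # Gradients of the Mean Squared Error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return losses


def describe_slope(losses, start, end, tolerance=1e-3):
    """Label the loss curve between two iterations as downward, upward, or flat."""
    change = losses[end] - losses[start]
    if change < -tolerance:
        return "downward"
    if change > tolerance:
        return "upward"
    return "flat"


losses = train()
print(describe_slope(losses, 0, 10))     # early iterations: downward
print(describe_slope(losses, 150, 199))  # late iterations: flat
```

Plotting `losses` against the iteration index would give the loss curve itself. In practice the curve is noisier than in this toy setting, so slopes are usually judged over a smoothed window rather than between two single iterations.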
Example
Practitioners refer to training loss when building, training, or evaluating machine learning systems. It appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.
People also read
- AUC
A number between 0.
- Backpropagation
The process that tells a neural network which internal settings caused an error and how to adjust them, working backwards through layers.
- Bayesian neural network
A probabilistic neural network that accounts for uncertainty in weights and outputs.
- Bayesian optimization
A probabilistic regression model technique for optimizing computationally expensive objective functions by instead optimizing a surrogate that quantifies the uncertainty using a Bayesian learning technique.
- classification threshold
In a binary classification, a number between 0 and 1 that converts the raw output of a logistic regression model into a prediction of either the positive class or the negative class.
- configuration
The process of assigning the initial property values used to train a model, including: hyperparameters such as: - learning rate - iterations - optimizer - loss function In machine learning projects, c
- confusion matrix
An NxN table that summarizes the number of correct and incorrect predictions that a classification model made.
- cross-entropy
A generalization of Log Loss to multi-class classification problems.
- discriminative model
A model that predicts labels from a set of one or more features.
- embedding layer
A special hidden layer that trains on a high-dimensional categorical feature to gradually learn a lower dimension embedding vector.