Deep Learning Intermediate 1 min read

What is a vanishing gradient problem?

The tendency for the gradients of early hidden layers of some deep neural networks to become surprisingly flat (low).

vanishing gradient problem explained in plain English

The tendency for the gradients of early hidden layers of some deep neural networks to become surprisingly flat (low). Increasingly lower gradients result in increasingly smaller changes to the weights on nodes in a deep neural network, leading to little or no learning. Models suffering from the vanishing gradient problem become difficult or impossible to train. Long Short-Term Memory cells address this issue. Compare to exploding gradient problem.

Example

Practitioners refer to vanishing gradient problem when building, training, or evaluating machine learning systems. It appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.

vanishing gradient problem explained in plain English

Example

People also read