AI Basics Machine Learning Deep Learning Mathematics Advanced

Gradient Descent

The method by which a model gradually improves by making small adjustments after each mistake, moving toward better performance.

Plain English Explanation

Gradient descent is the method by which a machine learning system gradually improves. It makes small adjustments after each mistake, slowly moving toward better performance — like finding the bottom of a valley by always walking downhill.

Variants of this algorithm underpin virtually all modern model training.

Analogy

Gradient descent is like descending a foggy hill by always feeling which direction slopes downward and taking one careful step at a time. You cannot see the bottom, but each step brings you closer.

How is it used?

Every major AI model — from image classifiers to ChatGPT — relies on gradient descent (or a variant) during training. It is the engine behind how models learn from errors.

Real-world Example

When engineers train a new LLM over weeks on thousands of GPUs, gradient descent is the optimisation process adjusting billions of parameters.

Common Misconceptions

Gradient descent does not guarantee the best possible solution — it finds a good local minimum, and learning rate choices matter enormously.