AIExplainer

stochastic gradient descent

A gradient descent algorithm in which the batch size is one. In other words, stochastic gradient descent (SGD) computes each parameter update from a single example chosen uniformly at random from the training set, rather than from the whole set. See "Linear regression: Hyperparameters" in Machine Learning Crash Course for more information.
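The definition can be illustrated with a minimal sketch, assuming a one-feature linear model and a squared-error loss. The function name `sgd_linear_regression`, the toy data, the learning rate, and the step count are all hypothetical choices made for this example, not part of any library.

```python
import random


def sgd_linear_regression(examples, learning_rate=0.02, steps=5000, seed=0):
    """Fit y ~ w * x + b with a batch size of one (illustrative sketch)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Batch size one: draw a single example uniformly at random.
        x, y = rng.choice(examples)
        # Prediction error on that one example.
        error = (w * x + b) - y
        # Gradient of the loss 0.5 * error**2 with respect to w and b.
        w -= learning_rate * error * x
        b -= learning_rate * error
    return w, b


# Hypothetical noise-free toy data generated from y = 2x + 1.
examples = [(x, 2.0 * x + 1.0) for x in [0.0, 1.0, 2.0, 3.0, 4.0]]
w, b = sgd_linear_regression(examples)
print(w, b)
```

On this toy data the learned `w` and `b` should approach 2 and 1. Each individual update is noisy because it sees only one example, but every step is cheap, which is the trade-off the batch size controls.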

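Practitioners refer to stochastic gradient descent when building, training, or evaluating machine learning systems, and the term appears in research papers, product documentation, and technical discussions of how models are trained. In practice it is often used loosely to include mini-batch variants, in which each update uses a small random batch rather than a single example.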