What is a batch size?
The number of examples in a batch.
Batch size explained in plain English
The number of examples in a batch. For instance, if the batch size is 100, then the model processes 100 examples per iteration.
The following are popular batch size strategies:
- Stochastic Gradient Descent (SGD), in which the batch size is 1.
- Full batch, in which the batch size is the number of examples in the entire training set. For instance, if the training set contains a million examples, then the batch size would be a million examples. Full batch is usually an inefficient strategy.
- Mini-batch, in which the batch size is usually between 10 and 1000. Mini-batch is usually the most efficient strategy.
See the following for more information:
- Production ML systems: Static versus dynamic inference in Machine Learning Crash Course.
- Deep Learning Tuning Playbook.
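To make the three batch size strategies concrete, here is a minimal Python sketch that splits a toy training set into batches and counts how many iterations one pass over the data takes. The helper names (`iterate_batches`, `iterations_per_epoch`) and the 1,000-example training set are assumptions made for illustration; they do not come from any particular library.

```python
import math


def iterate_batches(examples, batch_size):
    """Yield consecutive slices of `examples` holding at most `batch_size` items each."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


def iterations_per_epoch(num_examples, batch_size):
    """Return how many iterations are needed to process every example once."""
    return math.ceil(num_examples / batch_size)


# Hypothetical toy training set of 1,000 examples (plain integers stand in for real data).
training_set = list(range(1_000))

# Batch sizes for the three strategies described above.
strategies = {
    "SGD": 1,
    "mini-batch": 100,
    "full batch": len(training_set),
}

for name, batch_size in strategies.items():
    num_batches = sum(1 for _ in iterate_batches(training_set, batch_size))
    print(f"{name}: batch size = {batch_size}, iterations per pass = {num_batches}")
```

In this sketch, a smaller batch size means more iterations per pass over the training set, and full batch needs only one iteration, though each iteration must process every example.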
Example
Practitioners refer to batch size when building, training, or evaluating machine learning systems. It appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.
People also read
- activation function
A function that enables neural networks to learn nonlinear (complex) relationships between features and the label.
- backpropagation
The process that tells a neural network which internal settings caused an error and how to adjust them, working backwards through layers.
- batch
The set of examples used in one training iteration.
- batch normalization
Normalizing the input or output of the activation functions in a hidden layer.
- Bayesian neural network
A probabilistic neural network that accounts for uncertainty in weights and outputs.
- co-adaptation
An undesirable behavior in which neurons predict patterns in training data by relying almost exclusively on outputs of specific other neurons instead of relying on the network's behavior as a whole.
- convergence
A state reached when loss values change very little or not at all with each iteration.
- deep model
A neural network containing more than one hidden layer.
- depth
The sum of the following in a neural network: the number of hidden layers; the number of output layers, which is typically 1; and the number of any embedding layers. For example, a neural network with five hidden layers and one output layer has a depth of 6.
- dropout regularization
A form of regularization useful in training neural networks, in which a random selection of units in a network layer is removed for a single gradient step.