What is an auto-regressive model?
A model that infers a prediction based on its own previous predictions.
auto-regressive model explained in plain English
A model that infers a prediction based on its own previous predictions. For example, auto-regressive language models predict the next token based on the previously predicted tokens. All Transformer-based large language models are auto-regressive. In contrast, GAN-based image models are usually not auto-regressive since they generate an image in a single forward-pass and not iteratively in steps. However, certain image generation models are auto-regressive because they generate an image in steps.
Example
Practitioners refer to auto-regressive model when building, training, or evaluating machine learning systems. It appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.
People also read
- Attention
A mechanism that lets a model focus on the most relevant parts of its input when producing an output, weighting what matters most in context.
- autoencoder
A system that learns to extract the most important information from the input.
- depth
The sum of the following in a neural network: - the number of hidden layers - the number of output layers, which is typically 1 - the number of any embedding layers For example, a neural network with five hidden layers and one output layer has a depth of 6.
- embedding layer
A special hidden layer that trains on a high-dimensional categorical feature to gradually learn a lower dimension embedding vector.
- embedding vector
Broadly speaking, an array of floating-point numbers taken from any hidden layer that describe the inputs to that hidden layer.
- generative AI
An emerging transformative field with no formal definition.
- Long Short-Term Memory
A type of cell in a recurrent neural network used to process sequences of data in applications such as handwriting recognition, machine translation, and image captioning.
- mixture of experts
A scheme to increase neural network efficiency by using only a subset of its parameters (known as an expert) to process a given input token or example.
- Neural Architecture Search
A technique for automatically designing the architecture of a neural network.
- pooling
Reducing a matrix (or matrixes) created by an earlier convolutional layer to a smaller matrix.