What is a pre-trained model?
Although this term could refer to any trained model or trained embedding vector, pre-trained model now typically refers to a trained large language model or other form of trained generative AI model.
Pre-trained model explained in plain English
Put simply, a pre-trained model is a model that has already been trained; today the term most often means a large language model or another generative AI model. See also base model and foundation model.
Example
Practitioners use the term pre-trained model when building, training, or evaluating machine learning systems, usually to mean a model whose training was completed before the current work began. The term appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.
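As a rough illustration of the "trained embedding vector" sense of the term, the sketch below reuses a small table of vectors as if it had been trained elsewhere. The table `PRETRAINED_EMBEDDINGS`, its values, and the helper names are all hypothetical and made up for this example; a real pre-trained model would be loaded from a file or model repository. The point is only that the code does no training of its own: it consumes parameters that already exist.

```python
import math

# Hypothetical "pre-trained" embedding vectors. The numbers are invented for
# illustration; in practice they would come from a model trained elsewhere.
PRETRAINED_EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.0, 0.9, 0.4],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, embeddings=PRETRAINED_EMBEDDINGS):
    """Return the other word whose vector is closest to that of `word`."""
    query = embeddings[word]
    candidates = (w for w in embeddings if w != word)
    return max(candidates, key=lambda w: cosine_similarity(query, embeddings[w]))


# No training happens here: the vectors are used exactly as given.
print(most_similar("cat"))
```

With these made-up vectors, "cat" is closer to "dog" than to "car". Swapping in vectors from an actual trained model would change the data but not the shape of the code.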
People also read
- bag of words
A representation of the words in a phrase or passage, irrespective of order.
- bidirectional language model
A language model that determines the probability that a given token is present at a given location in an excerpt of text based on the preceding and following text.
- cross-entropy
A generalization of Log Loss to multi-class classification problems.
- dimension reduction
Decreasing the number of dimensions used to represent a particular feature in a feature vector, typically by converting to an embedding vector.
- dimensions
An overloaded term with several definitions, one of which is the number of levels of coordinates in a Tensor.
- distillation
The process of compressing one model (known as the teacher) into a smaller model (known as the student) that emulates the original model's predictions as faithfully as possible.
- embedding layer
A special hidden layer that trains on a high-dimensional categorical feature to gradually learn a lower-dimensional embedding vector.
- embedding space
The d-dimensional vector space that features from a higher-dimensional vector space are mapped to.
- embedding vector
Broadly speaking, an array of floating-point numbers taken from any hidden layer that describe the inputs to that hidden layer.
- encoder
In general, any ML system that converts from a raw, sparse, or external representation into a more processed, denser, or more internal representation.