What does BERT stand for?

BERT stands for Bidirectional Encoder Representations from Transformers. It is a model architecture for text representation.


What is BERT?

A model architecture for text representation.

Stands for: Bidirectional Encoder Representations from Transformers

BERT explained in plain English

BERT is a model architecture for text representation. A trained BERT model can act as part of a larger model for text classification or other ML tasks.

BERT has the following characteristics:

- Uses the Transformer architecture, and therefore relies on self-attention.
- Uses the encoder part of the Transformer. The encoder's job is to produce good text representations, rather than to perform a specific task like classification.
- Is bidirectional: the representation of each token draws on context from both its left and its right.
- Uses masking for unsupervised training: some input tokens are hidden, and the model learns to predict them from the surrounding text.

BERT's variants include:

- ALBERT, which is an acronym for A Lite BERT.
- LaBSE.

See "Open Sourcing BERT: State-of-the-Art Pre-training for Natural Language Processing" for an overview of BERT.
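To make the masking idea concrete, here is a minimal sketch in plain Python of how training examples for masked-token prediction might be prepared. It is an illustration, not BERT's actual preprocessing code. The function name `mask_tokens`, the whitespace tokenization, and the toy vocabulary are all assumptions made for this example. The 15% selection rate and the 80/10/10 replacement split follow the original BERT paper rather than this glossary entry, and each token is selected independently here as a simplification.

```python
import random

MASK = "[MASK]"  # placeholder token that hides the original word


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Return (inputs, labels) for one masked-prediction example.

    labels[i] holds the original token wherever the model must make a
    prediction, and None everywhere else. Hypothetical helper for
    illustration only.
    """
    rng = random.Random(seed)
    inputs, labels = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            # This position is selected: the model must recover `token`.
            labels.append(token)
            roll = rng.random()
            if roll < 0.8:
                inputs.append(MASK)               # usually hide the token
            elif roll < 0.9:
                inputs.append(rng.choice(vocab))  # sometimes swap in a random token
            else:
                inputs.append(token)              # sometimes leave it unchanged
        else:
            # Not selected: the token passes through and has no label.
            inputs.append(token)
            labels.append(None)
    return inputs, labels


tokens = "the cat sat on the mat".split()
vocab = sorted(set(tokens))  # toy vocabulary, assumed for this sketch
inputs, labels = mask_tokens(tokens, vocab, mask_prob=0.5, seed=1)
print(inputs)
print(labels)
```

Because the encoder is bidirectional, a prediction for a hidden position can use the visible tokens on both sides of it. This is why masking, rather than next-word prediction, is the natural training signal for this kind of model.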

Example

Practitioners refer to BERT when building, training, or evaluating machine learning systems. The term appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.
