What is an N-gram?
An ordered sequence of N words.
N-gram explained in plain English
An N-gram is an ordered sequence of N words. For example, "truly madly" is a 2-gram. Because order is relevant, "madly truly" is a different 2-gram from "truly madly".

| N | Examples |
| --- | --- |
| 2 | to go, go to, eat lunch, eat dinner |
| 3 | ate too much, happily ever after, the bell tolls |
| 4 | walk in the park, dust in the wind, the boy ate lentils |

Many natural language understanding (NLU) models rely on N-grams to predict the next word that the user will type or say. For example, suppose a user typed "happily ever". An NLU model based on trigrams would likely predict that the user will next type the word "after".

Contrast N-grams with a bag of words, which is an unordered set of words.

See Large language models in Machine Learning Crash Course for more information.
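The trigram prediction described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular NLU model is implemented: the function names (`ngrams`, `train_trigram_model`, `predict_next`) and the toy corpus are assumptions made up for this example, and tokenization is simplified to splitting on whitespace.

```python
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Return every ordered run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def train_trigram_model(sentences):
    """Count which word follows each two-word context."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()  # naive whitespace tokenization
        for w1, w2, w3 in ngrams(tokens, 3):
            followers[(w1, w2)][w3] += 1
    return followers


def predict_next(followers, w1, w2):
    """Return the most frequent follower of (w1, w2), or None if unseen."""
    counts = followers.get((w1, w2))
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Toy corpus, invented for illustration only.
corpus = [
    "they lived happily ever after",
    "and they all lived happily ever after",
    "she was happily ever so busy",
]

model = train_trigram_model(corpus)
print(ngrams("truly madly".split(), 2))        # [('truly', 'madly')]
print(predict_next(model, "happily", "ever"))  # after

# Order matters for N-grams but not for a bag of words.
print(ngrams(["truly", "madly"], 2) == ngrams(["madly", "truly"], 2))  # False
print(Counter(["truly", "madly"]) == Counter(["madly", "truly"]))      # True
```

The last two lines show the contrast with a bag of words: the two 2-grams differ because their word order differs, while the word counts (here a `Counter` standing in for a bag of words) are identical.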
Example
Practitioners use the term N-gram when building, training, or evaluating machine learning systems. It appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.
People also read
- automatic evaluation
Using software to judge the quality of a model's output.
- bag of words
A representation of the words in a phrase or passage, irrespective of order.
- BERT
A model architecture for text representation.
- bigram
An N-gram in which N=2.
- BLEU
A metric between 0.0 and 1.0 for evaluating machine translations.
- BLEURT
A metric for evaluating machine translations from one language to another, particularly to and from English.
- Character N-gram F-score
A metric to evaluate machine translation models.
- constituency parsing
Dividing a sentence into smaller grammatical structures ("constituents").
- crash blossom
A sentence or phrase with an ambiguous meaning.
- decoder
In general, any ML system that converts from a processed, dense, or internal representation to a more raw, sparse, or external representation.