What is a skip-gram?
An n-gram which may omit (or "skip") words from the original context, meaning the N words might not have been originally adjacent.
Skip-gram explained in plain English
A "k-skip-n-gram" is an n-gram for which up to k words may have been skipped. For example, "the quick brown fox" has the following possible 2-grams:
- "the quick"
- "quick brown"
- "brown fox"
A "1-skip-2-gram" is a pair of words that have at most 1 word between them. Therefore, "the quick brown fox" has the following 1-skip-2-grams:
- "the brown"
- "quick fox"
In addition, all the 2-grams are also 1-skip-2-grams, since skipping zero words is allowed.
Skip-grams are useful for understanding more of a word's surrounding context. In the example, "fox" was directly associated with "quick" in the set of 1-skip-2-grams, but not in the set of 2-grams. Skip-grams help train word embedding models.
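The enumeration above can be reproduced with a short Python sketch. The function name `skip_grams` and its parameters are illustrative, not taken from any library, and the sketch assumes that "up to k words skipped" means at most k skipped words in total across the whole n-gram. For 2-grams this matches the "at most k words between them" reading used in the example.

```python
from itertools import combinations


def skip_grams(tokens, n, k):
    """Return all k-skip-n-grams of `tokens` as tuples.

    Assumption: at most k words are skipped in total within one n-gram.
    """
    grams = []
    for start in range(len(tokens) - n + 1):
        # The first word is fixed at `start`. The remaining n-1 words are
        # chosen from the next n-1+k positions, so the n-gram spans at most
        # n+k tokens and therefore skips at most k of them.
        window = range(start + 1, min(start + n + k, len(tokens)))
        for rest in combinations(window, n - 1):
            grams.append(tuple(tokens[i] for i in (start, *rest)))
    return grams


tokens = "the quick brown fox".split()
bigrams = skip_grams(tokens, n=2, k=0)
one_skip_bigrams = skip_grams(tokens, n=2, k=1)
```

Here `bigrams` holds the three plain 2-grams, while `one_skip_bigrams` holds those three plus ("the", "brown") and ("quick", "fox"), matching the lists above.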
Example
Practitioners use skip-grams when building, training, or evaluating machine learning systems, most often to help train word embedding models. The term appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.
People also read
- automatic evaluation
Using software to judge the quality of a model's output.
- bag of words
A representation of the words in a phrase or passage, irrespective of order.
- BERT
A model architecture for text representation.
- bigram
An N-gram in which N=2.
- BLEU
A metric between 0.0 and 1.0 for evaluating machine translations.
- BLEURT
A metric for evaluating machine translations from one language to another, particularly to and from English.
- Character N-gram F-score
A metric to evaluate machine translation models.
- constituency parsing
Dividing a sentence into smaller grammatical structures ("constituents").
- crash blossom
A sentence or phrase with an ambiguous meaning.
- decoder
In general, any ML system that converts from a processed, dense, or internal representation to a more raw, sparse, or external representation.