What is a word embedding?
A word embedding represents each word in a word set as an embedding vector, that is, a vector of floating-point values between 0.0 and 1.0.
Word embedding explained in plain English
Representing each word in a word set within an embedding vector; that is, representing each word as a vector of floating-point values between 0.0 and 1.0. Words with similar meanings have more-similar representations than words with different meanings. For example, carrots, celery, and cucumbers would all have relatively similar representations, which would be very different from the representations of airplane, sunglasses, and toothpaste.
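As a rough illustration of "more-similar representations", the sketch below compares a few words by cosine similarity. The four-dimensional vectors are made-up values chosen for this example, not the output of any trained model, and the helper names `cosine_similarity` and `most_similar` are hypothetical. Real embeddings are learned from data and typically have many more dimensions.

```python
import math

# Hypothetical, hand-picked 4-dimensional embeddings with values in [0.0, 1.0].
# These numbers are assumptions for illustration only; a real system would
# learn them during training.
embeddings = {
    "carrot":     [0.90, 0.80, 0.10, 0.05],
    "celery":     [0.85, 0.90, 0.15, 0.10],
    "cucumber":   [0.80, 0.85, 0.20, 0.05],
    "airplane":   [0.10, 0.05, 0.90, 0.80],
    "sunglasses": [0.05, 0.20, 0.30, 0.90],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(word, table):
    """Return the other word in table whose vector is closest to word's."""
    target = table[word]
    candidates = (w for w in table if w != word)
    return max(candidates, key=lambda w: cosine_similarity(target, table[w]))


if __name__ == "__main__":
    for other in ("celery", "cucumber", "airplane", "sunglasses"):
        score = cosine_similarity(embeddings["carrot"], embeddings[other])
        print(f"carrot vs {other}: {score:.3f}")
    print("closest to carrot:", most_similar("carrot", embeddings))
```

With these toy values, the vegetables score close to each other and far from airplane and sunglasses, which is the behavior the definition describes. Cosine similarity is only one common choice for comparing vectors.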
Example
The term word embedding comes up when practitioners build, train, or evaluate machine learning systems. It also appears in research papers, product documentation, and technical discussions of AI capabilities and limitations.
People also read
- bag of words
A representation of the words in a phrase or passage, irrespective of order.
- encoder
In general, any ML system that converts from a raw, sparse, or external representation into a more processed, denser, or more internal representation.
- language model
A model that estimates the probability of a token or sequence of tokens occurring in a longer sequence of tokens.
- automatic evaluation
Using software to judge the quality of a model's output.
- BERT
A model architecture for text representation.
- bidirectional language model
A language model that determines the probability that a given token is present at a given location in an excerpt of text based on the preceding and following text.
- bigram
An N-gram in which N=2.
- BLEU
A metric between 0.0 and 1.0 for evaluating machine translations.
- BLEURT
A metric for evaluating machine translations from one language to another, particularly to and from English.
- Character N-gram F-score
A metric to evaluate machine translation models.
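Two of the related terms above, bag of words and bigram, are simple enough to sketch in a few lines. The code below is a minimal illustration that assumes lowercase whitespace tokenization. The function names `tokenize`, `bag_of_words`, and `bigrams` are hypothetical and do not come from any particular library.

```python
from collections import Counter


def tokenize(text):
    """Split text into lowercase tokens on whitespace (a simplifying assumption)."""
    return text.lower().split()


def bag_of_words(text):
    """Count each token in text, ignoring the order in which tokens appear."""
    return Counter(tokenize(text))


def bigrams(text):
    """Return the ordered pairs of adjacent tokens (N-grams with N=2)."""
    tokens = tokenize(text)
    return list(zip(tokens, tokens[1:]))


if __name__ == "__main__":
    # Same words, different order: the bags are equal, the bigrams are not.
    print(bag_of_words("the dog jumps") == bag_of_words("jumps the dog"))
    print(bigrams("the dog jumps"))
    print(bigrams("jumps the dog"))
```

A bag of words discards order, so reordered phrases produce identical counts, while bigrams keep local order and therefore differ.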