What is a sparse representation?
Storing only the position(s) of nonzero elements in a sparse feature.
Sparse representation explained in plain English
A sparse representation stores only the position(s) of nonzero elements in a sparse feature.
For example, suppose a categorical feature named `species` identifies the 36 tree species in a particular forest. Further assume that each example identifies only a single species.
You could use a one-hot vector to represent the tree species in each example. A one-hot vector would contain a single `1` (to represent the particular tree species in that example) and 35 `0`s (to represent the 35 tree species not in that example). So, if `maple` is at position 24, the one-hot representation of `maple` would be a 36-element vector with a `1` at position 24 and a `0` at every other position.
Alternatively, sparse representation would simply identify the position of the particular species. If `maple` is at position 24, then the sparse representation of `maple` would simply be: `24`
Notice that the sparse representation is much more compact than the one-hot representation.
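The comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `NUM_SPECIES`, `MAPLE_POSITION`, `one_hot`, and `to_sparse` are introduced here for the example, and positions are assumed to be zero-based.

```python
NUM_SPECIES = 36      # number of tree species in the forest (from the text)
MAPLE_POSITION = 24   # position of "maple"; zero-based indexing is an assumption


def one_hot(position, size):
    """Return a list of `size` zeros with a single 1 at `position`."""
    vector = [0] * size
    vector[position] = 1
    return vector


def to_sparse(vector):
    """Return the positions of the nonzero elements of `vector`."""
    return [i for i, value in enumerate(vector) if value != 0]


maple_one_hot = one_hot(MAPLE_POSITION, NUM_SPECIES)
maple_sparse = to_sparse(maple_one_hot)

print(len(maple_one_hot))  # 36 stored values
print(maple_sparse)        # [24], a single stored value
```

The one-hot form stores 36 numbers to convey what the sparse form conveys with one.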
Suppose each example in your model must represent the words—but not the order of those words—in an English sentence. English consists of about 170,000 words, so English is a categorical feature with about 170,000 elements. Most English sentences use an extremely tiny fraction of those 170,000 words, so the set of words in a single example is almost certainly going to be sparse data. Consider the following sentence:
My dog is a great dog.
You could use a variant of one-hot vector to represent the words in this sentence. In this variant, multiple cells in the vector can contain a nonzero value, and a cell can contain an integer other than one. Although the words "my", "is", "a", and "great" appear only once in the sentence, the word "dog" appears twice. Using this variant of one-hot vectors to represent the words in this sentence yields a 170,000-element vector in which the cells for "my", "is", "a", and "great" each hold 1, the cell for "dog" holds 2, and every other cell holds 0.
A sparse representation of the same sentence would simply list the positions of those five words along with their counts.
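As a rough sketch of the word-count case, the Python below builds both forms for a token list that matches the counts described above ("dog" twice, the other four words once). The eight-word `vocabulary` and its positions are made up for illustration. A real English vocabulary would have roughly 170,000 entries.

```python
from collections import Counter

# Hypothetical toy vocabulary; the positions are invented for this example.
vocabulary = ["a", "cat", "dog", "great", "is", "my", "tree", "walk"]
word_to_position = {word: i for i, word in enumerate(vocabulary)}

# Tokens consistent with the counts in the text: "dog" twice, the rest once.
tokens = ["my", "dog", "is", "a", "great", "dog"]

# Dense count vector: one cell per vocabulary word, mostly zeros at real scale.
dense_counts = [0] * len(vocabulary)
for token in tokens:
    dense_counts[word_to_position[token]] += 1

# Sparse representation: only the positions that hold a nonzero count.
position_counts = Counter(word_to_position[token] for token in tokens)
sparse_counts = dict(sorted(position_counts.items()))

# Expanding the sparse form recovers the dense vector, so nothing is lost.
rebuilt = [sparse_counts.get(i, 0) for i in range(len(vocabulary))]

print(dense_counts)   # [1, 0, 2, 1, 1, 1, 0, 0]
print(sparse_counts)  # {0: 1, 2: 2, 3: 1, 4: 1, 5: 1}
```

With a 170,000-word vocabulary the dense vector would still need 170,000 cells, while the sparse form would still hold only five position-count pairs.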
The term "sparse representation" confuses a lot of people because a sparse representation is itself not a sparse vector. Rather, a sparse representation is actually a dense representation of a sparse vector. The synonym "index representation" is a little clearer than "sparse representation."
See Working with categorical data in Machine Learning Crash Course for more information.
People also read
- encoder
In general, any ML system that converts from a raw, sparse, or external representation into a more processed, denser, or more internal representation.
- language model
A model that estimates the probability of a token or sequence of tokens occurring in a longer sequence of tokens.
- sparse vector
A vector whose values are mostly zeroes.
- AUC
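A number between 0.0 and 1.0 representing a binary classification model's ability to separate positive classes from negative classes.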
- Backpropagation
The process that tells a neural network which internal settings caused an error and how to adjust them, working backwards through layers.
- bag of words
A representation of the words in a phrase or passage, irrespective of order.
- Bayesian neural network
A probabilistic neural network that accounts for uncertainty in weights and outputs.
- Bayesian optimization
A probabilistic regression model technique for optimizing computationally expensive objective functions by instead optimizing a surrogate that quantifies the uncertainty using a Bayesian learning technique.
- BERT
A model architecture for text representation.
- Character N-gram F-score
A metric to evaluate machine translation models.