What is a sparse feature?
A feature whose values are predominantly zero or empty.
Sparse feature explained in plain English
A feature whose values are predominantly zero or empty. For example, a feature containing a single 1 value and a million 0 values is sparse. In contrast, a dense feature has values that are predominantly not zero or empty.
In machine learning, a surprising number of features are sparse, and categorical features usually are. For example, of the 300 possible tree species in a forest, a single example might identify just a maple tree. Or, of the millions of possible videos in a video library, a single example might identify just "Casablanca."
In a model, you typically represent sparse features with one-hot encoding. If the one-hot encoding is large, you might put an embedding layer on top of it for greater efficiency.
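The one-hot and embedding ideas above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the vocabulary size of 300 comes from the tree-species example, while the embedding width, the index assigned to "maple", and all function names are assumptions made up for the sketch. It shows that a one-hot vector is almost entirely zeros, and that multiplying it by an embedding table gives the same result as reading one row of the table, which is why an embedding layer is the cheaper option when the encoding is large.

```python
import random

NUM_SPECIES = 300      # from the tree-species example in the text
EMBEDDING_DIM = 4      # assumed width, chosen only for illustration
MAPLE_INDEX = 42       # hypothetical position of "maple" in the vocabulary


def one_hot(index, size):
    """Return a one-hot vector: all zeros except a single 1 at `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec


def sparsity(vec):
    """Return the fraction of entries in `vec` that are zero."""
    return sum(1 for v in vec if v == 0) / len(vec)


def embed_via_matmul(vec, table):
    """Multiply a one-hot vector by the embedding table (touches every row)."""
    dim = len(table[0])
    return [sum(vec[i] * table[i][d] for i in range(len(vec))) for d in range(dim)]


def embed_via_lookup(index, table):
    """Read the matching row directly (touches one row)."""
    return list(table[index])


# Randomly initialized table; in a real model these values would be learned.
rng = random.Random(0)
embedding_table = [
    [rng.uniform(-1.0, 1.0) for _ in range(EMBEDDING_DIM)]
    for _ in range(NUM_SPECIES)
]

maple_one_hot = one_hot(MAPLE_INDEX, NUM_SPECIES)
dense_from_matmul = embed_via_matmul(maple_one_hot, embedding_table)
dense_from_lookup = embed_via_lookup(MAPLE_INDEX, embedding_table)

print("fraction of zeros in one-hot vector:", sparsity(maple_one_hot))
print("matmul and lookup agree:", dense_from_matmul == dense_from_lookup)
```

In practice, a library would store only the index (or a sparse matrix) rather than the full one-hot vector, and the embedding table would be trained along with the rest of the model.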
Example
Practitioners refer to sparse features when building, training, or evaluating machine learning systems. The term appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.
People also read
- BERT
A model architecture for text representation.
- Character N-gram F-score
A metric to evaluate machine translation models.
- Embedding
A numerical representation of text, images, or other data that captures semantic meaning.
- encoder
In general, any ML system that converts from a raw, sparse, or external representation into a more processed, denser, or more internal representation.
- language model
A model that estimates the probability of a token or sequence of tokens occurring in a longer sequence of tokens.
- retrieval-augmented generation
A technique for improving the quality of large language model (LLM) output by grounding it with sources of knowledge retrieved after the model was trained.
- rotational invariance
In an image classification problem, an algorithm's ability to successfully classify images even when the orientation of the image changes.
- ROUGE
A family of metrics that evaluate automatic summarization and machine translation models.
- ROUGE-L
A member of the ROUGE family focused on the length of the longest common subsequence in the reference text and generated text.
- ROUGE-N
A set of metrics within the ROUGE family that compares the shared N-grams of a certain size in the reference text and generated text.