What is retrieval-augmented generation?
A technique for improving the quality of large language model (LLM) output by grounding it with sources of knowledge retrieved after the model was trained.
Retrieval-augmented generation explained in plain English
Retrieval-augmented generation (RAG) improves the accuracy of LLM responses by giving the trained LLM access to information retrieved from trusted knowledge bases or documents.
Common motivations to use retrieval-augmented generation include:
- Increasing the factual accuracy of a model's generated responses.
- Giving the model access to knowledge it was not trained on.
- Changing the knowledge that the model uses.
- Enabling the model to cite sources.
For example, suppose that a chemistry app uses the PaLM API to generate summaries related to user queries. When the app's backend receives a query, the backend:
1. Searches for ("retrieves") data that's relevant to the user's query.
2. Appends ("augments") the relevant chemistry data to the user's query.
3. Instructs the LLM to create a summary based on the appended data.
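The retrieve, augment, and generate steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular product implements RAG: the in-memory `KNOWLEDGE_BASE`, the word-overlap retriever, and the `fake_llm` stand-in are all hypothetical, and a real backend would use a search index or vector store and call an actual LLM API instead.

```python
import re

# Hypothetical in-memory knowledge base (assumption for illustration);
# a real app would query a document store or search index.
KNOWLEDGE_BASE = [
    "Sodium chloride is an ionic compound with the formula NaCl.",
    "Water has the chemical formula H2O and boils at 100 degrees Celsius at sea level.",
    "Benzene is an aromatic hydrocarbon with the formula C6H6.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, documents, top_k=1):
    """Step 1: rank documents by word overlap with the query (a toy retriever)."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share no words with the query.
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(query, passages):
    """Step 2: append the retrieved passages to the query, together with
    the instruction that tells the model to rely on them."""
    sources = "\n".join(f"- {passage}" for passage in passages)
    return (
        "Using only the sources below, write a short summary "
        "that answers the question.\n"
        f"Sources:\n{sources}\n"
        f"Question: {query}"
    )


def generate(prompt, llm):
    """Step 3: send the augmented prompt to the model."""
    return llm(prompt)


def fake_llm(prompt):
    """Hypothetical stand-in for a real LLM call, so the sketch runs offline."""
    return f"[model output for a {len(prompt)}-character prompt]"


if __name__ == "__main__":
    question = "What is the formula of benzene?"
    passages = retrieve(question, KNOWLEDGE_BASE)
    prompt = build_prompt(question, passages)
    print(prompt)
    print(generate(prompt, fake_llm))
```

Keeping retrieval, prompt construction, and generation in separate functions mirrors the three steps and makes it easy to swap the toy retriever for a real one without touching the rest.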
Example
Practitioners refer to retrieval-augmented generation when building, training, or evaluating machine learning systems. It appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.
People also read
- sparse feature
A feature whose values are predominately zero or empty.
- average precision at k
A metric for summarizing a model's performance on a single prompt that generates ranked results, such as a numbered list of book recommendations.
- Backpropagation
The process that tells a neural network which internal settings caused an error and how to adjust them, working backwards through layers.
- BERT
A model architecture for text representation.
- Character N-gram F-score
A metric to evaluate machine translation models.
- citation precision
A metric that answers the following question: What percentage of the citations in an LLM's response were actually correct and supportive?
- citation recall
A metric that answers the following question: What percentage of the source documents the LLM used to compose its response are actually cited in the response?
- clipping
A technique for handling outliers, for example by reducing feature values that are greater than a maximum threshold down to that maximum threshold.
- cross-entropy
A generalization of Log Loss to multi-class classification problems.
- denoising
A common approach to self-supervised learning.