AIExplainer

What is retrieval-augmented generation?

A technique for improving the quality of large language model (LLM) output by grounding it with sources of knowledge retrieved after the model was trained.

Retrieval-augmented generation (RAG) improves the accuracy of LLM responses by giving the trained LLM access to information retrieved from trusted knowledge bases or documents. Common motivations for using retrieval-augmented generation include:

- Increasing the factual accuracy of a model's generated responses.
- Giving the model access to knowledge it was not trained on.
- Changing the knowledge that the model uses.
- Enabling the model to cite sources.

For example, suppose that a chemistry app uses the PaLM API to generate summaries related to user queries. When the app's backend receives a query, the backend:

1. Searches for ("retrieves") data that's relevant to the user's query.
2. Appends ("augments") the relevant chemistry data to the user's query.
3. Instructs the LLM to create a summary based on the appended data.
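The three backend steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular app or API implements RAG: the in-memory `KNOWLEDGE_BASE`, the word-overlap scoring in `retrieve`, the prompt wording in `augment`, and the `fake_llm` stand-in are all hypothetical. A real backend would typically query a search index or vector store and call an actual LLM client in place of `fake_llm`.

```python
from typing import Callable, List, Set

# Hypothetical in-memory knowledge base (assumption for illustration only).
KNOWLEDGE_BASE: List[str] = [
    "Sodium chloride (NaCl) is an ionic compound of sodium and chlorine.",
    "Water (H2O) is a polar molecule made of hydrogen and oxygen.",
    "Methane (CH4) is the simplest alkane.",
]


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of punctuation-stripped words."""
    return {word.strip(".,?!()").lower() for word in text.split()} - {""}


def retrieve(query: str, documents: List[str], top_k: int = 1) -> List[str]:
    """Step 1: return the top_k documents sharing the most words with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def augment(query: str, passages: List[str]) -> str:
    """Step 2: append the retrieved passages to the user's query."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return (
        f"Question: {query}\n\n"
        f"Relevant data:\n{context}\n\n"
        "Write a summary based only on the relevant data above."
    )


def answer(query: str, llm: Callable[[str], str],
           documents: List[str] = KNOWLEDGE_BASE) -> str:
    """Step 3: send the augmented prompt to the LLM and return its output."""
    passages = retrieve(query, documents)
    prompt = augment(query, passages)
    return llm(prompt)


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call (hypothetical); it only reports the prompt size."""
    return f"[summary of a {len(prompt)}-character prompt]"


if __name__ == "__main__":
    print(answer("What is methane?", fake_llm))
```

Because `answer` takes the LLM as a plain callable, the retrieval and prompt-building logic can be tested without network access, and the knowledge base can be swapped out without retraining or changing the model.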

Practitioners refer to retrieval-augmented generation when building, training, or evaluating machine learning systems. It appears in research papers, product documentation, and technical discussions about AI capabilities and limitations.