
RAG

What does it stand for? Retrieval Augmented Generation

Pronunciation: /ræɡ/

A technique that combines AI language models with external knowledge retrieval for more accurate answers.

Plain English Explanation

Retrieval Augmented Generation (RAG) is a method that improves AI responses by first searching a knowledge base for relevant information, then feeding that information to a language model along with the user's question. Instead of relying solely on what the model learned during training, RAG lets it "look things up" before answering.

This approach reduces hallucinations, keeps answers current, and allows organisations to use proprietary documents without retraining the entire model.

Analogy

RAG is like giving a student an open-book exam instead of a closed-book one. The student (language model) still needs to understand and synthesise information, but they can consult reference materials (retrieved documents) to give better answers.

How is it used?

RAG is widely used in enterprise chatbots, customer support systems, internal knowledge bases, and any application where accurate, up-to-date information from specific documents is required.

Real-world Example

A company chatbot uses RAG to answer employee questions about HR policies. When asked "How many vacation days do I get?", the system searches the employee handbook, retrieves the relevant section, and the LLM generates a clear answer based on that specific document.
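The handbook lookup above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a production design: the handbook snippets are invented sample data, word-overlap cosine similarity stands in for the embedding search a real system would typically use, and `retrieve` and `build_prompt` are hypothetical helper names. The call to a language model is left out; the sketch stops at the prompt that would be sent to it.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question (hypothetical helper)."""
    q_vec = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(q_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, passages):
    """Combine the retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Invented sample data standing in for sections of an employee handbook.
HANDBOOK = [
    "Full-time employees receive 25 vacation days per year.",
    "Expense reports must be submitted within 30 days.",
    "The office is closed on public holidays.",
]

question = "How many vacation days do I get?"
passages = retrieve(question, HANDBOOK, k=1)
prompt = build_prompt(question, passages)  # this string would be sent to the LLM
print(prompt)
```

Keeping retrieval and prompt construction as separate functions means the similarity measure can be swapped for an embedding-based search without touching the rest of the flow.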

Common Misconceptions

A common misconception is that RAG guarantees accurate answers. It does not: retrieval quality matters, and poor document chunking or irrelevant search results can still lead to incorrect answers.
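Chunking is one place where retrieval quality is decided. As a rough sketch, the Python function below splits a document into fixed-size word windows that overlap, so a sentence falling on a boundary still appears whole in at least one chunk. Fixed-size windows are only one of several chunking strategies, and the function name `chunk_words` and its default `size` and `overlap` values are assumptions chosen for illustration.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into chunks of up to `size` words, sharing `overlap` words
    between consecutive chunks (hypothetical helper for illustration)."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")

    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(str(i) for i in range(10))  # "0 1 2 ... 9"
sample_chunks = chunk_words(sample, size=4, overlap=2)
print(sample_chunks)
```

Chunks that are too small can lose the context needed to answer a question, while chunks that are too large can bury the relevant sentence among unrelated text, so both values are usually tuned against real queries.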

History

RAG was introduced by Lewis et al. in 2020. It has since become a widely adopted approach for building production AI applications that need domain-specific knowledge.

Related Terms

LLM, Embedding

References

Lewis et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.