Retrieval-Augmented Generation (RAG)

Quick Answer

A technique that retrieves relevant documents and provides them to an LLM for grounded generation.

RAG combines document retrieval with language generation. When answering a question, the system first retrieves relevant documents (typically with semantic search over embeddings), then passes them to the LLM as context, and the LLM generates an answer grounded in that context. RAG can substantially reduce hallucination and improve factuality, though it does not eliminate errors, since the model can still misread or ignore the retrieved text. It also lets an application work with proprietary or real-time data that the model was never trained on. It is often the most practical approach for building knowledge-intensive applications because the knowledge base can be updated without retraining the model. RAG quality depends on retrieval quality, document chunking, and prompt design. Many LLM applications that answer questions over documents use some form of RAG.
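The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the `embed` function is a toy word-count stand-in for a real embedding model, the sample documents are invented, and every function name here (`chunk`, `embed`, `cosine`, `retrieve`, `build_prompt`) is hypothetical. The sketch stops at building the prompt; sending it to an LLM is left out.

```python
import math
import re
from collections import Counter


def chunk(text: str, max_words: int = 40) -> list[str]:
    """Split a document into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Assemble a prompt that asks the model to answer from the context only."""
    numbered = "\n".join(f"[{i}] {c}" for i, c in enumerate(context, start=1))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{numbered}\n\nQuestion: {question}\nAnswer:"
    )


# Invented sample knowledge base, for illustration only.
documents = [
    "The refund policy allows returns within 30 days of purchase.",
    "Support is available by email on weekdays from 9am to 5pm.",
    "Shipping to Canada takes five to seven business days.",
]
chunks = [piece for doc in documents for piece in chunk(doc)]
question = "What is the refund policy for returns?"
top = retrieve(question, chunks, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Each of the three quality factors named above maps to one function: `chunk` controls chunking, `retrieve` controls retrieval quality, and `build_prompt` controls prompt design. In a real system, `embed` would be replaced by an embedding model and the ranked search by a vector index, and the resulting prompt would be sent to the LLM.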

Last verified: 2026-04-08
