Retrieval-Augmented Generation (RAG)

RAG combines a language model with document retrieval. Instead of relying only on what the model learned during training, a RAG workflow fetches relevant information from your documents and passes it to the model alongside the question.

Typical use cases include:

  • Answering questions about internal documentation
  • Building FAQ systems from product or support content
  • Querying large document collections in natural language
  • Grounding model responses in up-to-date or domain-specific data

How RAG works

A RAG pipeline has three steps: store documents as vectors, retrieve the most relevant passages for a question, and inject them into the prompt.
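The three steps above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the product's actual pipeline: `embed` is a toy word-count stand-in for a real embedding model, the in-memory `index` list stands in for a vector store, and the sample documents, the `retrieve` and `build_prompt` helpers, and the prompt wording are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase word counts instead of a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Step 1: store documents as vectors (hypothetical FAQ snippets).
documents = [
    "Refunds are available within 30 days of purchase.",
    "Standard shipping takes 3 to 5 business days.",
    "Contact support by email to reset your password.",
]
index = [(doc, embed(doc)) for doc in documents]


def retrieve(question: str, k: int = 1) -> list[str]:
    """Step 2: return the k passages most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(question: str, k: int = 1) -> str:
    """Step 3: inject the retrieved passages into the prompt sent to the model."""
    context = "\n".join(f"- {passage}" for passage in retrieve(question, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


print(build_prompt("How long does shipping take?"))
```

In a real workflow the word-count vectors would be replaced by embeddings from a model, and the resulting prompt would be sent to the language model rather than printed.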

  • What is RAG?: How retrieval-augmented generation works and when to use it.

Build a RAG pipeline

Tutorial

Follow a complete example that builds a product FAQ assistant using a RAG pipeline: