What Is Retrieval-Augmented Generation (RAG)?

Retrieval-augmented generation (RAG) is a natural language processing technique that combines retrieval-based and generative models. Information retrieved from a database or knowledge base is added to the model's input to improve the context and accuracy of the generated text.

RAG (Retrieval-Augmented Generation) FAQs

RAG, or retrieval-augmented generation, is an AI technique that enhances large language models (LLMs) by allowing them to retrieve relevant information from external knowledge bases before generating a response.

RAG mitigates LLM "hallucinations" and provides more accurate, up-to-date, and contextually relevant answers by grounding the LLM's generation in factual, retrieved data.

A RAG system typically involves a retriever (to find relevant documents/text) and a generator (an LLM that then uses the retrieved information to form a response).
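The retriever-plus-generator split described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production design: the knowledge base is a hypothetical in-memory list, retrieval uses a toy word-overlap score where a real system would more likely use a search index or vector store, and `generate` is a placeholder for an actual LLM call. All names here (`KNOWLEDGE_BASE`, `retrieve`, `build_prompt`, `generate`, `answer`) are invented for the example.

```python
import re

# Hypothetical in-memory knowledge base; a real system would query a
# database, search index, or vector store instead.
KNOWLEDGE_BASE = [
    "Refunds are accepted within 30 days of purchase.",
    "Support is available Monday through Friday, 9am to 5pm.",
    "Premium accounts include priority phone support.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, documents, top_k=2):
    """Retriever: rank documents by shared words with the query (toy scoring)."""
    query_tokens = set(tokenize(query))
    scored = []
    for doc in documents:
        overlap = len(query_tokens & set(tokenize(doc)))
        scored.append((overlap, doc))
    # Highest overlap first; documents with no overlap are dropped.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(query, passages):
    """Place the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
    )


def generate(prompt):
    """Generator: stand-in for an LLM call; it only echoes its prompt."""
    return f"[LLM response grounded in]\n{prompt}"


def answer(query):
    """Run retrieval first, then generation, as in a basic RAG pipeline."""
    passages = retrieve(query, KNOWLEDGE_BASE)
    return generate(build_prompt(query, passages))


if __name__ == "__main__":
    print(answer("When are refunds accepted?"))
```

Keeping retrieval and generation as separate functions mirrors the two components: either one can be swapped out (a better ranking method, a different model) without touching the other.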

RAG is useful when LLMs need access to specialized, proprietary, or frequently updated information not present in their training data, such as company policies or recent news.

By referencing external, verifiable sources, RAG increases the transparency and trustworthiness of AI-generated content, allowing users to cross-reference information.
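One way cross-referencing could work in practice is to keep a source identifier alongside each retrieved passage and list those sources with the answer. The sketch below assumes a hypothetical `Passage` record and made-up file names. It shows only the bookkeeping, not how any particular product implements citations.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # hypothetical identifier, such as a document title or path
    text: str


def format_context(passages):
    """Number each passage so the generator can cite it as [1], [2], ..."""
    return "\n".join(f"[{i}] {p.text}" for i, p in enumerate(passages, start=1))


def format_sources(passages):
    """List the source behind each citation number."""
    return "\n".join(f"[{i}] {p.source}" for i, p in enumerate(passages, start=1))


def attach_sources(answer_text, passages):
    """Append a source list so readers can check the cited passages."""
    return f"{answer_text}\n\nSources:\n{format_sources(passages)}"


if __name__ == "__main__":
    retrieved = [
        Passage("returns-policy.md", "Refunds are accepted within 30 days."),
        Passage("support-hours.md", "Support is open on weekdays."),
    ]
    print(format_context(retrieved))
    print(attach_sources("Refunds are accepted within 30 days [1].", retrieved))
```

The numbered context goes into the prompt, and the same numbering is reused in the source list, so a citation such as [1] in the answer maps back to a specific document.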

RAG can retrieve information from various external sources, including databases, documents, web pages, internal knowledge bases, and real-time data feeds.

RAG helps address challenges such as providing current information, reducing factual errors, ensuring domain-specific accuracy, and limiting the cost of repeatedly retraining LLMs.