Retrieval-Augmented Generation (RAG) - AI Glossary

Retrieval-Augmented Generation (RAG) is the most common production pattern for grounding LLM responses in domain-specific knowledge. At query time, the user query is converted to an embedding, that embedding is matched against a vector store of pre-indexed document chunks, the top matches are injected into the prompt as context, and the model generates a response that cites or paraphrases the retrieved chunks.
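
The retrieval flow described above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector store, and all function names and sample chunks are hypothetical. The final generation step (sending the prompt to a model) is left out.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a sparse term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(chunks: list[str]) -> list[tuple[str, Counter]]:
    """Pre-index the document chunks (done once, ahead of query time)."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(query: str, index: list[tuple[str, Counter]], k: int = 2) -> list[str]:
    """Embed the query and return the k most similar chunks."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query: str, context: list[str]) -> str:
    """Inject the retrieved chunks into the prompt as numbered context."""
    numbered = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(context))
    return (
        "Answer using only the context below and cite chunk numbers.\n\n"
        f"Context:\n{numbered}\n\nQuestion: {query}"
    )


# Hypothetical sample corpus, for illustration only.
CHUNKS = [
    "Refunds are issued within 14 days of a return request.",
    "Support is available by email on weekdays.",
    "Return requests must include the original order number.",
]
INDEX = build_index(CHUNKS)

if __name__ == "__main__":
    question = "How many days do refunds take?"
    print(build_prompt(question, retrieve(question, INDEX, k=1)))
```

In a real deployment the prompt returned by `build_prompt` would be sent to the model, and the numbered chunks give it something concrete to cite.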

RAG introduces two new attack surfaces beyond the LLM itself: the retriever (vector store + similarity scoring + chunking pipeline) and the corpus (the documents being indexed). Corpus tainting attacks plant adversarial content in indexed documents so the retriever surfaces it when triggered. Cross-tenant retrieval errors leak data when a vector store hosts indexes for multiple customers or business units without strict namespace isolation.
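
One way to picture strict namespace isolation is a store that partitions chunks by tenant and only ever scans the caller's partition. The sketch below assumes a hypothetical in-memory `TenantScopedStore`; real vector stores expose their own namespace or metadata-filter mechanisms, and the class, field names, and vectors here are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    tenant_id: str              # owner of the chunk (hypothetical field name)
    text: str
    vector: tuple[float, ...]   # precomputed embedding


def dot(a, b) -> float:
    """Dot-product similarity between two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


class TenantScopedStore:
    """Keeps one partition per tenant so a search cannot cross tenants."""

    def __init__(self) -> None:
        self._by_tenant: dict[str, list[Chunk]] = {}

    def add(self, chunk: Chunk) -> None:
        self._by_tenant.setdefault(chunk.tenant_id, []).append(chunk)

    def search(self, tenant_id: str, query_vector, k: int = 3) -> list[Chunk]:
        # Only the caller's partition is scanned; similarity scores from
        # other tenants' chunks are never computed, let alone returned.
        candidates = self._by_tenant.get(tenant_id, [])
        ranked = sorted(
            candidates, key=lambda c: dot(query_vector, c.vector), reverse=True
        )
        return ranked[:k]


if __name__ == "__main__":
    store = TenantScopedStore()
    store.add(Chunk("acme", "acme pricing sheet", (1.0, 0.0)))
    store.add(Chunk("globex", "globex onboarding guide", (1.0, 0.0)))
    # Both chunks match the query equally well; only globex's is visible.
    print([c.text for c in store.search("globex", (1.0, 0.0))])
```

The design choice worth noting is that `tenant_id` is a required argument taken from the authenticated session, not an optional filter the caller can omit, so a forgotten filter fails closed and returns nothing.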

RAG security testing covers thirteen automated patterns, including canary tokens, cross-tenant probes, embedding-space adversarial inputs, and metadata filter integrity checks.
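
As an illustration of how two of these patterns can combine, the sketch below plants a canary token in one tenant's documents and then queries as a different tenant, flagging any retrieved chunk that contains the token. It is an assumption-laden example rather than the implementation behind the patterns listed above: `make_canary`, `cross_tenant_probe`, and both retriever functions are hypothetical names, and the two-document corpus exists only for the demo.

```python
import secrets


def make_canary(prefix: str = "CANARY") -> str:
    """Generate a unique, unguessable marker to plant in a tenant's documents."""
    return f"{prefix}-{secrets.token_hex(8)}"


def leaked_canaries(output: str, canaries: dict[str, str]) -> list[str]:
    """Return the names of canaries whose token appears in the given text."""
    return [name for name, token in canaries.items() if token in output]


def cross_tenant_probe(retrieve, probe_tenant, queries, canaries):
    """Query as probe_tenant and report (query, canary name) for every leak.

    `retrieve(tenant_id, query)` is any callable returning a list of chunk
    texts; `canaries` maps a label to a token planted in another tenant's data.
    """
    findings = []
    for query in queries:
        for chunk in retrieve(probe_tenant, query):
            for name in leaked_canaries(chunk, canaries):
                findings.append((query, name))
    return findings


# Demo corpus: the canary lives only in tenant-a's document.
token = make_canary()
corpus = [
    ("tenant-a", f"Internal note {token}"),
    ("tenant-b", "Public FAQ for tenant B"),
]


def leaky_retrieve(tenant_id, query):
    # Deliberately buggy: ignores tenant_id, so every tenant sees everything.
    return [text for _, text in corpus]


def scoped_retrieve(tenant_id, query):
    # Correct behaviour: only the caller's own documents are returned.
    return [text for owner, text in corpus if owner == tenant_id]


if __name__ == "__main__":
    planted = {"tenant-a-note": token}
    print(cross_tenant_probe(leaky_retrieve, "tenant-b", ["anything"], planted))
    print(cross_tenant_probe(scoped_retrieve, "tenant-b", ["anything"], planted))
```

A non-empty result means data planted for one tenant surfaced in another tenant's retrieval, which is the failure a cross-tenant probe is meant to catch.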

Other entries in this neighbourhood.

Vector Store: A database optimised for similarity search over high-dimensional embedding vectors; the canonical storage layer for RAG.

Embedding: A dense numeric vector representation of text, image, or audio produced by an embedding model and used for similarity search and clustering.

Prompt Injection: An attack that smuggles attacker-controlled instructions into a model prompt to override the developer instructions or extract sensitive data.

Adversarial Scan: A scheduled execution of probe templates against an LLM endpoint, agent, or RAG pipeline, scored to produce control-mapped findings.

Where to read the canonical definition.

OWASP LLM Top 10 (RAG-relevant entries)

See Retrieval-Augmented Generation (RAG) in production.