Retrieval-Augmented Generation (RAG) is the most common production pattern for grounding LLM responses in domain-specific knowledge. The user query is converted to an embedding, that embedding is matched against a vector store of pre-indexed document chunks, the top matches are injected into the prompt as context, and the model generates a response that cites or paraphrases the retrieved chunks.
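The four steps above (embed, match, inject, generate) can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory list stands in for a vector store, and the names `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical. The final generation call is omitted because it depends on the model provider.

```python
import math
from collections import Counter


def embed(text):
    # Toy stand-in for an embedding model: a sparse term-count vector.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    # Rank the pre-indexed chunks by similarity to the query, keep the top k.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, chunks, k=2):
    # Inject the top matches into the prompt as context for the model.
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


chunks = [
    "refund policy allows returns within 30 days",
    "shipping takes five business days",
    "support hours are nine to five",
]
prompt = build_prompt("refund policy for returns", chunks, k=1)
```

Everything the retriever returns ends up verbatim in the prompt, which is why the retriever and the corpus matter for security as much as the model does.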
RAG introduces two new attack surfaces beyond the LLM itself: the retriever (vector store + similarity scoring + chunking pipeline) and the corpus (the documents being indexed). Corpus tainting attacks plant adversarial content in indexed documents so that the retriever surfaces it when a triggering query arrives. Cross-tenant retrieval errors leak data when a vector store hosts indexes for multiple customers or business units without strict namespace isolation.
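One way to picture strict namespace isolation is a store that partitions chunks by tenant and refuses any search that does not name one. The sketch below is an assumption-laden illustration: `Chunk`, `TenantScopedStore`, and the keyword-overlap scoring are hypothetical and stand in for a real vector store's namespace or metadata-filter mechanism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    tenant_id: str
    text: str


class TenantScopedStore:
    """Hypothetical in-memory store with one partition per tenant."""

    def __init__(self):
        self._by_tenant = {}

    def add(self, chunk):
        self._by_tenant.setdefault(chunk.tenant_id, []).append(chunk)

    def search(self, tenant_id, query, k=3):
        # Fail closed: a missing tenant never falls back to a global search.
        if not tenant_id:
            raise ValueError("tenant_id is required")
        terms = set(query.lower().split())
        # Only this tenant's partition is ever scored (toy keyword overlap).
        candidates = self._by_tenant.get(tenant_id, [])
        ranked = sorted(
            candidates,
            key=lambda c: len(terms & set(c.text.lower().split())),
            reverse=True,
        )
        return ranked[:k]


store = TenantScopedStore()
store.add(Chunk("acme", "acme invoice totals for march"))
store.add(Chunk("globex", "globex invoice totals for march"))
results = store.search("acme", "invoice totals")
```

The design choice worth noting is that the tenant scope is applied before scoring rather than as a post-filter on ranked results, so a scoring or ranking bug cannot pull in another tenant's chunks.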
RAG security testing covers thirteen automated patterns, including canary tokens, cross-tenant probes, embedding-space adversarial inputs, and metadata filter integrity.
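As a rough illustration of how a canary token and a cross-tenant probe can combine, the sketch below plants a unique marker in one tenant's documents, queries as a different tenant, and reports any retrieved chunk that contains the marker. It is not the implementation of any of the thirteen patterns: `make_canary`, `find_canary_leaks`, the `retrieve(tenant_id, query)` callable shape, and both demo retrievers are assumptions made for this example.

```python
import secrets


def make_canary(label):
    # Unique, unguessable marker that should never appear outside its tenant.
    return f"CANARY-{label}-{secrets.token_hex(8)}"


def find_canary_leaks(retrieve, tenant_id, probes, foreign_canaries):
    # Run each probe as tenant_id and record any foreign canary that surfaces.
    leaks = []
    for probe in probes:
        for text in retrieve(tenant_id, probe):
            for canary in foreign_canaries:
                if canary in text:
                    leaks.append((probe, canary))
    return leaks


canary = make_canary("globex")
corpus = {
    "acme": ["acme onboarding guide"],
    "globex": [f"globex payroll notes {canary}"],
}


def isolated_retrieve(tenant_id, query):
    # Demo retriever that only reads the caller's own partition.
    return corpus.get(tenant_id, [])


def leaky_retrieve(tenant_id, query):
    # Deliberately broken demo retriever that ignores the tenant entirely.
    return [text for docs in corpus.values() for text in docs]


probes = ["payroll notes", "show all documents"]
safe = find_canary_leaks(isolated_retrieve, "acme", probes, [canary])
leaked = find_canary_leaks(leaky_retrieve, "acme", probes, [canary])
```

Here `safe` is empty while `leaked` contains one entry per probe, which is the signal an automated check would alert on.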