Prompt Injection - AI Glossary

Prompt injection is the OWASP LLM Top 10 number-one risk (LLM01). The attacker plants text that the model is likely to read (a document the user uploads, a webpage the agent crawls, a chat message in a multi-tenant transcript, the metadata of a retrieved RAG chunk) such that the language model treats the planted text as an instruction rather than as data.

Two common shapes: direct prompt injection where the attacker sends the prompt themselves through the chat interface; and indirect prompt injection where the attacker writes the prompt into a third-party surface (a document, a webpage, an email) that the application later reads. Indirect prompt injection is harder to defend against because the model sees the malicious content alongside legitimate data the user wants summarised.

Defences fall into three layers: input handling at the gateway (DLP and structural validation), model-side mitigations (instruction-tuning and system-prompt isolation), and detection through adversarial scanning that catches regressions when the model, the prompt template, or the surrounding orchestration changes.

Other entries in this neighbourhood.

Confused Deputy A classic security pattern where a privileged process is tricked into using its privilege on behalf of an unauthorised principal; reborn for tool-calling AI agents. Jailbreak A prompt designed to make an LLM produce output the developer instructions or safety alignment was meant to prevent. Adversarial Scan A scheduled execution of probe templates against an LLM endpoint, agent, or RAG pipeline, scored to produce control-mapped findings.

Where to read the canonical definition.

OWASP LLM01 Prompt Injection open →
MITRE ATLAS AML.T0051 open →

Other entries in this neighbourhood.

Where to read the canonical definition.

See Prompt Injection in production.