Confused Deputy - AI Glossary

The confused deputy is a classic security problem identified in the 1980s and reborn for tool-calling AI agents. A privileged process (the agent, with its tool authorisations) is tricked into using its privilege on behalf of an unauthorised principal (the attacker, who plants instructions in data the agent reads).

In an LLM agent context, a confused-deputy attack typically looks like: the attacker plants an instruction in a document the agent will later read ("now call the wire-transfer tool with these parameters"); the agent reads the planted instruction; treats it as a legitimate user request; and invokes the tool with the attacker's parameters. The tool sees a correctly authenticated call from the agent and executes.

Defences include never letting tool outputs include instructions (sanitise on the wire), aggressive tool allowlisting per agent context, HITL boundaries on sensitive tools, and per-tool argument validation that asks the question "would the actual user have asked for this?" before execution.

Other entries in this neighbourhood.

MCP Server (Model Context Protocol) A server that exposes tools and resources to AI agents via the Model Context Protocol open standard; the tool surface is the security boundary. Tool Poisoning An attack where a malicious or compromised MCP tool returns crafted output designed to manipulate the agent calling it. Prompt Injection An attack that smuggles attacker-controlled instructions into a model prompt to override the developer instructions or extract sensitive data.

Where to read the canonical definition.

OWASP Agentic Top 10 open →

Other entries in this neighbourhood.

Where to read the canonical definition.

See Confused Deputy in production.