Tool Poisoning - AI Glossary

Our engineers set up and run your first chatbot / LLM security scan. Get in touch →

Tool poisoning is an attack class against tool-calling AI agents and MCP servers. A malicious or compromised tool returns crafted output whose content is designed to manipulate the agent calling it. The agent reads the tool result, treats it as trusted context, and emits a follow-up action that the attacker chose.

Tool poisoning is the agentic-system equivalent of indirect prompt injection. The defence pattern is the same in shape: never treat tool output as instruction, redact known-bad patterns before re-injection into the model context, and run scheduled adversarial probes that simulate poisoned tool returns to confirm the agent's reasoning is resilient.

OWASP Agentic Top 10 covers tool poisoning under multiple entries; tool-call chain detection in AI-SPM platforms catches the multi-step exploitation pattern where a poisoned tool result triggers a secondary tool call.

Other entries in this neighbourhood.

MCP Server (Model Context Protocol) A server that exposes tools and resources to AI agents via the Model Context Protocol open standard; the tool surface is the security boundary. Confused Deputy A classic security pattern where a privileged process is tricked into using its privilege on behalf of an unauthorised principal; reborn for tool-calling AI agents. Prompt Injection An attack that smuggles attacker-controlled instructions into a model prompt to override the developer instructions or extract sensitive data.

Where to read the canonical definition.

OWASP Agentic Top 10 open →

See Tool Poisoning in production.

The Penaxtra platform implements the controls and assessments described above as part of its AI-SPM programme.

AI-SPM platform overview →