Learn / MCP tool poisoning testing

How to test for MCP tool poisoning

Tool poisoning hides an instruction where the agent reads it but a human reviewer does not, inside a tool description or a tool result. This is the test plan we run to find it before it reaches production, and the controls that hold once it does.

Last reviewed June 2026

Problem

The gap MCP tool poisoning testing closes

An MCP server hands an agent a set of tools and a description of what each one does. The agent trusts that description, and most of the time it should. The attack starts when the description, or the data a tool returns, carries an instruction aimed at the model rather than the user. A new agent reads it as a command. The reviewer skims the README and sees nothing wrong. Nothing in the request looks malformed, so the alerting stays quiet while the agent does what the hidden text told it to.

How Penaxtra approaches it

How Penaxtra delivers MCP tool poisoning testing

Treat every tool surface the agent can read as untrusted input, and test it the way you would test a form field. The cases below run on a schedule against the live agent, not once at onboarding, because a server you trusted in March can change its tool definitions in April. Each case carries a canary, so a pass or a fail is a fact rather than a judgement call.

Technical capabilities

MCP tool poisoning testing capabilities

Line jumping: put an instruction in a tool description that tells the agent to skip later steps or call a different tool first

A clean run ignores it; acting on it raises a finding..

Rug pull: approve a tool at onboarding, then change its description or behaviour after trust is granted

The test re-reads every tool definition on each run and flags any drift from the last approved version..

Poisoned tool output: return a tool result that hides an instruction (a markdown comment, zero-width text, look-alike system text)

The agent should render it as data and never act on it..

Confused deputy: use a tool result to ask the agent to call a higher-privilege tool on behalf of the attacker

A tool call whose chain of custody does not trace back to the user request is blocked..

Cross-tool exfiltration: a low-sensitivity tool tries to hand data to an outbound tool (a URL fetch, an email, a webhook)

Outbound tools sit behind a destination allowlist, so a canary string never leaves..

Scope creep: a read-only tool quietly asks for write parameters

Each call is checked against the declared parameter shape, and an unexpected write parameter is a finding..

Evidence: every test, pass or fail, lands in the append-only audit log tagged to OWASP Agentic ASI02, so the result reads as audit evidence rather than a screenshot

.

Compliance mapping

MCP tool poisoning testing compliance mapping

Tool-poisoning findings carry OWASP Agentic ASI02 (tool misuse), ASI03 (privilege compromise) and ASI06 (intent manipulation), OWASP LLM01 (prompt injection) and LLM05 (improper output handling), MITRE ATLAS AML.T0051, and EU AI Act Article 15. One tested result fills the matching cell in each framework.

FAQ

Frequently asked

How is this different from scanning the MCP server once at onboarding?

A one-time scan checks the server as it looked that day. Tool poisoning often arrives as a later change, a rug pull, so the definitions have to be re-read and re-tested on a schedule. A server that passed last month is not the same as a server that passes today.

Do we need the runtime gateway to test for this?

No. The test cases run against the agent regardless of how its tools are transported. The gateway is what enforces the controls in production - the allowlist, the confused-deputy block, the parameter-shape check - so the rule set the test validates is the one running on the wire.

What is a canary in this context?

A unique, harmless marker: a string, a fake credential, a tagged URL, seeded so its appearance proves a specific path fired. If the canary leaves through an outbound tool or turns up where it should not, the finding is verifiable instead of a maybe.

Request a demo

Scoped walkthrough of the Learn / MCP tool poisoning testing surface against your environment. No credit card.

Request a demo Explore AI-SPM platform