How to Test for MCP Tool Poisoning

The gap MCP tool poisoning testing closes

An MCP server hands an agent a set of tools and a description of what each one does. The agent trusts that description, and most of the time it should. The attack starts when the description, or the data a tool returns, carries an instruction aimed at the model rather than the user. The agent reads it as a command. The reviewer skims the README and sees nothing wrong. Nothing in the request looks malformed, so the alerting stays quiet while the agent does what the hidden text told it to.

How Penaxtra delivers MCP tool poisoning testing

Treat every tool surface the agent can read as untrusted input, and test it the way you would test a form field. The cases below run on a schedule against the live agent, and repeat well beyond onboarding, because a server you trusted in March can change its tool definitions in April. Each case carries a canary, so a pass or a fail is a fact rather than a judgement call.

MCP tool poisoning testing capabilities

Line jumping: put an instruction in a tool description that tells the agent to skip later steps or call a different tool first

A clean run ignores it; acting on it raises a finding.

Rug pull: approve a tool at onboarding, then change its description or behaviour after trust is granted

The test re-reads every tool definition on each run and flags any drift from the last approved version.
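One way to picture the drift check is a fingerprint comparison. The sketch below is illustrative only and is not Penaxtra's implementation: it assumes each tool definition is available as a dictionary with a `name` key (plus whatever description and schema fields the server returns), and the helper names `fingerprint` and `detect_drift` are hypothetical.

```python
import hashlib
import json


def fingerprint(tool: dict) -> str:
    """Hash a tool definition in canonical form so key order does not matter."""
    canonical = json.dumps(tool, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def detect_drift(approved: dict[str, str], current_tools: list[dict]) -> list[str]:
    """Compare the tools a server lists now against the approved fingerprints.

    `approved` maps tool name -> fingerprint recorded at approval time.
    Returns one finding per new, changed, or removed tool.
    """
    findings = []
    seen = set()
    for tool in current_tools:
        name = tool["name"]
        seen.add(name)
        if name not in approved:
            findings.append(f"new tool: {name}")
        elif approved[name] != fingerprint(tool):
            findings.append(f"changed tool: {name}")
    for name in sorted(set(approved) - seen):
        findings.append(f"removed tool: {name}")
    return findings
```

Hashing the whole definition, rather than only the description, means a changed parameter schema also counts as drift. Detecting a change in behaviour with an unchanged definition would need a separate behavioural test.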

Poisoned tool output: return a tool result that hides an instruction (a markdown comment, zero-width text, look-alike system text)

The agent should render it as data and never act on it.
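When building fixtures for this case, it can help to confirm that a payload really contains one of the hiding techniques named above. The following is a rough sketch under stated assumptions: the character set and the `FAKE_SYSTEM` pattern are hypothetical starting points, not a complete detector, and `scan_tool_output` is an illustrative name.

```python
import re

# Zero-width and bidirectional-control code points often used to hide text.
HIDDEN_CHARS = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff", "\u202e"}
# HTML-style comments, which markdown renderers typically do not display.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
# Hypothetical look-alike markers; extend for the agent's real prompt format.
FAKE_SYSTEM = re.compile(
    r"^\s*(\[?system\]?\s*:|<\|?system\|?>)",
    re.IGNORECASE | re.MULTILINE,
)


def scan_tool_output(text: str) -> list[str]:
    """Return the hiding techniques found in a tool result, if any."""
    findings = []
    if any(ch in HIDDEN_CHARS for ch in text):
        findings.append("zero-width or bidi control character")
    if HTML_COMMENT.search(text):
        findings.append("markdown/HTML comment")
    if FAKE_SYSTEM.search(text):
        findings.append("look-alike system text")
    return findings
```

A scanner like this only confirms that the payload is hidden. The pass or fail result still comes from watching whether the agent acts on it.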

Confused deputy: use a tool result to ask the agent to call a higher-privilege tool on behalf of the attacker

A tool call whose chain of custody does not trace back to the user request is blocked.

Cross-tool exfiltration: a low-sensitivity tool tries to hand data to an outbound tool (a URL fetch, an email, a webhook)

Outbound tools sit behind a destination allowlist, so a canary string never leaves.
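A destination allowlist combined with a canary check could look roughly like the sketch below. The host names in `ALLOWED_HOSTS` and the function name `outbound_allowed` are hypothetical. Checking for the canary on an allowed destination as well is an extra assumption added here for the test side, not something the text above specifies.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from policy.
ALLOWED_HOSTS = {"api.internal.example", "hooks.example.com"}


def outbound_allowed(url: str, payload: str, canaries: set[str]) -> tuple[bool, str]:
    """Decide whether an outbound tool call may proceed.

    Blocks any destination not on the allowlist, and reports a finding if a
    seeded canary appears in the URL or payload.
    """
    host = (urlsplit(url).hostname or "").lower()
    if host not in ALLOWED_HOSTS:
        return False, f"destination not on allowlist: {host or 'unknown'}"
    for canary in canaries:
        if canary in payload or canary in url:
            return False, "canary found in outbound request"
    return True, "ok"
```

Matching on the parsed hostname rather than a substring of the URL avoids being fooled by an allowed name placed in a path or query string.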

Scope creep: a read-only tool quietly asks for write parameters

Each call is checked against the declared parameter shape, and an unexpected write parameter is a finding.
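As a minimal sketch of a parameter-shape check, assume the approved shape is a JSON-Schema-style object with `properties` and `required` keys. The function name `check_call_shape` is hypothetical, and a fuller check would also validate value types.

```python
def check_call_shape(declared_schema: dict, arguments: dict) -> list[str]:
    """Flag arguments the approved schema does not declare, and missing required ones."""
    declared = set(declared_schema.get("properties", {}))
    required = set(declared_schema.get("required", []))
    supplied = set(arguments)
    findings = [f"undeclared parameter: {name}" for name in sorted(supplied - declared)]
    findings += [f"missing required parameter: {name}" for name in sorted(required - supplied)]
    return findings
```

For example, a read-only tool approved with only a `path` parameter would produce a finding if a call arrived carrying `overwrite` as well.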

Evidence: every test, pass or fail, lands in the append-only audit log tagged to OWASP Agentic ASI02, so the result reads as audit evidence rather than a screenshot.

MCP tool poisoning testing compliance mapping

Tool-poisoning findings carry OWASP Agentic ASI02 (tool misuse), ASI03 (privilege compromise) and ASI06 (intent manipulation), OWASP LLM01 (prompt injection) and LLM05 (improper output handling), MITRE ATLAS AML.T0051, and EU AI Act Article 15. One tested result fills the matching cell in each framework.

Frequently asked

How is this different from scanning the MCP server once at onboarding?

A one-time scan checks the server as it looked that day. Tool poisoning often arrives as a later change, a rug pull, so the definitions have to be re-read and re-tested on a schedule. A server that passed last month is not the same as a server that passes today.

Do we need the runtime gateway to test for this?

No. The test cases run against the agent regardless of how its tools are transported. The gateway is what enforces the controls in production (the allowlist, the confused-deputy block, the parameter-shape check), so the rule set the test validates is the one running on the wire.

What is a canary in this context?

A unique, harmless marker: a string, a fake credential, a tagged URL, seeded so its appearance proves a specific path fired. If the canary leaves through an outbound tool or turns up where it should not, the finding is verifiable instead of a maybe.
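A canary set of this kind could be generated along the following lines. This is a sketch under assumptions: the `CANARY-` prefix, the `sk-test-` credential shape, the `canary.example.invalid` host, and both function names are hypothetical choices for illustration.

```python
import secrets


def make_canary(test_id: str) -> dict[str, str]:
    """Create a unique, harmless marker set tied to one test case."""
    token = secrets.token_hex(8)  # 16 hex characters, unique per call
    marker = f"CANARY-{test_id}-{token}"
    return {
        "string": marker,
        "fake_credential": f"sk-test-{marker}",
        # The .invalid TLD is reserved, so this URL can never resolve.
        "tagged_url": f"https://canary.example.invalid/{test_id}/{token}",
    }


def canary_fired(canary: dict[str, str], observed: str) -> bool:
    """True if any form of the canary appears in the observed text."""
    return any(value in observed for value in canary.values())
```

Embedding the test identifier in the marker means a hit in a log or an outbound payload points back to the specific case that seeded it.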

Explore further

MCP tool poisoning: how the attack works

MCP Security platform capability

MCP security checklist
