
Your Model Was Never the Weak Point - The LiteLLM and LangGraph CVEs of June 2026

In one week, LiteLLM, LangGraph, and a wave of exposed MCP servers all got hit. None of it was about the model. Here is what actually broke, with the CVE numbers, and what to test before your next LLM feature ships.

Tags: ai-gateway, llm-agent, cve, mcp, supply-chain


Short version, because some of you are reading this between patches: in a single week this month the open-source plumbing that most teams use to ship LLM features took three separate hits. An AI gateway (LiteLLM), an agent framework (LangGraph), and the wider mess of public MCP servers. Not one of these was the model saying something it should not have. Every one was an ordinary application-security failure - broken authorization, unsafe deserialization, SQL injection, running untrusted code through exec() - wearing an AI label.

If your threat model for "AI security" still stops at the prompt and the model output, this is the week to widen it. The attacker is not jailbreaking your chatbot. They are logging into the proxy in front of it.

What happened, in order

LangGraph: three flaws, one path to code execution (around 12 June)

LangGraph is the framework a lot of teams reach for when a single prompt-and-response is not enough and they need a stateful, multi-step agent. Researcher Yarden Porat reported a chain of three issues, written up by The Hacker News on 12 June:

  • CVE-2025-67644 (CVSS 7.3): SQL injection in the SQLite checkpoint store, reachable through metadata filter keys. Fixed in langgraph-checkpoint-sqlite 3.0.1.
  • CVE-2026-28277 (CVSS 6.8): unsafe msgpack deserialization when a checkpoint is loaded. If an attacker can write to the checkpoint data, loading it reconstructs attacker-chosen objects. Fixed in langgraph 1.0.10.
  • CVE-2026-27022 (CVSS 6.5): a RediSearch query injection in the Redis checkpoint package that bypasses access controls. Fixed in @langchain/langgraph-checkpoint-redis 1.0.1.

The checkpoint store is where an agent keeps its memory between steps. Chain the injection that lets you reach that store with the deserialization that runs on load, and a self-hosted deployment goes from "user controls a filter string" to "attacker runs code on the box." The managed service (LangSmith) was not affected. The teams who self-host the checkpointers were. What an attacker gets out of that box is the part that should sting: model API keys, conversation history, customer data, and whatever credentials the agent was holding to reach other systems.
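To make the first link of that chain concrete, here is a minimal sketch of the defensive shape for a metadata filter: validate the key against a strict pattern and bind both key and value as parameters, so nothing the caller sends is ever formatted into the SQL text. This is not LangGraph's code. The table names, `build_filter`, `find_checkpoints`, and `demo_db` are all hypothetical, and the schema is a simplified stand-in for a real checkpoint store.

```python
import re
import sqlite3

# Assumption: metadata keys are plain identifiers; anything else is rejected.
_SAFE_KEY = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,63}")


def build_filter(filters: dict[str, str]) -> tuple[str, list[str]]:
    """Turn {key: value} into a WHERE fragment plus its bound parameters."""
    clauses: list[str] = []
    params: list[str] = []
    for key, value in filters.items():
        if not _SAFE_KEY.fullmatch(key):
            raise ValueError(f"rejected metadata filter key: {key!r}")
        # Key and value are both bound; neither is formatted into the SQL.
        clauses.append(
            "EXISTS (SELECT 1 FROM checkpoint_meta m "
            "WHERE m.checkpoint_id = c.id AND m.key = ? AND m.value = ?)"
        )
        params += [key, value]
    return (" AND ".join(clauses) or "1 = 1"), params


def find_checkpoints(conn: sqlite3.Connection, filters: dict[str, str]) -> list[int]:
    """Return ids of checkpoints whose metadata matches every filter."""
    where, params = build_filter(filters)
    # `where` contains only the fixed clause text assembled above.
    sql = f"SELECT c.id FROM checkpoints c WHERE {where} ORDER BY c.id"
    return [row[0] for row in conn.execute(sql, params)]


def demo_db() -> sqlite3.Connection:
    """In-memory stand-in for a checkpoint store (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE checkpoints (id INTEGER PRIMARY KEY, payload BLOB);
        CREATE TABLE checkpoint_meta (checkpoint_id INTEGER, key TEXT, value TEXT);
        INSERT INTO checkpoints (id, payload) VALUES (1, x'00'), (2, x'00');
        INSERT INTO checkpoint_meta VALUES (1, 'tenant', 'acme'), (2, 'tenant', 'other');
        """
    )
    return conn
```

With the demo data, `find_checkpoints(demo_db(), {"tenant": "acme"})` returns only checkpoint 1, while a key such as `"tenant') OR 1=1 --"` is refused before any query runs. The key check is belt and braces here, since the key is bound anyway; it matters most in code paths where a key ends up inside a JSON path or column name.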

LiteLLM: low-privilege user to gateway admin to RCE (16 June)

This is the loud one. LiteLLM is the gateway a large share of shops put in front of their models so every team calls one OpenAI-style endpoint instead of wiring up a dozen providers by hand. The Hacker News covered the chain on 16 June; the technical teardown is worth reading in full. The chain is rated CVSS 9.9 and fixed in v1.83.14-stable. Three bugs, and the order matters:

  • CVE-2026-47101 (authorization bypass): when a normal user mints a virtual API key, LiteLLM did not properly check the allowed_routes field. So a non-admin could issue themselves a key scoped to ["/*"] - every route, including admin-only ones.
  • CVE-2026-47102 (privilege escalation): the /user/update and /user/bulk_update handlers never protected the user_role field from the caller. Anyone who could reach them could set their own role to proxy_admin.
  • CVE-2026-40217 (sandbox escape): the Custom Code Guardrail compiled and ran admin-supplied Python through exec() with no filtering. Once you are admin from the previous step, this is your shell.

Read the chain as a sentence: a regular account gives itself an all-routes key, promotes itself to admin, then uses an admin-only feature to run Python on the server. Note the bitter detail in that last step - the component that got turned into a remote shell was a *guardrail*. A safety feature ran untrusted code. A separate LiteLLM RCE, CVE-2026-42271, is already on CISA's Known Exploited Vulnerabilities list, which means this is not theoretical; someone is using it.
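The first two steps come down to two missing checks, and it helps to see how small they are. The sketch below is not LiteLLM's implementation; `Caller`, `mint_key`, `apply_user_update`, the `"member"` role used in the usage note, and the protected-field list are hypothetical names chosen for illustration. Only `allowed_routes`, `user_role`, and `proxy_admin` come from the write-up above. It shows one conservative rule for each step: a non-admin can only mint a key inside their own grant, and a non-admin can never write a protected field.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

ADMIN_ROLE = "proxy_admin"
# Fields a non-admin caller may never set (hypothetical list).
PROTECTED_FIELDS = frozenset({"user_role"})


@dataclass(frozen=True)
class Caller:
    user_id: str
    role: str
    allowed_routes: tuple[str, ...]


def is_within_grant(requested: str, granted: tuple[str, ...]) -> bool:
    """Conservative subset test for one requested route or pattern."""
    if requested in granted:
        return True
    if any(ch in requested for ch in "*?["):
        # Wildcards may only be copied verbatim from the caller's own grant.
        return False
    return any(fnmatchcase(requested, pattern) for pattern in granted)


def mint_key(caller: Caller, requested_routes: list[str]) -> tuple[str, ...]:
    """Return the routes for a new key, or refuse if they exceed the grant."""
    if caller.role != ADMIN_ROLE:
        denied = [r for r in requested_routes
                  if not is_within_grant(r, caller.allowed_routes)]
        if denied:
            raise PermissionError(f"routes exceed caller's grant: {denied}")
    return tuple(requested_routes)


def apply_user_update(caller: Caller, target_user_id: str,
                      changes: dict[str, str]) -> dict[str, str]:
    """Return the accepted changes, or refuse the whole update."""
    if caller.role != ADMIN_ROLE:
        if target_user_id != caller.user_id:
            raise PermissionError("non-admins may only update their own record")
        blocked = PROTECTED_FIELDS.intersection(changes)
        if blocked:
            raise PermissionError(f"protected fields: {sorted(blocked)}")
    return dict(changes)
```

A caller with role `"member"` and a grant of `("/chat/*",)` can mint a key for `/chat/completions`, but a request for `["/*"]` or a change to `user_role` raises `PermissionError`. The third step has no equivalent two-line fix: filtering input to `exec()` is not a sandbox, so the realistic controls there are keeping the feature admin-only and making sure nobody can become admin by accident.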

MCP servers: the exposure was already there

Sitting under both stories is the Model Context Protocol, the thing that lets agents call tools. The numbers that surfaced this month are not subtle. Censys found 12,520 internet-reachable MCP services, many of them unauthenticated. Trend Micro reported CVSS 9.8 command-injection flaws in unofficial AWS and Azure MCP servers. Tool poisoning (tracked as CVE-2025-54136) keeps working because a tool server's metadata lands in the model's context with instruction-level authority - the agent reads it, the human never does. The NSA thought this was worth its own design guidance in June, which tells you where this is heading.
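One way to put a human back in that loop is to pin each tool definition at review time and refuse anything that has drifted since. The sketch below assumes tool definitions arrive as plain dictionaries with `name`, `description`, and `inputSchema` fields; `fingerprint`, `review_tools`, and the report layout are hypothetical, and this is one possible control rather than a prescribed fix for tool poisoning.

```python
import hashlib
import json


def fingerprint(tool: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the whole tool definition."""
    canonical = json.dumps(tool, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def review_tools(advertised: list[dict], approved: dict[str, str]) -> dict[str, list[str]]:
    """Compare advertised tools against fingerprints pinned at review time.

    `approved` maps tool name -> fingerprint recorded when a human read it.
    """
    report: dict[str, list[str]] = {"ok": [], "changed": [], "unapproved": []}
    for tool in advertised:
        name = tool.get("name", "")
        pinned = approved.get(name)
        if pinned is None:
            report["unapproved"].append(name)   # never reviewed
        elif pinned != fingerprint(tool):
            report["changed"].append(name)      # metadata drifted since review
        else:
            report["ok"].append(name)
    return report
```

An agent host could call `review_tools` on every tool listing and only pass the `ok` entries into the model's context. A description that quietly gains an extra instruction then shows up under `changed` instead of being read by the agent.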

The pattern, said plainly

Pull back from the individual bugs and the shape is obvious. The model is not where teams are getting hurt. The layer around the model is: the gateway, the agent framework, the checkpoint store, the tool servers. And the bugs there are not exotic new "AI" bugs. They are the OWASP Top 10 from fifteen years ago - missing access control, injection, unsafe deserialization, running code you should not trust - showing up in code that got written fast because everyone is racing to ship an AI feature.

Two things make this layer worse than your normal web app, though:

  1. The blast radius is credentials. A compromised gateway or agent does not just leak one app's data. It is holding the keys to every model provider and, often, the credentials the agent uses to reach internal systems. One box, the whole keyring.
  2. A "guardrail" can be the exploit. The LiteLLM sandbox escape and MCP tool poisoning both turn a safety or capability feature into the attacker's entry. You cannot assume the protective parts of the stack are the safe parts.

This is the same point we made when an LLM agent drove a real intrusion last month: the interesting failures are in the system you built around the model, not in the model's manners.

What to test this week

None of this is hard. It is the difference between these chains working against you and stalling on step one.

  • Patch the obvious three first. LiteLLM to v1.83.14-stable or later. LangGraph to 1.0.10, the SQLite checkpointer to 3.0.1, the Redis one to 1.0.1. If you cannot tell which versions you are running, that is itself the first finding.
  • Get your AI gateway off the open internet. A control plane that brokers every model key does not belong on a public IP with a login page. Put it behind your network boundary and treat its admin surface like any other crown-jewel console.
  • Re-check authorization on every AI control plane, not just the app. The LiteLLM chain was three missing checks in a row. Test that a low-privilege key cannot widen its own scope, cannot change its own role, and cannot reach an admin-only endpoint. These are boring tests. They are also the entire attack.
  • Inventory your MCP servers and shut off the anonymous ones. If a tool server answers without authentication, assume it is being read. Forty percent of remote servers in that Censys data took no auth at all.
  • Pull credentials out of reach of a compromised agent. Short-lived, scoped, workload-identity credentials turn a stolen key into a dead end. Long-lived provider keys in an env file turn one bug into a breach.
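For the first item on that list, a rough sketch of a version check you could drop into CI. The minimum versions are the ones quoted in this article; `parse_version`, `installed_version`, and `audit` are hypothetical helpers, and the parser only reads the leading dotted integers, so it treats a pre-release such as `1.0.10rc1` the same as `1.0.10`. The Redis checkpointer is left out because the package named above carries an npm-style name and would not show up in a Python environment.

```python
from importlib import metadata
from typing import Callable, Optional

# Minimum fixed versions as quoted in this article (Python packages only).
MINIMUM_FIXED = {
    "litellm": "1.83.14",
    "langgraph": "1.0.10",
    "langgraph-checkpoint-sqlite": "3.0.1",
}


def parse_version(text: str) -> tuple[int, ...]:
    """Read the leading dotted integers, e.g. '1.83.14-stable' -> (1, 83, 14)."""
    parts: list[int] = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(name: str) -> Optional[str]:
    """Version of an installed distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def audit(lookup: Callable[[str], Optional[str]] = installed_version) -> dict[str, str]:
    """Map each tracked package to 'ok', 'not installed', or a finding."""
    findings: dict[str, str] = {}
    for name, minimum in MINIMUM_FIXED.items():
        found = lookup(name)
        if found is None:
            findings[name] = "not installed"
        elif parse_version(found) < parse_version(minimum):
            findings[name] = f"{found} is older than {minimum}"
        else:
            findings[name] = "ok"
    return findings
```

Calling `audit()` with no arguments inspects the current environment; passing a different `lookup` lets you feed it versions from a lockfile or an SBOM instead. Anything other than `ok` or `not installed` is the finding.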

For the auditor in the room, these line up cleanly against the OWASP LLM Top 10 (LLM03 supply chain, LLM06 excessive agency, LLM08 vector and embedding weaknesses where checkpoint stores leak), the OWASP Agentic Top 10, and the post-market monitoring duty under the EU AI Act that makes "we test on a schedule" a written requirement rather than a good intention. With the high-risk obligations dated for 2 August 2026, a chain that was verified as closed in your March pentest but reopened by a dependency bump in June is exactly the gap a regulator will ask about.

Why a one-time pentest does not catch this

Here is the uncomfortable cadence. LiteLLM and LangGraph both ship fast. The bug that got you was probably introduced by a routine version bump, not by your own team. A pentest is a photograph; this is a film. The control you verified in spring is not promised to you in summer, because the framework underneath it changed three times in between.

That is the whole reason continuous AI security posture management is a category and not a checkbox. Not because a posture platform stops a live exploit - it does not, and anyone who tells you otherwise is selling you a story - but because the controls that decide your blast radius are testable, and in a stack that moves on a six-week clock, "we configured that correctly months ago" is a sentence that needs re-checking on a schedule.

What Penaxtra does about it

Penaxtra treats your gateways, agents, and MCP endpoints as first-class assets, not afterthoughts. It runs scheduled adversarial probes against them for the exact shapes above - privilege escalation on a control plane, injection into a tool or checkpoint, a credential read that turns into exfiltration - and maps every finding to OWASP LLM Top 10, the OWASP Agentic Top 10, NIST AI 600-1, MITRE ATLAS, and the EU AI Act before it reaches your queue. Because the probes run on a cadence, a dependency bump that quietly reopens one of this month's holes shows up as a finding instead of a 2 a.m. page.

Start with the MCP security checklist, read what AI security posture management actually means, or request an architecture review and we will walk your gateway and agent setup with you.
