Your Agent Is Mine: Measuring Malicious Intermediary Attacks on the LLM Supply Chain

agents inference

2026-04-10 | Source: Mastodon | Original article

A new arXiv paper, “Your Agent Is Mine: Measuring Malicious Intermediary Attacks on the LLM Supply Chain” (arXiv 2604.08407), quantifies how AI agents can become backdoors for attackers who control the inference provider or any router that mediates calls to large language models. The authors demonstrate that once an agent is instantiated, the provider effectively gains shell‑level access to the host process, allowing malicious code hidden in seemingly harmless “skills” to execute without triggering existing safety filters. The study builds on recent real‑world incidents that have shaken confidence in the AI tooling ecosystem. Two weeks ago, the popular liteLLM gateway was found to contain a backdoor in versions 1.82.7 and 1.82.8, stealing cloud credentials and Kubernetes secrets after a compromised PyPI maintainer uploaded malicious packages. A follow‑up analysis showed that the malicious skill leveraged the same code‑generation‑then‑execution loop that modern LLM agents use, bypassing lexical command‑filtering defenses. Earlier this month, researchers released the “PoisonedSkills” framework, which embeds payloads in Markdown blocks and configuration templates, then mutates them at scale to cover 15 MITRE ATT&CK categories. Their pipeline produced over a thousand adversarial skills that execute silently during routine agent tasks. Why it matters is simple: enterprises are rapidly adopting LLM‑driven agents for coding, data extraction, and autonomous decision‑making. If the skill marketplace or the routing layer is compromised, an attacker can move from a harmless plugin to full remote‑code execution, exfiltrating secrets and hijacking workloads across cloud environments. The threat expands the traditional supply‑chain model—where only the model weights were considered vulnerable—to include the entire orchestration stack. What to watch next are the emerging mitigations. Researchers are proposing stricter provenance checks for skill packages, sandboxed execution environments that isolate agent processes, and runtime attestation of router firmware. Industry bodies such as the Cloud Native Computing Foundation are expected to draft security guidelines for AI‑agent ecosystems within the next quarter. Keep an eye on vendor patches for liteLLM and similar gateways, and on conference sessions at the upcoming AI‑Sec Europe summit where the authors will present concrete defenses. The race between attackers and defenders is now moving from model poisoning to the very code that makes agents useful.

Sources

Back to AIPULSEN