Monday 30 March 2026, Afternoon Edition

ZOTPAPER

News without the noise


Cybersecurity

AI Agent Security Tools Catch 95 Percent of Prompt Injections but Miss Up to 91 Percent of Unauthorised Tool Calls

New benchmark exposes a massive blind spot in commercial agent security amid reports of widespread AI recommendation poisoning, as Okta launches an agent governance platform

Zotpaper · 3 min read · 3 sources
A new open-source benchmark called AgentShield has revealed a stark gap in commercial AI agent security tools: while the top providers catch more than 95 percent of prompt injection attacks, they detect only 9 to 18 percent of unauthorised tool calls. The findings arrive alongside reports of widespread AI recommendation poisoning and Okta's launch of a dedicated agent management platform.

The AgentShield benchmark, published under Apache 2.0, tested six commercial AI agent security tools across 537 scenarios covering eight categories of risk. Prompt injection — the attack most vendors have optimised for — accounted for 205 test cases. But tool abuse (80 cases), data exfiltration (87 cases), and multi-agent security (35 cases) exposed far deeper vulnerabilities.

The gap exists because the security industry built its agent protection stack on the same foundation used for web applications: validate the input, block the bad prompt. This model works when threats enter through the front door. It fails when the threat is the agent itself, acting within its authorised scope but doing things nobody approved.
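The distinction can be sketched in code: an input filter inspects the prompt before it reaches the agent, while a tool-call authorisation layer inspects each action the agent attempts, regardless of how benign the triggering prompt looked. The following Python sketch is purely illustrative; the policy structure, agent name, and function names are assumptions, not any vendor's actual API.

```python
# Illustrative sketch: input-side filtering vs. action-side authorisation.
# All names and the policy format here are hypothetical.

BLOCKED_PHRASES = ["ignore previous instructions", "disregard your system prompt"]

def filter_prompt(prompt: str) -> bool:
    """Input-side check: block prompts matching known injection patterns.
    This is the 'front door' model most vendors have optimised for."""
    lowered = prompt.lower()
    return not any(phrase in lowered for phrase in BLOCKED_PHRASES)

# Per-agent policy: which tools may be called, and with what limits.
# An agent acting within its credentials can still violate this policy.
POLICY = {
    "support-agent": {
        "read_ticket": {},               # always allowed
        "refund": {"max_amount": 100},   # allowed only up to a cap
        # "delete_account" is absent: never allowed for this agent
    }
}

def authorise_tool_call(agent: str, tool: str, args: dict) -> bool:
    """Action-side check: validate each tool call against policy,
    even when the prompt that triggered it looked harmless."""
    allowed = POLICY.get(agent, {})
    if tool not in allowed:
        return False
    cap = allowed[tool].get("max_amount")
    if cap is not None and args.get("amount", 0) > cap:
        return False
    return True
```

Under this sketch, a perfectly innocuous prompt such as "Please process this customer's request" sails through the input filter, yet a resulting refund of 5,000 or a call to an unlisted tool is still rejected at the action layer, which is the kind of check the benchmark suggests most commercial tools lack.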

Meanwhile, Microsoft's Defender Security Research Team identified more than 50 unique manipulative prompts from 31 companies across 14 industries, all embedding hidden instructions in "Summarise with AI" buttons to permanently alter what chatbots believe about their products. Two turnkey tools for creating these attacks are freely available online.
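The mechanism behind these buttons is simple to sketch: the link the user clicks carries the page URL plus an extra instruction in a query parameter, and the chat platform treats the whole string as the user's request. The Python sketch below is a hypothetical illustration of that structure; the domain, parameter name, and functions are invented for this example, and real campaigns vary by platform.

```python
from urllib.parse import urlencode, parse_qs, urlparse

# Hypothetical illustration of a "Summarise with AI" link that smuggles
# an instruction alongside the legitimate summarisation request.
def build_summarise_link(page_url: str, hidden_instruction: str) -> str:
    # The visible button just says "Summarise with AI"; the injected
    # text rides along inside the URL's query string.
    params = urlencode({"q": f"Summarise {page_url}. {hidden_instruction}"})
    return f"https://chat.example.com/?{params}"

def extract_injected_text(link: str) -> str:
    """What the chat platform actually receives as the 'user' prompt."""
    query = parse_qs(urlparse(link).query)
    return query.get("q", [""])[0]
```

The attack needs no exploit and no credentials, which is why the report's finding of turnkey tooling matters: anything that controls the link controls part of the prompt.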

Okta has responded to the growing chaos by launching its AI Agents platform, giving customers the ability to locate agents, monitor what they are doing, and shut them down when necessary. The platform reached general availability this week.

Analysis

Why This Matters

AI agents are being deployed faster than the security tools meant to protect them. A miss rate as high as 91 percent on unauthorised tool calls means most organisations have almost no visibility into what their agents are actually doing with the permissions they have been given.

Background

The security industry has spent decades defending against input-based attacks. AI agents represent a fundamentally different threat model where the danger is not a malicious input but a legitimate system exceeding its intended scope.

Key Perspectives

The AI recommendation poisoning campaign is particularly alarming because it requires no technical sophistication. Websites simply embed hidden instructions in URLs, and every major AI platform — Copilot, ChatGPT, Claude, Perplexity, Grok — is vulnerable.

What to Watch

Okta's entry into agent governance is just the beginning. Expect a wave of startups and incumbents racing to build the equivalent of firewalls for AI agents. The AgentShield benchmark may become the industry standard for evaluating these tools.
