We scraped the entire MCP registry (11,529 entries), resolved 4,113 unique servers with GitHub repos, and ran static pattern analysis on their tool descriptions, READMEs, and registry metadata. 65 servers were flagged. Here's what we found — and what it means.
- 11,529 registry entries scraped from registry.smithery.ai, mcpservers.org, and glama.ai (aggregated)
- 4,113 unique GitHub repos after deduplication
- 225 detection patterns across 15 languages
- Total scan duration for all 4,113 servers
We used ClawGuard for static analysis: 225 patterns, 15 languages, 10-stage preprocessing including Unicode normalization, leetspeak decoding, homoglyph detection, Base64 decoding, ROT13, and delimiter stripping.
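To make the preprocessing stages concrete, here is a minimal sketch of three of them (zero-width stripping, homoglyph folding, and Base64/ROT13 decoding) in Python. The function name, the tiny homoglyph map, and the Base64 token heuristic are illustrative assumptions, not ClawGuard's actual implementation:

```python
import base64
import codecs
import re
import unicodedata

# Illustrative mini-preprocessor; the real pipeline has 10 stages,
# this sketch shows only a few of them.
ZERO_WIDTH = re.compile(r"[\u200b\u200c\u200d\u2060\ufeff]")
HOMOGLYPHS = str.maketrans({"а": "a", "е": "e", "о": "o", "с": "c"})  # Cyrillic -> Latin

def preprocess(text: str) -> list[str]:
    """Return candidate decodings of `text` to run pattern matching against."""
    normalized = unicodedata.normalize("NFKC", text)
    normalized = ZERO_WIDTH.sub("", normalized).translate(HOMOGLYPHS)
    variants = [normalized, codecs.decode(normalized, "rot13")]
    # Try Base64-decoding any long token; skip tokens that aren't valid Base64.
    for token in re.findall(r"[A-Za-z0-9+/=]{16,}", normalized):
        try:
            variants.append(base64.b64decode(token, validate=True).decode("utf-8"))
        except Exception:
            pass
    return variants
```

The point of generating multiple decoded variants is that a single regex pass over raw text misses instructions hidden behind any one encoding layer.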
Important: This is pattern matching on text content — tool descriptions, READMEs, and registry metadata. This is not behavioral probing or dynamic testing. We're looking at what servers say they do, not what they actually do at runtime.
| Severity | Count | Examples |
|---|---|---|
| CRITICAL | 7 | Invisible Unicode, private keys, curl pipe bash |
| HIGH | 38 | SSRF-capable tools, code execution patterns, data dump capabilities |
| MEDIUM | 24 | Connection strings (often examples), broad file access |
| LOW | 12 | Overly permissive tool descriptions |
Total: 81 findings across 65 servers. Some servers triggered multiple patterns.
Tools with webfetch capabilities that can request arbitrary URLs. In an agent context, this means the AI can be instructed to fetch internal endpoints, cloud metadata APIs, or exfiltration URLs.
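A minimal guard against this class of SSRF looks like the sketch below: reject literal internal IPs and known cloud metadata hosts before the agent is allowed to fetch. The host list and function name are examples; a production guard should also resolve DNS and re-check the resulting IP:

```python
import ipaddress
from urllib.parse import urlparse

# Example metadata endpoints; extend for your cloud environment.
METADATA_HOSTS = {"169.254.169.254", "metadata.google.internal"}

def is_ssrf_risk(url: str) -> bool:
    """Flag URLs pointing at internal or metadata targets."""
    host = urlparse(url).hostname or ""
    if host in METADATA_HOSTS:
        return True
    try:
        ip = ipaddress.ip_address(host)
        return ip.is_private or ip.is_loopback or ip.is_link_local
    except ValueError:
        return False  # hostname rather than a literal IP address
```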
curl|sh install patterns and exec capabilities in tool descriptions. An AI agent following these instructions would execute arbitrary shell commands without human review.
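A pattern in the spirit of this check can be a single regex over tool descriptions and READMEs. This is an illustrative rule, not ClawGuard's actual pattern:

```python
import re

# Matches `curl ... | sh`, `wget ... | bash`, with optional flags and sudo.
CURL_PIPE_SH = re.compile(
    r"\b(?:curl|wget)\b[^\n|]*\|\s*(?:sudo\s+)?(?:ba|z|da)?sh\b",
    re.IGNORECASE,
)

def flags_curl_pipe(text: str) -> bool:
    return bool(CURL_PIPE_SH.search(text))
```

Downloading to a file and inspecting it before execution does not trip the rule, which is exactly the behavior you want to nudge users toward.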
DevTools and Playwright-based servers with unrestricted page access. These tools can navigate to any URL, execute JavaScript, and extract page content — a powerful exfiltration vector.
Connection strings, API keys, and tokens found in READMEs and tool descriptions. Most are example strings in documentation — but automated tools and agents don't distinguish examples from real credentials.
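Detecting these leaks is straightforward pattern matching. The sketch below shows a small illustrative subset of credential patterns; real scanners carry far more, plus entropy checks to cut down on example-string false positives:

```python
import re

# Illustrative subset of credential patterns.
CRED_PATTERNS = {
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "conn_string": re.compile(
        r"\b(?:postgres(?:ql)?|mysql|mongodb)://[^\s:@]+:[^\s@]+@", re.IGNORECASE
    ),
    "aws_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}

def find_credentials(text: str) -> list[str]:
    """Return the names of every credential pattern that matches `text`."""
    return [name for name, pat in CRED_PATTERNS.items() if pat.search(text)]
```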
Webhook URLs and tool descriptions that explicitly mention "dump all data" capabilities. Combined with an LLM that follows instructions literally, this is a direct data loss vector.
Zero-width joiner characters found in tool descriptions. Whether this is an encoding artifact or intentional, it's exactly how prompt injection attacks hide instructions from human reviewers. The text looks normal to a person, but the model processes hidden tokens that can alter behavior.
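Checking for this yourself takes only a few lines: every character a human cannot see but a tokenizer will happily process falls in Unicode category Cf (format controls). A minimal detector, with an assumed function name:

```python
import unicodedata

def invisible_chars(text: str) -> list[tuple[int, str]]:
    """Return (index, codepoint name) for characters invisible to a human
    reader: Unicode format controls (category Cf), e.g. the zero-width joiner."""
    return [
        (i, unicodedata.name(ch, f"U+{ord(ch):04X}"))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]
```

Any non-empty result in a tool description deserves a manual look, whatever the cause.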
Multiple servers instruct users to pipe curl output directly to sh. This is already risky for humans — in an MCP context, an AI agent could execute this without any human review step, giving the remote server arbitrary code execution on the host machine.
A BEGIN PRIVATE KEY block was found in documentation. Likely a template placeholder, but automated tools and AI agents don't distinguish placeholders from real secrets. An agent ingesting this README could leak the key in a response or log.
Tool descriptions that explicitly mention "dump all data" capabilities. When an LLM reads a tool description that says "dump all data to webhook," it treats that as a legitimate capability — and may invoke it when a user asks to "export everything."
A webfetch tool isn't inherently dangerous, but it is a capability that needs guardrails.

Want to check your own MCP server? Generate a free EU AI Act compliance report that maps findings to Art. 9, 13, 15, and 61.

Before you connect an MCP server to a production agent, read its tool descriptions. If a tool says it can fetch arbitrary URLs or execute commands, decide whether your agent actually needs that capability.
Run a scanner like ClawGuard on tool descriptions and agent inputs. Static analysis won't catch everything, but it catches the low-hanging fruit — hidden Unicode, known injection patterns, credential leaks.
Servers with webfetch, exec, or browser automation tools are high-privilege. Treat them like you'd treat a third-party npm package with postinstall scripts — review before you trust.
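One way to operationalize that review is a capability gate: before wiring MCP tools into an agent, drop any tool whose name or description mentions a high-privilege capability you haven't explicitly approved. This is a hedged sketch; the keyword list and the dict shape of a tool entry are assumptions, not part of the MCP spec:

```python
# Example keywords marking high-privilege capabilities.
HIGH_PRIVILEGE = ("fetch", "exec", "shell", "browser", "navigate", "dump")

def gate_tools(tools: list[dict], approved: set[str]) -> list[dict]:
    """Keep a tool only if it is low-privilege or explicitly approved by name."""
    kept = []
    for tool in tools:
        desc = (tool.get("name", "") + " " + tool.get("description", "")).lower()
        risky = any(kw in desc for kw in HIGH_PRIVILEGE)
        if risky and tool.get("name") not in approved:
            continue  # high-privilege and not approved: drop it
        kept.append(tool)
    return kept
```

Defaulting to deny and approving by name mirrors how you would pin and audit an npm dependency rather than trusting whatever the registry serves.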
Article 15 requires "appropriate levels of accuracy, robustness, and cybersecurity." Scanning your MCP stack and documenting the results is a concrete compliance step — and the August 2026 deadline is approaching.
Scan your MCP server inputs for free — no account needed.