
The Shared Responsibility Model for MCP Security: Why “Not My Problem” Doesn’t Work

March 26, 2026 · Joerg Michno · 12 min read

We filed 32 security advisories to MCP server maintainers over the past month. The projects span 280,000+ GitHub stars collectively — from database connectors to browser automation tools to code generation servers. Every advisory included a proof-of-concept, a clear attack scenario, and suggested mitigations.

The response rate was encouraging. Most maintainers engaged constructively. But a recurring pattern emerged in a subset of responses: deflection. Not malicious, not lazy — just a genuine belief that the security issue belonged to someone else's layer.

This is the same problem cloud computing solved a decade ago with the shared responsibility model. AWS doesn't tell customers "that's your problem" when they get hacked, and customers don't tell AWS "you should have prevented it." Instead, both parties understand exactly which security obligations fall on which layer. The MCP ecosystem needs the same clarity.

MCP Ecosystem — March 2026

MCP Servers Indexed 11,500+
Advisories Filed 32
Vulnerability Rate 7.2% (Queen's U study)
Avg. Response Time 3–7 days

Table of Contents

The 4 Pushback Patterns · The MCP Shared Responsibility Model · Why Each Layer Matters · OWASP Alignment · What MCP Server Developers Should Do · Conclusion

The 4 Pushback Patterns

These patterns are composites drawn from multiple advisory responses. We've anonymized the specific maintainers because the goal is to analyze the arguments, not to call out individuals. Every maintainer we interacted with was acting in good faith.

#1 — “The Attacker Already Has Access”

“If the attacker has local access to the machine, they can already do anything. The MCP server doesn't expand the attack surface. The LLM layer should handle this. Credentials aren't exposed. Client confirmations exist.”

This argument sounds reasonable until you examine what "local access" actually means in the MCP context. A user running an LLM client with an MCP server connected has authorized the LLM to call tools. That is not the same as giving an attacker unrestricted shell access to the machine.

Consider the attack chain: A prompt injection arrives via a third-party data source — a web page, a document, a database record. The LLM processes it. The injected instruction calls an MCP tool. The MCP server executes it with whatever permissions it holds.

The user didn't "give the attacker access." The user gave the LLM access, and the attacker hijacked the LLM's agency through indirect prompt injection. This is precisely the scenario described under OWASP LLM01: Prompt Injection, which explicitly covers indirect injection delivered through external content.

The defense-in-depth counter: Each layer should minimize the damage an attacker can cause if the layer above is compromised. An MCP server that validates inputs, enforces least privilege, and sanitizes outputs limits what a hijacked LLM can do — even if the attacker "already has access" to the LLM session.

#2 — “It’s Standard Industry Practice”

“This installation method is standard industry practice. curl|bash is how tools are installed. This is not a security issue. Closing as invalid.”

The prevalence of a practice does not make it secure. curl|bash downloads a script over HTTPS and pipes it directly into a shell. If the server is compromised, the CDN is poisoned, or a DNS hijack occurs, the user executes arbitrary code with no verification step.

Supply chain attacks are not theoretical. The SolarWinds attack pushed a trojanized update to roughly 18,000 organizations through a trusted update channel. The xz Utils backdoor (CVE-2024-3094) was embedded in a widely used compression library by a long-term contributor. The npm event-stream incident injected a cryptocurrency-stealing payload into a package with around two million weekly downloads.

Every one of these attacks exploited "standard practice" — trusting the supply chain without verification.

The supply chain counter: Signed packages, checksum verification, and reproducible builds exist for a reason. MCP servers distributed via curl|bash should at minimum publish a checksum so users can verify what they are about to run. The OWASP MCP Top 10 includes MCP09: Inadequate Sandboxing, which covers supply-chain exposure of exactly this kind.
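The verification step is genuinely cheap. As a minimal sketch (the digest value an installer publishes is an assumption here, and a real install script would fetch both artifact and digest over separate trusted channels), checking a downloaded payload against a published SHA-256 digest before executing anything looks like this:

```python
# Verify a downloaded installer against its published SHA-256 digest
# before running it, instead of piping curl straight into a shell.
import hashlib
import hmac

def verify_sha256(payload: bytes, expected_hex: str) -> bool:
    """Return True only if payload hashes to the published digest."""
    actual = hashlib.sha256(payload).hexdigest()
    # Constant-time comparison; avoids acting on a partially matching digest.
    return hmac.compare_digest(actual, expected_hex.lower())
```

If the CDN is poisoned or DNS is hijacked, the tampered payload fails this check and the install aborts, which is exactly the verification step curl|bash skips.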

#3 — “It’s the LLM’s Responsibility”

“This is no longer an issue with this MCP server. It's a problem wherever LLM input is involved. The LLM should filter its own inputs.”

This is the most common deflection pattern, and the most dangerous one. It assumes that one layer — the LLM — can solve the entire security problem alone. It cannot.

LLMs are fundamentally incapable of perfectly distinguishing instructions from data. This is the core challenge described in OWASP LLM01: Prompt Injection. The instruction/data boundary in natural language is ambiguous by design. Even the best models with the strongest system prompts can be manipulated through multi-step, context-aware attacks.

Saying "the LLM should handle it" is equivalent to saying "the firewall should catch everything, so we don't need input validation in the application." No security architect would accept that argument for web applications. The same principle applies here.

The layered defense counter: Input validation at the MCP server layer catches attacks before they reach the LLM. Output sanitization at the MCP layer prevents the LLM from receiving poisoned tool results. Each layer does what it can, and the cumulative effect is a significantly harder target. The MCP server is the last line of defense before a tool executes a real-world action.
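Output sanitization at the MCP layer can start as a simple pattern pass over tool results before they are returned to the LLM. The sketch below is illustrative only; the patterns are a tiny sample and no fixed list is sufficient on its own, but even a small filter removes the most common low-effort injections:

```python
# Illustrative output-sanitization pass for an MCP server: neutralize text
# in tool results that looks like instructions aimed at the LLM.
import re

SUSPECT_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"<\s*system\s*>", re.I),
]

def sanitize_tool_output(text: str) -> str:
    """Replace instruction-like spans in tool output with an inert marker."""
    for pattern in SUSPECT_PATTERNS:
        text = pattern.sub("[removed: possible injected instruction]", text)
    return text
```

Replacing rather than silently deleting keeps the tampering visible to the user and to audit logs.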

#4 — “Already Known / Duplicate”

“This is a duplicate of an existing issue. Please search existing issues before filing. Closing.”

This response is sometimes entirely appropriate. If a maintainer has already acknowledged the vulnerability and is tracking it, pointing to the existing issue is the right thing to do.

However, we observed cases where "duplicate" was used to close reports without linking to the original issue, without confirming the vulnerability was being tracked, and without engaging with the specific attack scenario presented. In those cases, "duplicate" functions as a dismissal mechanism rather than a triage decision.

The good-faith counter: If an advisory is truly a duplicate, link to the tracking issue. Confirm the severity assessment. Acknowledge the reporter's effort. The OWASP Vulnerability Disclosure Cheat Sheet recommends exactly this: even duplicate reports deserve a substantive response.

The MCP Shared Responsibility Model

Cloud computing solved the "not my problem" deadlock by defining who is responsible for what. AWS secures the infrastructure; the customer secures their data and configurations. Both sides know their obligations. The MCP ecosystem needs the same framework.

USER LAYER
Awareness · Trusted sources only · Review tool permissions · Monitor LLM actions
CLIENT / HOST LAYER
User confirmation prompts · Sandboxing · Permission boundaries · Rate limiting · Audit logging
MCP SERVER LAYER
Input validation · Output sanitization · Least privilege · Tool description integrity · No credential leakage
MODEL PROVIDER LAYER
Instruction/data separation · System prompt isolation · Injection resistance training · Safety filters

Data flows from the user through the client, into the LLM, out to the MCP server, and back. Every layer the data touches is a layer that can — and should — apply security controls.

Why Each Layer Matters

Model Provider Layer

The model provider's job is to make the LLM as resistant to prompt injection as possible. This includes training for instruction/data separation, isolating system prompts from user input, and providing safety filters. But even the best models cannot guarantee perfect separation — this is an open research problem. The model layer reduces the probability of a successful injection. It does not eliminate it.

MCP Server Layer

This is where most of the pushback occurs — and where the most practical mitigations are available. MCP servers execute real-world actions: reading files, querying databases, sending requests, modifying configurations. The server layer should:

  - Validate every tool input (types, ranges, formats) before acting on it
  - Sanitize outputs that originate from external sources
  - Run with the least privilege its tools actually need
  - Keep tool descriptions static and tamper-evident
  - Never leak credentials through tool results or error messages

Client / Host Layer

The LLM client (Claude Desktop, Cursor, a custom agent) controls what the LLM can access and what requires user confirmation. This layer should enforce permission boundaries, require explicit user approval for sensitive operations, sandbox tool execution, and maintain audit logs. OWASP LLM06: Excessive Agency specifically addresses the risks of clients that grant too much autonomy.
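A client-side confirmation gate can be as simple as a dispatcher that refuses sensitive tool calls unless the user approves. Everything below (the tool names, the callback shapes) is an assumption for illustration, not the API of any real client:

```python
# Client-layer sketch: gate state-changing tool calls behind user approval.
# Tool names and callback signatures are hypothetical.
from typing import Any, Callable

SENSITIVE_TOOLS = {"write_file", "run_shell", "send_request"}

def dispatch_tool(name: str, args: dict,
                  call: Callable[[str, dict], Any],
                  confirm: Callable[[str], bool]) -> Any:
    """Invoke a tool, requiring explicit approval for sensitive operations."""
    if name in SENSITIVE_TOOLS and not confirm(f"Allow {name} with {args}?"):
        return {"error": "denied by user"}
    return call(name, args)
```

The important design choice is that the gate sits in the client, outside the LLM's reach: an injected prompt can request a sensitive call, but it cannot answer the confirmation dialog.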

User Layer

Users are the final gatekeepers. They choose which MCP servers to install, which permissions to grant, and whether to approve tool calls. User awareness is not a substitute for technical controls — but it is the last line of defense when all technical layers fail. Users should install MCP servers only from trusted sources, review the permissions each server requests, and monitor what tools the LLM is actually calling.

OWASP Alignment

The shared responsibility model maps directly to the OWASP Top 10 for LLM Applications and the emerging OWASP MCP Top 10:

OWASP Item | Responsible Layer(s) | Pushback Pattern
LLM01: Prompt Injection | Model + MCP Server + Client | #3 "LLM's responsibility"
LLM05: Improper Output Handling | MCP Server + Client | #1 "Attacker already has access"
LLM06: Excessive Agency | Client + MCP Server | #1 "Attacker already has access"
MCP01: Tool Poisoning | MCP Server | #3 "LLM's responsibility"
MCP09: Inadequate Sandboxing | Client + MCP Server | #2 "Standard practice"

Every OWASP item involves multiple layers. No single layer owns the entire problem. The maintainers who deflect to another layer aren't wrong that the other layer has responsibilities — they're wrong that this absolves their own layer.

What MCP Server Developers Should Do

If you maintain an MCP server, here are concrete steps to fulfill your layer's responsibilities:

  1. Validate tool inputs. Check argument types, ranges, and formats. Reject inputs containing injection patterns. A regex-based scan adds less than 10ms of latency and catches the most common attacks.
  2. Sanitize tool outputs. If your server returns data from external sources (databases, web pages, files), strip or escape content that could be interpreted as LLM instructions.
  3. Apply least privilege. If your tool only needs read access, don't request write access. If it only needs access to one directory, don't request filesystem-wide access.
  4. Protect your tool descriptions. Ensure tool metadata is static and cannot be manipulated at runtime. Tool poisoning (MCP01) is a real attack vector.
  5. Handle security reports constructively. Acknowledge the report. Link duplicates. Assess severity. Respond within a reasonable timeframe. The OWASP Vulnerability Disclosure Cheat Sheet provides a framework.
  6. Scan your server. Run a security scanner against your tool descriptions and common input patterns. Automated scanning catches vulnerabilities before your users find them — or an attacker does.
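Step 4 can be made mechanical: fingerprint the tool registry at startup and compare before serving requests, so any runtime tampering with descriptions becomes detectable. The registry shape below is an assumption; a real server would hash whatever structure holds its tool metadata:

```python
# Sketch for step 4: detect runtime tampering with tool descriptions
# by comparing against a fingerprint recorded at startup.
import hashlib
import json

def fingerprint(tools: dict) -> str:
    """Stable hash over tool metadata (names, descriptions, schemas)."""
    canonical = json.dumps(tools, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()

def check_integrity(tools: dict, baseline: str) -> bool:
    """True if the tool registry still matches the startup fingerprint."""
    return fingerprint(tools) == baseline
```

A server that refuses to serve requests when the check fails turns tool poisoning (MCP01) from a silent attack into a loud failure.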

The cost of doing nothing: The MCP ecosystem has grown from a few hundred servers to 11,500+ in under a year. The Queen's University study found that 7.2% of MCP servers have at least one security vulnerability. As adoption grows, so does the incentive for attackers. The window for proactive security is closing.

Conclusion

The four pushback patterns we've encountered are not unique to MCP. They're the same arguments web application developers made in the early 2000s: "the firewall will catch it," "it's the user's fault," "that's how everyone does it." The industry moved past those arguments by establishing shared responsibility models, secure development frameworks, and vulnerability disclosure programs.

The MCP ecosystem is at that inflection point now. With 11,500+ servers and growing, the surface area for prompt injection, tool poisoning, and privilege escalation is expanding faster than any single layer can address alone.

Security is not a zero-sum game between layers. It's a cumulative defense. Each layer that does its part makes the entire stack harder to exploit. The question is not "whose problem is this?" The answer is: it's everyone's.

Scan Your MCP Server for Free

ClawGuard Shield detects prompt injection, tool poisoning, and privilege escalation patterns in under 10ms. 245 patterns across 15 languages. No account required.
