MCP server security hit a wall in Q1 2026: 30 public CVEs in 60 days, 82% of 2,614 scanned implementations vulnerable to path traversal, 492 public servers running zero auth. Gate every MCP server against 12 controls before deploy — OAuth 2.1 with resource-bound tokens, per-tool capability scoping, filesystem and network jails via Docker MCP Gateway, per-tenant CPU/memory caps, structured audit logging with prompt/secret redaction, and a kill-switch for runaway invocations. A 7B model running behind a hardened gateway is safer than a flagship model behind a hobby Flask app.
Two weeks ago, a client asked me to audit the MCP servers they'd stood up for their internal agent platform. Twenty-three servers, most of them built in the last quarter, exposing tools for databases, GitHub, Jira, their customer data warehouse, and — we found out on day two — a shell execution endpoint that had been scaffolded as a proof-of-concept and never removed. None of them required authentication. Four were reachable from the public internet through a Cloudflare tunnel that somebody forgot to restrict.
That client is not unusual. In the 60 days between January 1 and February 28, 2026, at least 30 CVEs landed against MCP-based servers. An Equixly scan of public MCP implementations found 82% of 2,614 servers vulnerable to path traversal and 492 running with zero authentication. OWASP published its MCP Top 10 hardening guide in March. Docker shipped the MCP Gateway specifically to put a hardened shim in front of the reference servers teams had already deployed.
The protocol is not the problem. The servers people ship are. This post is the 12-item checklist we now gate every MCP deployment against — with the OAuth, sandboxing, multi-tenancy, and audit logging patterns that close the gaps we see most often. If you read our MCP developer guide for the build side, read this for the "do not get on the next Equixly scan" side.
Why MCP Security Hit a Wall in 60 Days
Three forces converged in Q1 2026, and they are worth naming because the fix for each is different.
Adoption outran hardening. MCP went from Anthropic's internal experiment to 97M monthly SDK downloads and 13K+ public servers between November 2024 and March 2026. The reference TypeScript and Python SDKs are excellent demo code — and teams shipped them unchanged into production. The default stdio examples run with host permissions. The default HTTP examples expose ports with no auth. The default filesystem tools accept raw paths.
Security researchers caught up. Equixly, Orca, Cyera, and Horizon3.ai all pointed scanners at public MCP endpoints in early 2026. Path traversal, SSRF, prompt-injection-to-RCE chains through LLM-as-oracle patterns (the same class that produced the n8n CVE-2026-21858 RCE chain), and unauthenticated tool execution filled the disclosure queue. A typical scan now finds multiple classes of issue per server.
The trust boundary is new. MCP servers sit between an LLM (producing attacker-influenced tool calls through prompt injection) and real systems (databases, file systems, shells). Traditional API gateways assume predictable clients. MCP clients are adversarial by construction because anything a user says can, through the right prompt, become a tool call. This is the threat model Microsoft's ZT4AI framework tries to formalize, and the one the LangChain CVE audit proved most teams still model incorrectly.
The implication: MCP security cannot be bolted on by the identity provider or the load balancer. It needs to live inside the server, at the invocation boundary, on every tool. The checklist below is how we enforce that.
The 12-Item MCP Server Hardening Checklist
Every item is a gate. We refuse to promote an MCP server to production traffic if any of these are red.
Items 1-4 are identity. Items 5-9 are execution. Items 10-11 are operations. Item 12 is the thing that lets you sleep. Let me walk through the three areas where most teams have the biggest gaps.
| # | Control | Gate |
|---|---|---|
| 1 | Transport | Streamable HTTP only; SSE deprecated; stdio behind host auth |
| 2 | OAuth 2.1 + PKCE | Enforced on every endpoint; no bearer-only fallbacks |
| 3 | Resource-bound tokens (RFC 8707) | Audience validated; cross-server tokens rejected |
| 4 | Per-tool OAuth scopes | User consents per tool, not per server |
| 5 | Input validation on every tool parameter | Typed schemas + explicit allowlists for paths, URLs, commands |
| 6 | Capability scoping | Each tool declares FS/net/shell needs; defaults deny-all |
| 7 | Rootless sandbox per invocation | Read-only FS, dropped caps, seccomp, no-new-privileges |
| 8 | Network egress allowlist | DNS + hostnames per tool; deny by default |
| 9 | Per-invocation budget | CPU, memory, wall-clock, token limits with hard kill |
| 10 | Per-tenant isolation | Separate container, secrets, and audit stream per tenant |
| 11 | Structured audit logging with redaction | Envelope logged 100%; payloads sampled; PII scrubbed at sink |
| 12 | Kill switch + rate limits at the gateway | Tool-level and tenant-level circuit breakers |
OAuth 2.1 and the Three Token-Binding Mistakes
The June 2025 MCP spec formalized OAuth 2.1 with PKCE for Streamable HTTP transport. The March 2026 revision — following the token confusion issues surfaced in late 2025 — added mandatory resource-indicator binding per RFC 8707. Most servers we audit get the base handshake right and the binding wrong.
```python
# FastMCP — enforce audience claim on every request
from fastmcp import FastMCP
from fastmcp.auth import BearerAuth

mcp = FastMCP(
    "prod-db-server",
    auth=BearerAuth(
        jwks_uri="https://idp.example.com/.well-known/jwks.json",
        issuer="https://idp.example.com/",
        audience="https://mcp.prod.example.com/",  # REQUIRED — reject others
        algorithms=["RS256"],
    ),
)
```

Mistake 1: No audience validation
A JWT or opaque token from your identity provider is not proof it was minted for this server. Without audience validation, a token issued for an analytics MCP server works against your production database MCP server. This is the cross-server pivot. If audience is missing or set to a wildcard, you are one stolen token away from a lateral incident.
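The check itself is small; the failure mode is skipping it. Here is a minimal sketch of the audience validation every request must pass, assuming the JWT signature has already been verified against the IdP's JWKS (`EXPECTED_AUDIENCE` and the claim shapes are illustrative, not a specific library's API):

```python
EXPECTED_AUDIENCE = "https://mcp.prod.example.com/"  # this server's own identity

def validate_audience(claims: dict) -> None:
    """Reject any token not minted for this server (RFC 8707 resource binding)."""
    aud = claims.get("aud")
    # Per the JWT spec, "aud" may be a string or a list; normalize to a list
    audiences = [aud] if isinstance(aud, str) else (aud or [])
    if EXPECTED_AUDIENCE not in audiences:
        # A token minted for the analytics server must not work here
        raise PermissionError(f"token audience {audiences!r} does not include this server")
```

A wildcard or missing `aud` falls into the `raise` branch, which is exactly the fail-closed behavior the cross-server pivot requires.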
Mistake 2: PKCE not enforced on dynamic client registration
MCP's dynamic client registration is convenient for IDE plugins and desktop clients that spin up on demand. It is also a registration endpoint that, without PKCE enforcement and client attestation, will register any caller. A malicious client registers, intercepts an authorization code through a redirect-URI trick, and replays it. Refuse registration without a code_challenge_method of S256, and lock redirect URIs to an explicit allowlist per client class.
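The registration gate reduces to two predicates. A sketch of that gate, following the rule above (the request fields and the per-client-class allowlist structure are hypothetical, not taken from RFC 7591 verbatim):

```python
# Hypothetical allowlist: each client class pins its exact redirect URIs
ALLOWED_REDIRECT_URIS = {
    "ide-plugin": {"http://127.0.0.1:33418/callback"},
    "desktop-client": {"myapp://oauth/callback"},
}

def accept_registration(req: dict) -> bool:
    """Refuse registration unless PKCE is S256 and every redirect URI is pinned."""
    if req.get("code_challenge_method") != "S256":
        return False  # "plain" or missing PKCE: refuse outright
    allowed = ALLOWED_REDIRECT_URIS.get(req.get("client_class"), set())
    uris = req.get("redirect_uris", [])
    # An empty URI list or any URI outside the pinned set is a refusal
    return bool(uris) and all(uri in allowed for uri in uris)
```

The deny-by-default shape matters more than the field names: an unknown client class gets an empty allowlist, so nothing it offers can pass.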
Mistake 3: Long-lived refresh tokens in process memory
Refresh tokens are durable credentials. They belong in an encrypted secrets backend (Vault, AWS Secrets Manager, a sealed KMS-backed cache), not in the agent process heap. Any tool that executes user-influenced code — and in MCP, all tools do — can dump memory through the right sequence. Rotate refresh tokens on every use, cap their lifetime at 24 hours for high-privilege scopes, and never log them. For the broader client-facing pattern of how these tokens flow from IDE to server, the MCP vs API comparison covers the handshake sequence; this section is what the server side has to enforce.
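The storage contract is easier to see in code than in prose. A toy in-memory stand-in for the encrypted backend (Vault or a KMS-backed cache would replace the dict; class and method names are illustrative) that enforces single-use reads and the 24-hour cap:

```python
import time

class RefreshTokenStore:
    """Stand-in for an encrypted secrets backend; the dict is NOT a real backend."""
    MAX_LIFETIME_S = 24 * 3600  # cap for high-privilege scopes

    def __init__(self):
        self._store = {}  # token_id -> (token, issued_at)

    def put(self, token_id: str, token: str) -> None:
        self._store[token_id] = (token, time.time())

    def take(self, token_id: str) -> str:
        """Single-use read: the token is removed on retrieval, so the caller
        must put() the rotated replacement the IdP hands back."""
        token, issued_at = self._store.pop(token_id)  # KeyError if already used
        if time.time() - issued_at > self.MAX_LIFETIME_S:
            raise PermissionError("refresh token past 24h lifetime cap")
        return token
```

Because `take()` removes the token, a replayed read raises instead of silently succeeding, which is the rotate-on-every-use property in miniature.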
Tool Sandboxing: Capability Scoping and Docker MCP Gateway
The single highest-leverage hardening you can ship is a sandbox per tool invocation. The default MCP server runs every tool in the same process as the MCP router, which means a prompt-injection-triggered read_file("/etc/shadow") hits the same filesystem as your OAuth token cache.
Docker MCP Gateway, FastMCP's Sandbox primitive, and the Pomerium MCP proxy all implement the same pattern: each invocation spawns a rootless container with the minimum capabilities the declared tool needs, runs the tool inside, and returns the result. The overhead is 40-80ms per call on warm runtimes — cheap insurance.
```python
from fastmcp import FastMCP, Tool
from fastmcp.sandbox import DockerSandbox, Capabilities

mcp = FastMCP("file-ops")

@mcp.tool(
    name="read_project_doc",
    sandbox=DockerSandbox(
        image="mcp-file-ops:1.4.2",
        capabilities=Capabilities(
            filesystem=["/srv/projects/{tenant_id}/docs:ro"],  # read-only, tenant-scoped
            network=[],  # NO network
            shell=False,
            cpu="500m",
            memory="512Mi",
            timeout_seconds=10,
            token_budget=5000,
        ),
    ),
)
def read_project_doc(path: str, tenant_id: str) -> str:
    # path validation happens inside the sandbox; traversal attempts fail closed
    ...
```

The declare-then-deny pattern
Every tool in our production servers declares what it needs at registration time. The sandbox refuses anything not declared. Three things to notice. The filesystem mount is read-only and tenant-scoped — a traversal attempt lands on an empty filesystem, not /etc/passwd. Network is empty, which kills DNS exfiltration even if the tool gets hijacked. CPU, memory, wall-clock, and token budgets are hard limits with SIGKILL at the threshold — the runaway-agent-spends-$4K-in-90-minutes scenario becomes impossible.
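Stripped of any framework, the declared capabilities above map almost one-to-one onto `docker run` flags. A sketch that assembles that argv (the seccomp profile path, mount paths, and image tag are placeholders; this assumes a rootless Docker daemon):

```python
def sandbox_argv(tenant_id: str, image: str = "mcp-file-ops:1.4.2") -> list:
    """docker run flags implementing checklist item 7: rootless, read-only, deny-all."""
    return [
        "docker", "run", "--rm",
        "--read-only",                                # read-only root filesystem
        "--cap-drop", "ALL",                          # drop every Linux capability
        "--security-opt", "no-new-privileges",        # block setuid escalation
        "--security-opt", "seccomp=/etc/mcp/seccomp.json",  # placeholder profile path
        "--network", "none",                          # item 8: deny-all egress by default
        "--cpus", "0.5", "--memory", "512m",          # item 9: hard resource caps
        "--mount", f"type=bind,src=/srv/projects/{tenant_id}/docs,dst=/data,readonly",
        image,
    ]
```

Whatever gateway you use, auditing is simple: inspect the running container and confirm each of these flags is present; any missing flag is an undeclared capability.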
Network egress: allowlist hostnames, not CIDRs
The second-most-common exfiltration vector after filesystem is DNS. An attacker who injects a prompt into an email-summarization flow makes the tool resolve exfil.attacker.com/stolen_data_base64 — and unless you lock DNS resolution to an explicit hostname allowlist, the resolution itself leaks the payload before any HTTP request is made. Docker MCP Gateway's allowed-hosts config and Pomerium's egress policy both do this properly.
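The enforcement point has to sit before name resolution, not before the HTTP request. A minimal sketch of that check (the allowlist contents are hypothetical; a real gateway enforces this at the DNS resolver, not in tool code):

```python
from urllib.parse import urlparse

# Hypothetical per-tool allowlist: hostnames, never CIDRs
EGRESS_ALLOWLIST = {"api.github.com", "idp.example.com"}

def check_egress(url: str) -> None:
    """Fail closed BEFORE any DNS lookup happens; the lookup itself can exfiltrate."""
    host = (urlparse(url).hostname or "").lower()
    if host not in EGRESS_ALLOWLIST:
        raise PermissionError(f"egress to {host!r} not declared by this tool")
```

Note the exfiltration URL in the paragraph above never reaches a socket: the string comparison rejects `exfil.attacker.com` without resolving it.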
Multi-Tenant Isolation: One Container Per Tenant, Always
Multi-tenant MCP servers are where we see the worst incidents. The tempting design — one shared server process, claims-based authorization inside each tool — has leaked cross-tenant data in three engagements I worked in 2026 alone. The pattern that works:
- One container per tenant, never a shared server process with claims checks inside each tool.
- One secrets mount per tenant: mount `/vault/{tenant_id}/` at invocation time, unmount on exit.
- One audit stream per tenant, matching checklist item 10.

The Prefactor deep-dive on multi-tenant MCP lays out the same architecture with more operational detail. If you are running MCP for more than one customer, read it after you finish here.
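Assuming a container-per-tenant orchestrator, the spec handed to it can be sketched as follows (field names are illustrative, not a real Docker or Kubernetes API; the validation line matters because `tenant_id` is interpolated into mount paths):

```python
import re

def tenant_container_spec(tenant_id: str) -> dict:
    """One container, one secrets mount, one audit stream per tenant; nothing shared."""
    # tenant_id feeds filesystem paths: validate strictly before interpolating
    if not re.fullmatch(r"[a-z0-9-]{1,63}", tenant_id):
        raise ValueError(f"invalid tenant_id: {tenant_id!r}")
    return {
        "container": f"mcp-{tenant_id}",
        "secrets_mount": f"/vault/{tenant_id}/",  # mounted at invocation, unmounted on exit
        "audit_stream": f"audit.{tenant_id}",     # separate sink per tenant
    }
```

A `tenant_id` of `../admin` fails the regex instead of escaping the vault prefix, which is the same fail-closed posture as the sandbox mounts.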
Audit Logging That Survives Incident Response
When an incident hits at 3am — a tenant reports "an agent accessed data it shouldn't have" — your logs either answer the question or they don't. Most MCP servers I audit don't.
Log the envelope, sample the payload
Log 100% of invocation envelopes: tenant ID, agent ID, tool name, parameter schema hash, timestamp, duration, exit code, token count, CPU/memory high-water mark. That is maybe 400 bytes per call and it is what incident response actually needs — "did tenant A's agent call the query_customer_data tool between 2:47 and 2:51?" Sample payloads separately. 1-5% tail-based sampling plus 100% capture of errors and tool-call failures gives you the debugging surface without a 10x storage bill. Keep the payload stream on a 72-hour retention with break-glass access.
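A sketch of what one envelope line can look like (field names are illustrative; note the parameter *keys* are hashed and the values never appear, so the envelope is safe to retain at 100%):

```python
import hashlib
import json
import time

def invocation_envelope(tenant_id: str, agent_id: str, tool: str,
                        params: dict, duration_ms: int, exit_code: int,
                        tokens: int) -> str:
    """~400-byte envelope logged for every call; payloads are sampled separately."""
    return json.dumps({
        "ts": time.time(),
        "tenant_id": tenant_id,
        "agent_id": agent_id,
        "tool": tool,
        # Hash of the parameter shape only; raw values never enter this stream
        "param_schema_hash": hashlib.sha256(
            json.dumps(sorted(params)).encode()).hexdigest()[:16],
        "duration_ms": duration_ms,
        "exit_code": exit_code,
        "tokens": tokens,
    })
```

This is enough to answer the 3am question ("did tenant A's agent call `query_customer_data` between 2:47 and 2:51?") with a grep over tenant, tool, and timestamp.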
Redact at the sink, not at the source
Redacting prompts at the application layer is brittle; you will miss cases. Put a redaction pass at the log sink — Vector, Fluent Bit, or a Loki pipeline — that scrubs known-pattern secrets (API keys, tokens, credit cards, SSNs) before the data lands in durable storage. If the tenant has opted in under a DPA to full-content capture for debugging, route their stream to a separate bucket and enforce retention separately.
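The sink-side pass is pattern substitution, whatever tool runs it. A sketch in Python of the kind of scrub rules a Vector or Fluent Bit pipeline would express in its own config language (the patterns are illustrative shapes, not exhaustive detectors):

```python
import re

# Known-pattern secrets scrubbed at the sink, before durable storage
REDACTIONS = [
    (re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"), "[REDACTED_API_KEY]"),  # API-key shape
    (re.compile(r"\beyJ[\w-]+\.[\w-]+\.[\w-]+\b"), "[REDACTED_JWT]"),  # JWT shape
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[REDACTED_SSN]"),
    (re.compile(r"\b(?:\d[ -]?){13,16}\b"), "[REDACTED_PAN]"),  # card-number shape
]

def scrub(line: str) -> str:
    """Apply every redaction rule in order; run at the log sink, not the app."""
    for pattern, replacement in REDACTIONS:
        line = pattern.sub(replacement, line)
    return line
```

Because this runs in the pipeline rather than in 23 different servers, a new secret pattern is one rule added in one place.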
Streaming traces need sampling that doesn't drop incidents
OpenTelemetry GenAI semantic conventions are stabilizing, and tail-based sampling via an OTel Collector is the pattern that works: buffer the full trace, keep it only if there was an error, a latency outlier, a tool-call failure, or if it was sampled into the 1-5% keep-rate. This preserves the "what went wrong" traces without paying for the "everything went fine" ones.
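The keep/drop decision at the tail is a few predicates. A sketch of the policy logic (in a real deployment this lives in the OTel Collector's tail-sampling processor config, not application code; the thresholds here are the illustrative ones from the paragraph above):

```python
import random

BASELINE_KEEP = 0.02  # within the 1-5% keep-rate band

def keep_trace(trace: dict) -> bool:
    """Tail-based decision: buffer the full trace, then keep only interesting ones."""
    if trace.get("error") or trace.get("tool_call_failed"):
        return True                      # 100% of failures survive
    if trace.get("duration_ms", 0) > 10_000:
        return True                      # latency outliers survive
    return random.random() < BASELINE_KEEP  # everything else is sampled
```

The ordering is the point: the deterministic "did something go wrong" checks run before the random sample, so incidents never depend on a coin flip.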
What to Do Monday Morning
If you have MCP servers in production today, run this in order:
1. Probe every endpoint with no credentials: `curl -s -o /dev/null -w "%{http_code}" https://your-mcp-host/mcp` from an unauthenticated client. Anything other than 401 is an incident.
2. Audit the OAuth config on every server: is `audience` set? Is PKCE S256 mandatory? Are refresh tokens in Vault or in process memory? Fix any "no" answers this week.

For the broader context of where MCP fits in a 2026 AI security program, the AI security pillar collects the cluster's other posts — from prompt injection defense to zero-trust frameworks to the LangChain and n8n CVE post-mortems that all share the same root cause as the 30 MCP CVEs: AI frameworks trusting inputs that traditional threat models never contemplated.
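Step 1 scales trivially across an inventory. A small helper (hypothetical; feed it the curl status code per host) that turns the probe results into an incident list:

```python
def triage(status_by_host: dict) -> list:
    """Hosts whose unauthenticated probe returned anything other than 401 are incidents."""
    return sorted(host for host, code in status_by_host.items() if code != 401)
```

A 200 means no auth at all; a 403 means the server is doing its own ad-hoc check instead of the OAuth 2.1 handshake. Both belong on the incident list.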
The Real Insight
MCP didn't get less safe in Q1 2026. It got more visible. The 30 CVEs, the Equixly scan, the OWASP guide, and Docker's gateway release are all signs of a protocol growing up. The servers that survive the next 60 days of scanning are not the ones running the latest reference SDK — they are the ones whose teams treated OAuth, sandboxing, isolation, and audit logging as non-negotiable gates before production.
A 7B model running behind a hardened MCP gateway is strictly safer than a flagship model fronted by a hobby Flask app. The model is not your security boundary. The gateway is. Gate accordingly.
Frequently Asked Questions
Why did MCP server security hit a wall in Q1 2026?
Three forces converged. MCP adoption exploded — 97M monthly SDK downloads and 13K+ public servers by March 2026 — and most implementations treated the reference TypeScript/Python SDKs as production-ready when they were demo code. Security researchers caught up: 30 CVEs landed against MCP servers in the 60 days between January 1 and February 28, 2026, and an Equixly scan found 82% of 2,614 public implementations vulnerable to path traversal and 492 running with zero authentication. The protocol itself is sound; the servers people ship on top of it routinely skip auth, sandboxing, and input validation.



