Cross-Session Context Leakage
Sensitive data leaks between MCP sessions, agents, and users through accumulated context windows, persistent memory features, and shared server state - with no standard mechanism for isolation.
Severity: 7.5/10 (High)
Context leakage is subtle, hard to detect, and gets worse as MCP deployments scale. There's no standard mechanism for isolating session state in MCP, meaning every multi-user or multi-agent deployment is implicitly at risk.
Summary
When you ask an AI agent a question through MCP, a lot of context gets generated: your prompt, the tool calls the agent makes, the data those tools return, any documents or resources pulled in, and the agent's reasoning about all of it. That context is supposed to stay within your session. In reality, it often doesn't.
There's no standard mechanism for isolating context between MCP sessions. How servers handle state, how tool outputs are scoped, whether anything gets cleaned up when a session ends -- all of that is left to the individual implementation. But the servers are only half the picture. The AI agents and clients connecting to those servers accumulate massive amounts of context on their own. Every tool call response, every document retrieved, every database query result -- it all lives in the agent's context window for the duration of the session. The agent becomes a central aggregation point for sensitive data from every MCP server it's connected to, with no boundaries between any of it.
This matters because MCP tool calls frequently involve sensitive data: database query results, internal documents, API responses containing PII, financial data, proprietary information. If any of that persists in server state, gets cached in a shared resource, or accumulates unchecked in an agent's context, you have a data breach that nobody noticed.
What Is the Issue?
Context leakage in MCP happens through several mechanisms. The most immediate risks are the ones happening on every developer and AI-native user's machine right now.
Context Window Accumulation
This is the most common and least discussed form of context leakage. During an MCP session, every tool call response gets added to the AI model's context window. That context persists for the entire conversation and influences everything the model does afterward.
Say you ask your agent to pull a customer record from your database MCP server. The full response (name, email, billing address, payment history) is now in the context. Ten minutes later you ask the agent to draft a Slack message using your Slack MCP server. The model still has that customer data in context. If the agent references it, summarizes it, or even subtly incorporates it into the Slack message, that data has crossed a boundary.
This isn't a bug in any specific MCP server. It's how context windows work. The agent is the aggregation point. It collects outputs from every MCP server it's connected to and holds all of it in one shared context. MCP makes this worse because tool call responses tend to return structured, detailed data: full database rows, complete API responses, entire documents. Not vague summaries. The longer a session runs and the more tools you use, the more sensitive data piles up inside the agent with zero boundaries between any of it.
Tool Output Persistence
Even without explicit caching, tool outputs can leak between sessions through side effects that most developers don't think about:
- Temp files: Tools that write intermediate results to shared temp directories
- Log files: Verbose logging that captures full tool inputs and outputs, accessible to anyone on the machine
- Database connections: Connection pools that carry session state between queries
- Subprocess state: Tools that shell out to processes that maintain their own state
A developer debugging an issue might not realize that their MCP server's debug logs contain the full text of every document they retrieved, every database query result, and every API response, sitting in a plaintext log file indefinitely.
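A low-effort way to close the log-file vector is to log tool call metadata and a payload fingerprint instead of the response body. A minimal sketch (the tool name and response shown are illustrative, not from any real server):

```python
import hashlib
import json
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("mcp-server")

def fingerprint(payload: dict) -> str:
    """Stable, non-reversible digest for correlating responses across log lines."""
    blob = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()[:12]

def log_tool_call(tool_name: str, args: dict, response: dict) -> None:
    """Log what was called and how big the response was -- never the payload itself."""
    log.info(
        "tool=%s args_keys=%s response_bytes=%d response_sha256=%s",
        tool_name,
        sorted(args),               # argument names only, not values
        len(json.dumps(response)),  # size, not content
        fingerprint(response),      # correlatable digest, not content
    )

# The customer record itself never reaches the log file.
log_tool_call("query_customer", {"account_id": "acct_8472"},
              {"name": "Sarah Chen", "email": "sarah.chen@example.com"})
```

The digest still lets operators correlate "same response seen twice" across sessions without retaining the data itself.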
Persistent Memory and Context Features
Some MCP servers and clients implement persistent memory, storing conversation history, user preferences, or retrieved documents across sessions. This is useful for continuity ("remember what we discussed last time") but creates leakage vectors even in single-user setups:
- Context carryover: Previous session data loaded into a new session may contain sensitive information from unrelated conversations
- Unbounded memory growth: Without cleanup, the memory store becomes a growing archive of everything you've ever accessed through MCP or that client
- Resource caching: Documents, files, or API responses cached by the server persist beyond the session that requested them
As Adoption Scales: Shared Server State
The risks above exist on a single developer's machine today. They get significantly worse when MCP servers are shared across users, which is where things are heading as teams centralize MCP infrastructure.
MCP servers are long-running processes. When multiple users connect to the same server, the server's internal state can leak between them. Consider a database MCP server that caches query results:
```python
# Common pattern in MCP servers: module-level cache
query_cache = {}

@mcp.tool()
def query_database(sql: str) -> dict:
    if sql in query_cache:
        return query_cache[sql]  # Returns cached result from ANY previous session
    result = db.execute(sql)
    query_cache[sql] = result
    return result
```

User A queries salary data. User B runs the same query. The cache returns User A's results. The MCP server has no concept of which session a cached result belongs to.
This also applies to multi-agent setups where multiple AI agents share MCP servers. Agent A calls a tool that returns sensitive customer data, the server stores it in working state, and Agent B's subsequent call is influenced by or includes fragments of Agent A's data. There are no guardrails here: if the MCP server doesn't explicitly isolate state per session, nothing will. And most don't.
Root Cause Analysis
The Agent as an Uncontrolled Data Aggregator
While server-side state gets most of the attention, the AI agent's context window is arguably the bigger risk. The agent sits at the center of every MCP interaction -- it's the one calling tools across multiple servers and accumulating all of their responses in a single context. That context is a data store that nothing manages or protects.
Once data enters the agent's context (through tool call responses, resource reads, or prompt templates), it stays there for the duration of the session and influences all subsequent interactions. This means:
- Tool outputs from early in a session are still "visible" to the agent when processing later requests
- Sensitive data returned by one server is available when the agent calls a completely different server
- In long-running sessions, the agent holds a comprehensive record of everything the user accessed across every connected MCP server
- If the agent is compromised or manipulated (via prompt injection, tool poisoning, etc.), it has access to all of that accumulated data at once
No Tenant Isolation Standard
For organizations running MCP servers as shared infrastructure (internal tool servers, centralized gateways), there's no standard for tenant isolation. The protocol doesn't define:
- How to scope tool access per user, team, or organization
- How to prevent cross-tenant data access through shared tools
- How to audit which tenant's data was accessed in each session
- How to enforce data residency or classification requirements per tenant
No Standard for Session Isolation
There's no true standard or clearly defined way for MCP servers to partition state between sessions. No mandated session identifiers to scope data. No requirements for isolating concurrent sessions. No cleanup obligations when a session ends. Servers that want to keep sessions separate have to build that entirely on their own, and most don't, because the single-user local setup doesn't demand it.
Stateful Servers in a Stateless-Assumed World
Many MCP servers are designed as if they handle one user at a time. This works for the common case (single developer, local machine), but breaks the moment the server is shared, whether through a team deployment, a gateway, or a multi-agent orchestrator.
The transition from "my local MCP server" to "our shared MCP server" is where leakage starts, and nothing in the protocol flags this as a risk.
Risk & Impact Analysis
Why It Matters
Context leakage is uniquely dangerous because it's silent and cumulative:
- No visible artifact: Unlike credential theft or data exfiltration, context leakage doesn't create obvious network traffic or access logs. Data moves within the normal flow of tool calls and responses.
- Regulatory exposure: If PII, financial data, or health records leak between sessions, you have a confidentiality and/or compliance violation under GDPR, HIPAA, SOX, or PCI-DSS, even if the data never leaves your infrastructure.
- Privilege escalation by context: A user with read-only access to one dataset might receive data from a higher-privilege user's previous session through cached tool outputs. The access controls on the upstream service were correct, but the MCP layer bypassed them.
- Compounding over time: Every session that runs through a stateful MCP server adds to the accumulated state. The longer the server runs, the more cross-session data it holds, and the higher the leakage risk.
- Multi-agent amplification: In agentic workflows where multiple agents collaborate, each agent's context becomes a potential source of leakage for every other agent in the chain.
Who Is Affected
- Teams sharing MCP server infrastructure (most common exposure)
- Organizations using multi-agent frameworks with shared tool servers
- Users of MCP servers with persistent memory or caching features
- Developers building MCP servers who haven't considered multi-session state isolation
- Enterprises evaluating MCP for production workflows with sensitive data
Proof of Concept: The Persistent Context
Scenario
A developer uses Claude Desktop (or any MCP client) with a few MCP servers connected: a database server for their company's customer data, a Slack server, and a GitHub server. Normal day, normal workflow. The problem is what the agent remembers between tasks, and what the MCP servers hold onto between sessions.
Session 1: Morning -- Customer Escalation
The developer is debugging a billing issue for an escalated customer. They ask the agent to pull the customer record.
User: "Pull up the account details for customer acct_8472"
Agent calls database MCP server → query_customer("acct_8472")
Response now in context:
- Name: Sarah Chen
- Email: sarah.chen@example.com
- Phone: (415) 555-0142
- Billing address: 742 Evergreen Terrace, San Francisco, CA
- Payment method: Visa ending 4829
- Account tier: Enterprise
- Monthly spend: $12,400
- Support history: 3 escalations in 90 days

The developer resolves the issue, writes some notes, and moves on. The full customer record (PII, payment info, and spending data) is still sitting in the agent's context.
Still Session 1: Afternoon -- Different Task Entirely
Later that day, still in the same session, the developer switches to a completely unrelated task.
User: "Draft a Slack message to #engineering about the new API rate limits"
Agent calls Slack MCP server → send_message(channel="#engineering", ...)
The agent still has Sarah Chen's full customer record in context. If the agent references any of that data while drafting the Slack message, even subtly, even accidentally, PII has just been posted to a channel with 200 engineers. The agent didn't mean to leak it. The data was just there, in context, with no boundary between the morning's customer lookup and the afternoon's Slack draft.
Session 2: Next Day -- The Memory Server
The developer's setup includes the MCP Memory server (https://github.com/modelcontextprotocol/servers/tree/main/src/memory) -- the official reference implementation from the MCP project that provides knowledge graph-based persistent memory. It's designed to store entities, relationships, and context across sessions so you can pick up where you left off. When they start a new session the next morning, the memory server loads previous context.
User: "What was I working on yesterday?"
Agent calls memory MCP server → recall_recent()
Response includes:
- "Debugged billing issue for customer acct_8472 (Sarah Chen, Enterprise tier, $12,400/mo)"
- "Drafted API rate limit announcement for #engineering"
Yesterday's customer PII is now in today's context. If the developer shares their screen in a meeting, if the agent references it in an unrelated conversation, if another tool call picks it up, the data has leaked across a session boundary.
Why This Is Realistic
- This is how people actually use MCP clients: long sessions, switching between tasks, multiple servers connected at once
- Memory/context persistence features are marketed as a benefit ("pick up where you left off")
- Nobody thinks to clear their context window between tasks the way they might close browser tabs
- The MCP servers did nothing wrong here. The database server returned what was asked for, the memory server stored what it was told to. The leakage happens because nothing separates the contexts
What Makes It Worse
- The agent has no concept of data sensitivity. It treats a customer's SSN the same as a README file. Both are just tokens in the context.
- There's no "forget" command. Once data is in the context window, there's no standard way to selectively remove it without ending the session.
- Memory features compound the problem. Without memory, the data at least dies when the session ends. With memory, it persists indefinitely and resurfaces in future sessions where it has no business being.
In this scenario, one developer's normal workflow resulted in customer PII crossing into a Slack channel, persisting into the next day's session, and surfacing in screen shares, unrelated conversations, or future tool calls. Nobody attacked anything. Nobody misconfigured anything. The developer just used their tools the way they're designed to be used, and sensitive data quietly spread to places it was never supposed to go.
MCP-Specific Considerations
Why MCP Amplifies Context Leakage
- Rich data in tool responses: MCP tool calls return structured data. Database rows, document contents, API responses, not just text summaries. The fidelity of leaked data is high.
- Long-running server processes: MCP servers persist between client connections, accumulating state over time. Traditional stateless API patterns don't apply.
- Agents as central aggregators: When multiple MCP servers feed into a single agent, the agent's context window becomes a mixing pot for data from all sources. The agent holds everything. Database records from one server, documents from another, API responses from a third, with no isolation between them.
- Resource subscriptions: MCP's resource feature allows servers to push updates to clients. If resource subscriptions aren't session-scoped, one user's subscriptions might trigger updates in another user's session.
- Prompt templates with embedded data: MCP prompt templates can include data from previous interactions. If templates aren't user-scoped, they can carry data across sessions.
Vulnerability Indicators
No Session Isolation Standard
No standard mechanism for partitioning server state between sessions
Caching Bypasses Access Controls
Upstream service ACLs are enforced on first access but not on cached results
Memory Features Create Cross-User Bleed
Persistent memory without user scoping merges everyone's data
Context Windows Have No Partitions
All tool outputs share the same context with no isolation
Severity Rating
| Factor | Score | Rationale |
|---|---|---|
| Exploitability | 6/10 | Often requires shared infrastructure; not trivially exploitable from outside |
| Impact | 8/10 | Can expose PII, financial data, and privileged information across trust boundaries |
| Detection Difficulty | 9/10 | Leakage occurs within normal data flows; no anomalous network activity |
| Prevalence | 7/10 | Any shared MCP server with state is potentially affected; growing with multi-agent adoption |
| Remediation Complexity | 7/10 | Requires protocol-level session isolation and server-side architectural changes |
Mitigations
For Organizations
Enforce Session-Scoped MCP Server State
Every MCP server in your infrastructure should partition its internal state by session. This includes:
- Caches (keyed by session ID, not just query)
- Memory stores (scoped to user or session)
- Temp files (created in session-specific directories)
- Database connections (using session-specific roles where possible)
- Session termination (purge state when a session completes)
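For the temp-file item above, one workable pattern is a per-session workspace directory that is deleted when the session closes. A sketch (the `SessionWorkspace` class is our own illustration, not an MCP API):

```python
import shutil
import tempfile
from pathlib import Path

class SessionWorkspace:
    """Per-session temp directory, wiped when the session ends."""

    def __init__(self, session_id: str):
        # Each session writes under its own directory; nothing is shared.
        self.root = Path(tempfile.mkdtemp(prefix=f"mcp-{session_id}-"))

    def write(self, name: str, data: bytes) -> Path:
        path = self.root / name
        path.write_bytes(data)
        return path

    def close(self) -> None:
        # Purge everything the session wrote, including intermediate results.
        shutil.rmtree(self.root, ignore_errors=True)

# Usage: intermediate results live and die with the session.
ws = SessionWorkspace("sess-123")
ws.write("result.json", b'{"rows": 3}')
ws.close()
```

Nothing lands in a shared `/tmp` path, and nothing survives the session that created it.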
Audit Server State Management
Before deploying any MCP server to a shared environment, review how it handles:
- Module-level or global variables
- Caching layers (in-memory, Redis, filesystem)
- Persistent storage (databases, files)
- Logging (what gets written, who can read it)
Implement Context Boundaries
Deploy infrastructure that enforces context isolation:
- Separate MCP server instances per user or team for sensitive workloads
- Use containerized server deployments that spin up fresh per session
- Route MCP traffic through gateways that enforce tenant isolation
- Implement session cleanup that purges server state when a session ends
- Prompt or force session termination when an agent conversation ends
Classify Data Flowing Through MCP
Apply your existing data classification policies to MCP tool outputs:
- Which tools return data classified as Confidential or above?
- Which tools access PII-containing systems?
- Are those tools deployed on shared or isolated infrastructure?
- Do tool outputs get logged, and if so, with what retention and access controls?
For MCP Server Developers
Design for Multi-Tenancy From Day One
Even if your server currently runs locally for one user, build with isolation in mind:
```python
# Bad: Global cache
cache = {}

@mcp.tool()
def search(query: str) -> list:
    if query in cache:
        return cache[query]
    result = upstream.search(query)
    cache[query] = result
    return result
```

```python
# Good: Session-scoped cache
from collections import defaultdict

session_caches = defaultdict(dict)

@mcp.tool()
def search(query: str, session_id: str) -> list:
    cache = session_caches[session_id]
    if query in cache:
        return cache[query]
    result = upstream.search(query)
    cache[query] = result
    return result
```

Implement Session Lifecycle Hooks
- On session start: Initialize isolated state
- On session end: Purge all session-specific data (cache, temp files, memory)
- On error: Ensure cleanup runs even if the session terminates abnormally
Minimize State Retention
- Prefer stateless tool implementations where possible
- If caching is needed, use short TTLs and session-scoped keys
- Don't build "memory" features unless you've solved multi-user isolation
- Log tool call metadata (what was called, when) but not full responses
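A session-scoped cache with a short TTL, as suggested above, can be sketched as follows (`fetch` stands in for the upstream call; the names are illustrative):

```python
import time

TTL_SECONDS = 60  # short TTL: entries expire quickly instead of living forever

# Keyed by (session_id, query): sessions never see each other's entries.
_cache: dict[tuple[str, str], tuple[float, object]] = {}

def cached_search(session_id: str, query: str, fetch):
    key = (session_id, query)
    entry = _cache.get(key)
    if entry is not None:
        stored_at, value = entry
        if time.monotonic() - stored_at < TTL_SECONDS:
            return value  # fresh enough, and scoped to this session
    value = fetch(query)  # otherwise make a fresh upstream call
    _cache[key] = (time.monotonic(), value)
    return value
```

Because the session ID is part of the key, an identical query from a different session is a cache miss by construction.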
Respect Upstream Access Controls
If the upstream service enforces access controls, your cache must too:
- Cache entries should include the user's identity and permissions
- Cache hits should verify the requesting user has the same access level as the user who populated the cache
- When in doubt, skip the cache and make a fresh upstream call
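One way to sketch a permission-aware cache: each entry records the permissions it was fetched under, and a cache hit is served only when the requester holds all of them (the names and the flat permission model are illustrative assumptions):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CacheEntry:
    value: object
    owner: str              # identity of the user who populated the entry
    permissions: frozenset  # access level the data was fetched under

_cache: dict[str, CacheEntry] = {}

def get_or_fetch(query: str, user: str, perms: frozenset, fetch):
    entry = _cache.get(query)
    # Serve from cache only if the requester holds every permission the
    # cached data was fetched under; otherwise, when in doubt, go upstream.
    if entry is not None and entry.permissions <= perms:
        return entry.value
    value = fetch(query)  # fresh call under the caller's own identity
    _cache[query] = CacheEntry(value, user, perms)
    return value
```

A low-privilege user asking the same question as a high-privilege user triggers a fresh upstream call, so the upstream service's ACLs get a chance to say no.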
For MCP Client Developers
- Pass session identifiers to servers: Give servers the information they need to scope state
- Implement context window limits: Allow users to control how much tool output persists in context
- Support context clearing: Let users flush the context window of sensitive data mid-session
- Warn on sensitive data in context: Detect patterns (SSNs, credit card numbers, etc.) in tool outputs and flag them
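The last item can be approximated with simple pattern matching on tool outputs. A rough sketch: these regexes are illustrative and will produce false positives, and a real detector would add checksums (e.g. Luhn for card numbers) and surrounding context:

```python
import re

# Rough shapes only -- illustrative, not production-grade detection.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def flag_sensitive(tool_output: str) -> list[str]:
    """Return the categories of sensitive data detected in a tool output."""
    return [name for name, pat in PATTERNS.items() if pat.search(tool_output)]
```

A client could run this on every tool response before it enters the context window and warn the user, or offer to redact the match.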
Detection Methods
Static Analysis
- Review MCP server code for module-level mutable state (dictionaries, lists, sets)
- Identify caching implementations and check whether they're session-scoped
- Scan for temp file usage without session-specific paths
- Audit memory or persistence features for user isolation
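The first check, module-level mutable state, can be automated with a small AST pass. A rough heuristic sketch: it only flags top-level assignments of mutable literals, so it will miss constructed state like `defaultdict(...)` calls:

```python
import ast

MUTABLE_LITERALS = (ast.Dict, ast.List, ast.Set,
                    ast.DictComp, ast.ListComp, ast.SetComp)

def find_module_level_mutable_state(source: str) -> list[str]:
    """Flag top-level assignments of mutable literals -- a common sign of
    state shared across every session the server handles."""
    findings = []
    for node in ast.parse(source).body:  # top-level statements only
        if isinstance(node, ast.Assign) and isinstance(node.value, MUTABLE_LITERALS):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    findings.append(f"line {node.lineno}: {target.id}")
    return findings

# Usage against the cache pattern shown earlier in this report:
print(find_module_level_mutable_state("query_cache = {}\n"))
# → ['line 1: query_cache']
```

Running this over a server codebase in CI gives a cheap first-pass signal before any manual review.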
Runtime Monitoring
- Compare tool outputs across sessions for data that shouldn't be shared
- Monitor for access patterns where users receive data they didn't request
- Track cache hit rates, since unusually high rates in multi-user environments suggest cross-session hits
- Log session lifecycle events and correlate with state management operations
Behavioral Analysis
- Detect when tool outputs contain data that doesn't match the requesting user's access level
- Flag sessions where the agent references information that wasn't retrieved in the current session
- Monitor for data classification violations (e.g., Confidential data appearing in non-Confidential contexts)
- Track context window sizes, since abnormally large contexts may indicate accumulated cross-session data
Related Topics
- MCP Observability (monitoring tool call data flows)
- MCP Sprawl (more servers = more shared state = more leakage vectors)
- Shadow AI (unmanaged MCP deployments with no isolation controls)
- MCP Security Guardrails (policy enforcement for context boundaries)
- Tool Poisoning (attackers can deliberately trigger context leakage)
- Credential & Secrets Exposure (plaintext credentials in MCP configs)
References
- OWASP MCP Top 10 - Security Framework
- MCP Specification - Protocol Lifecycle - Protocol Specification
- Elastic Security Labs - MCP Attack Vectors - Research
- Simon Willison - MCP Prompt Injection - Analysis
- CyberArk - Full-Schema Poisoning in MCP - Research
- OWASP Top 10 for LLM & Generative AI Security - Security Framework
Report generated as part of the MCP Security Research Project