MCP Observability & Audit Logging
Framework for implementing comprehensive visibility into MCP operations, including what to log, how to detect anomalies, and building audit trails for compliance and incident response.
Summary
Observability is the foundation of MCP security. Without visibility into what AI agents are doing and which MCPs they are connected to, organizations cannot detect attacks, investigate incidents, or understand their risk exposure. Research identifies "insufficient auditability" as a core MCP threat, noting that inadequate logging restricts the detection and investigation of security events.
This report provides a framework for implementing comprehensive MCP observability: what to log, how to structure logs for analysis, what anomalies indicate attacks, and how to build audit trails that satisfy security and compliance requirements. MCP's native logging is designed for lightweight debugging, not enterprise audit trails. Production deployments require additional tooling.
Why MCP Observability Matters
MCP creates a new category of system activity that traditional monitoring does not capture. Application logs show HTTP requests. Database logs show queries. But neither captures the semantic layer of AI agent behavior: which tools were selected, what parameters were passed, what decisions led to those actions, and what data flowed through the system.
What MCP Activity Looks Like
- Tool invocations: AI agents calling tools to read files, query databases, send emails, execute code, modify infrastructure
- Resource access: What data AI agents are reading and processing through MCP resources
- Prompt flows: Instructions and context flowing between clients, servers, and underlying AI models
- Decision chains: Why an AI agent selected a particular tool with particular parameters
- Session lifecycles: Connections established, capabilities negotiated, sessions terminated
Without Observability
Attacks go undetected: Tool poisoning modifies tool descriptions to inject malicious instructions. Without logging tool metadata and comparing against baselines, these changes are invisible. Data exfiltration through legitimate-looking queries appears as normal assistant usage.
Operational blind spots emerge: Organizations lose track of what AI agents are actually doing. Which MCP servers are connected? What tools are being invoked and how often? What data is flowing through the system? Without observability, routine operations become a black box: capacity planning is guesswork, troubleshooting lacks foundation, optimization efforts have no baseline, and questions like "what did our AI agents do last Tuesday?" have no answer.
Incidents cannot be investigated: When something goes wrong, teams need to answer "what happened?" Without logs showing the sequence of tool calls, parameters, and responses, reconstruction is impossible. A recent study found MCP's built-in logs "don't link concurrent chains of events that span multiple servers," creating "auditing and observability blind spots."
Risk cannot be quantified: Without visibility into actual usage patterns, organizations cannot assess their MCP risk exposure. How many tools have access to sensitive data? How often are high-risk operations performed? What would be the blast radius of a compromised agent?
What to Log
Effective MCP logging captures events across the entire lifecycle, from connection establishment to tool execution to session termination. The table below summarizes the five core event categories and their key data points.
| Event Category | What to Capture | Why It Matters |
|---|---|---|
| Connection Events | MCP handshakes, capability negotiations, authentication attempts (success/failure), session start/end times, connection errors, client and server identity | Establishes context for all session activity; detects unauthorized access attempts |
| Tool Operations | Tool discovery requests (tools/list), tool invocations with parameters, responses (success/failure, duration, size), execution context | Core of MCP activity; enables detection of misuse, poisoning, and performance issues |
| Resource Access | Resource listings, reads (URI, access type, data volume), sensitivity classification, access to PII/credentials/financial data | Reveals data flow patterns; critical for compliance and detecting exfiltration |
| Prompt Activity | Template usage, dynamic prompt generation, injection indicators, context window metadata (size, sources) | Most sensitive category; key for detecting prompt injection and manipulation |
| Administrative Events | Configuration changes, permission modifications, server registration/deregistration, policy updates | Tracks environmental changes; detects unauthorized modifications |
Implementation guidance for effective logging:
- Redact sensitive data in parameters and responses while preserving enough context for investigation
- Include correlation IDs (session ID, trace ID, request ID) to link related events across the MCP flow
- Timestamp everything with timezone information for accurate sequencing
- Balance verbosity with privacy: log enough to investigate incidents without creating unnecessary PII exposure
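The guidance above can be sketched as a small emitter. This is a minimal illustration, not a production logger: the sensitive-key list, the field names, and the redaction rules are all assumptions you would replace with your own schema and classification policy.

```python
import json
import re
import uuid
from datetime import datetime, timezone

# Hypothetical sensitive-key list; adapt to your own data classification.
SENSITIVE_KEYS = {"password", "api_key", "ssn", "token"}
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact(value):
    """Recursively mask sensitive keys and email-like strings,
    preserving the rest of the structure for investigation."""
    if isinstance(value, dict):
        return {k: "[REDACTED]" if k.lower() in SENSITIVE_KEYS else redact(v)
                for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v) for v in value]
    if isinstance(value, str):
        return EMAIL_RE.sub("[EMAIL]", value)
    return value

def log_tool_call(session_id, tool, params, trace_id=None):
    """Emit one structured, redacted entry with correlation IDs
    and a timezone-aware timestamp."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "trace_id": trace_id or uuid.uuid4().hex,
        "session_id": session_id,
        "event_type": "tool_call",
        "tool": tool,
        "params": redact(params),
    }
    print(json.dumps(entry))
    return entry
```

Note the trade-off in `redact`: parameter structure and non-sensitive values survive, so an investigator can still see what the tool was asked to do without the log store accumulating raw credentials or PII.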
Log Schema Design
Consistent log schemas enable reliable analysis, correlation, and alerting across diverse MCP deployments.
Minimum Viable Log Entry
Every MCP log entry should include these fields:
```json
{
  "timestamp": "2026-01-15T14:32:18.234Z",
  "trace_id": "abc123def456",
  "session_id": "session-789",
  "event_type": "tool_call",
  "mcp_server": {
    "name": "database-server",
    "version": "1.2.0",
    "instance": "prod-east-1"
  },
  "mcp_client": {
    "name": "claude-desktop",
    "version": "0.8.1"
  },
  "user": {
    "id": "user-456",
    "email": "..."
  }
}
```
Extended Schema for Security Analysis
For security-focused analysis, extend the minimum schema with: risk indicators (sensitivity classification, risk scores, policy evaluation results), behavioral context (first-time tool usage, off-hours activity, unusual volume), and correlation data (parent request IDs, related sessions, linked incident IDs).
Schema Considerations
Traceability: Include trace IDs and correlation IDs that link events across the MCP flow. A single user request might generate multiple tool calls across multiple servers. Without traceability, these appear as disconnected events.
Retrievability: Store logs in queryable systems (not just local files). Logs must be accessible outside the session that generated them.
Verbosity vs. Privacy: Balance detail against data minimization. Log enough to investigate incidents without creating a surveillance system or storing unnecessary PII.
Standardization: Use consistent field names and formats across all MCP servers. This enables unified dashboards and detection rules.
Anomaly Detection
Behavioral Baselines
Effective anomaly detection requires understanding normal behavior before flagging deviations. Establish baselines across three dimensions:
- Per-User: Typical tools called, normal query volume, usual resources accessed, working hours
- Per-Tool: Normal parameter distribution, typical response size, expected execution duration, call frequency
- Per-Server: Normal request volume, typical error rate, expected connection patterns, resource consumption
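The per-user dimension can be sketched as a simple aggregation over historical log entries. The input field names (`user`, `tool`, `hour`) are assumptions standing in for whatever your log schema provides.

```python
from collections import defaultdict
from statistics import mean, pstdev

def build_user_baselines(events):
    """Aggregate per-user tool usage and activity hours from a list of
    log entries shaped like {"user": ..., "tool": ..., "hour": ...}."""
    per_user = defaultdict(lambda: {"tools": set(), "hours": []})
    for e in events:
        b = per_user[e["user"]]
        b["tools"].add(e["tool"])
        b["hours"].append(e["hour"])
    # Summarize each user's typical tools and working-hours distribution.
    return {
        u: {"tools": b["tools"],
            "mean_hour": mean(b["hours"]),
            "hour_spread": pstdev(b["hours"])}
        for u, b in per_user.items()
    }
```

The same pattern extends to the per-tool and per-server dimensions: pick the grouping key, accumulate the relevant metrics, and summarize into a baseline that detection rules can compare against.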
Detection Rules
| Anomaly Type | Detection Logic | Severity | Response |
|---|---|---|---|
| Unusual Tool Usage | User calls tool they have never used before, especially sensitive tools | Medium | Alert security team, flag for review |
| Volume Spike | Data access volume exceeds 10x baseline within short period | High | Immediate alert, consider blocking |
| Off-Hours Activity | Tool calls outside established working hours for user | Medium | Alert, require additional verification |
| Sensitive Resource Access | Access to classified or restricted resources | High | Real-time alert, audit trail flagging |
| Rapid Tool Enumeration | Repeated tools/list calls suggesting reconnaissance | Medium | Alert, rate limit if excessive |
| Failed Authentication Spike | Multiple auth failures from same source | High | Block source, alert security team |
| Parameter Anomaly | Tool parameters deviate significantly from historical patterns | Medium | Flag for review, content inspection |
| Session Anomaly | Unusual session duration, activity pattern, or termination | Low-Medium | Log for analysis, correlate with other signals |
| Data Exfiltration Pattern | Large data transfers, unusual export operations, sensitive data in responses | Critical | Immediate block, incident response |
| Injection Attempt | Shell metacharacters, SQL syntax, prompt injection patterns in parameters | High | Block request, alert, forensic capture |
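As a concrete illustration, the Volume Spike rule from the table might look like the sketch below, where the baseline store and return shape are assumptions rather than a defined interface.

```python
def check_volume_spike(user, bytes_read, baselines, factor=10):
    """Flag a High-severity alert when a user's data access volume
    exceeds `factor` times their recorded baseline."""
    baseline = baselines.get(user)
    if baseline is None:
        # No history yet: worth a look, but not an automatic block.
        return {"severity": "medium", "reason": "no baseline for user"}
    if bytes_read > factor * baseline:
        return {"severity": "high",
                "reason": f"volume {bytes_read} exceeds {factor}x baseline {baseline}"}
    return None  # within normal range
```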
Machine Learning Approaches
ML-based anomaly detection can identify subtle deviations that rule-based systems miss.
User Behavior Analytics (UBA): Model normal behavior patterns for each user and alert on deviations. Effective for detecting compromised accounts or insider threats.
Sequence Analysis: Analyze sequences of tool calls to identify unusual patterns. Normal workflows follow predictable sequences; attacks often show irregular progressions.
Content Anomaly Detection: As implemented by Datadog Cloud SIEM for MCP, detect when tool query parameters deviate from historical patterns. Useful when you know queries should follow consistent structures.
Graph Analysis: Model relationships between users, tools, and resources. Detect anomalies in the graph structure, such as new connections between previously unrelated entities.
Integrity Protection
Audit logs must be protected from tampering to maintain evidentiary value.
Tamper-Evident Logging: Use append-only log stores or hash chains that make modifications detectable.
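A hash chain is simple to implement: each entry commits to the hash of the previous entry, so rewriting any record invalidates every hash after it. The sketch below illustrates the mechanism in memory; a real deployment would persist entries to append-only storage.

```python
import hashlib
import json

GENESIS = "0" * 64

class HashChainLog:
    """Append-only log where each entry includes the previous entry's
    hash, making any in-place modification detectable on verification."""

    def __init__(self):
        self.entries = []
        self._last = GENESIS

    def append(self, record):
        payload = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((self._last + payload).encode()).hexdigest()
        self.entries.append({"record": record, "prev": self._last, "hash": digest})
        self._last = digest

    def verify(self):
        """Recompute the chain; returns False if any entry was altered."""
        prev = GENESIS
        for e in self.entries:
            payload = json.dumps(e["record"], sort_keys=True)
            expected = hashlib.sha256((prev + payload).encode()).hexdigest()
            if e["prev"] != prev or e["hash"] != expected:
                return False
            prev = e["hash"]
        return True
```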
Log Signing: Cryptographically sign log entries or batches to prove authenticity. Include timestamps from trusted time sources.
Immutable Storage: Write logs to storage that prevents modification:
- AWS S3 with Object Lock
- Azure Blob with immutable policies
- Dedicated WORM (Write Once Read Many) storage
Access Controls: Separate log storage from operational systems. Restrict who can read logs (need to know) and who can administer log infrastructure (separate from operations).
Chain of Custody: Document log handling procedures for legal and compliance purposes.
Implementation Architecture
Centralized Logging Pattern
The recommended approach for enterprise MCP deployments, with or without an MCP gateway.
Architecture flow:
- Collection points: MCP servers emit structured logs for tool calls, resource access, and session events. If an MCP gateway is deployed, it becomes the primary collection point, capturing all client-server traffic with authentication and policy context in a consistent format regardless of server implementation.
- Normalization: Collectors normalize MCP events to a common schema (tool name, parameters, user identity, session ID, timestamps)
- Enrichment: Aggregation layer adds sensitivity classifications, risk scores, and gateway-level context (policy decisions, auth results)
- Storage: Hot storage for real-time incident investigation; cold storage for compliance retention (typically 1-7 years)
- Analysis: SIEM correlation links MCP events with network, endpoint, and identity logs to detect multi-stage attacks
What makes MCP logging different from traditional application logs:
- Semantic context: Capture why a tool was called (the prompt context), not just what was called
- Tool metadata versioning: Log tool descriptions and schemas to detect poisoning (changes between invocations)
- Session continuity: Link all events within an MCP session to reconstruct full agent workflows
- Parameter sensitivity: Automatically classify and redact sensitive data in tool parameters while preserving investigative value
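Tool metadata versioning can be sketched as fingerprinting: hash each tool's name, description, and schema at registration, then compare on subsequent `tools/list` responses. The tool dict shape below is an assumption loosely modeled on MCP tool definitions.

```python
import hashlib
import json

def tool_fingerprint(tool):
    """Stable hash over a tool's name, description, and input schema."""
    canonical = json.dumps(
        {"name": tool["name"],
         "description": tool["description"],
         "schema": tool.get("inputSchema", {})},
        sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

def detect_poisoning(current_tools, baseline_fingerprints):
    """Return names of tools whose metadata changed since the baseline
    was recorded -- a signal of possible tool poisoning (a "rug pull")."""
    changed = []
    for tool in current_tools:
        fp = tool_fingerprint(tool)
        known = baseline_fingerprints.get(tool["name"])
        if known is not None and known != fp:
            changed.append(tool["name"])
    return changed
```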
Distributed Tracing for MCP
Distributed tracing follows requests across MCP clients, servers, and downstream services, providing end-to-end visibility into complex workflows.
Key implementation requirements:
- Generate unique trace IDs at request origin and propagate through all MCP hops
- Create spans for each tool invocation, resource access, and downstream call
- Include MCP-specific attributes (tool name, server instance, user context) in span metadata
- Export traces to your observability platform for correlation with other telemetry
Benefits:
- Automatic correlation of related events across distributed components
- Clear visualization of request flow and latency breakdown
- Integration with existing observability infrastructure
- Vendor-neutral standards (OpenTelemetry) enable portability
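Production systems should use OpenTelemetry for this, but the core propagation mechanic can be shown with the standard library alone: generate a trace ID at request origin and carry it implicitly through the call chain. The `span` record shape is an illustrative assumption.

```python
import contextvars
import uuid

# Current trace ID, propagated implicitly through the call chain
# (contextvars survive async task switches, unlike plain globals).
trace_id_var = contextvars.ContextVar("trace_id", default=None)

def start_trace():
    """Generate a unique trace ID at request origin."""
    tid = uuid.uuid4().hex
    trace_id_var.set(tid)
    return tid

def span(name, attributes=None):
    """Record a span tied to the current trace. A hand-rolled sketch;
    real deployments would emit OpenTelemetry spans instead."""
    return {"trace_id": trace_id_var.get(),
            "span": name,
            "attributes": attributes or {}}
```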
SIEM Integration
Security Information and Event Management systems provide advanced analysis capabilities.
Integration Points:
- Forward MCP logs to SIEM via syslog, API, or agent
- Map MCP log fields to SIEM's common schema
- Create MCP-specific dashboards and alerts
- Correlate MCP events with other security data
Detection Rules: Implement detection rules in your SIEM for:
- Known attack patterns (injection attempts, reconnaissance)
- Behavioral anomalies (unusual volumes, new tools, off-hours)
- Policy violations (unauthorized access, missing approvals)
- Correlation rules (combine MCP events with network, endpoint data)
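The field-mapping step might look like the sketch below. The dotted target names are hypothetical, loosely modeled on ECS-style SIEM schemas; substitute your SIEM's actual common schema.

```python
def to_siem_event(entry):
    """Map an MCP log entry (minimum schema from above) to a flat,
    ECS-style dictionary ready for SIEM ingestion."""
    return {
        "event.category": "mcp",
        "event.action": entry["event_type"],
        "user.id": entry.get("user", {}).get("id"),
        "source.service": entry.get("mcp_client", {}).get("name"),
        "destination.service": entry.get("mcp_server", {}).get("name"),
        "trace.id": entry.get("trace_id"),
        "@timestamp": entry["timestamp"],
    }
```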
Dashboards & Visualization
Operational Dashboard
For day-to-day monitoring of MCP health and performance.
Key Metrics:
- Active MCP connections (current count, trend)
- Tool call volume over time (by server, by tool)
- Error rates (by server, by tool, by error type)
- Latency percentiles (p50, p95, p99 by tool)
- Resource utilization (CPU, memory, connections)
Visualizations:
- Time series for volume and latency trends
- Heat maps for activity by time of day
- Top N lists (most used tools, most active users, slowest tools)
- Error breakdown by category
Proof of Concept
Scenario: The Unanswerable Question
Context: Mid-size SaaS company using MCP-connected AI assistants for customer support, engineering, and sales teams. No centralized MCP logging in place.
The Incident:
Monday 9:00 AM - CFO asks IT: "What customer data did our AI assistants access last quarter?"
The Investigation Attempt:
- IT checks application logs: Shows HTTP requests but no MCP tool calls
- IT checks database logs: Shows queries but can't attribute them to AI vs. human users
- IT asks engineering: "Which MCP servers are even running in production?"
- Engineering isn't sure: "We have the official ones, but some teams may have added their own"
- IT checks with customer support: "We use Claude with some database tool, not sure which one"
- No one can answer: Which tools accessed customer PII? How often? For which customers?
The Business Impact:
- CFO needed the answer for board audit committee meeting
- Legal can't confirm GDPR compliance for EU customer data access
- Sales lost a major enterprise deal requiring AI data handling documentation
- Security team can't assess blast radius for hypothetical breach scenarios
- Compliance officer flags finding for upcoming SOC 2 Type II audit
Root Causes:
- No inventory of MCP servers or tools in production
- No logging of tool invocations or parameters
- No way to attribute AI actions to specific users or sessions
- No classification of which tools access sensitive data
With Observability:
- Query centralized logs: "Show all tool calls accessing customer_data resource, Q4 2025"
- Results in seconds: 47,832 queries across 3 MCP servers by 24 users
- Drill down: 12% accessed PII fields, all by authorized support agents
- Export audit trail: Complete evidence package for compliance review
- Answer the CFO: "Here's the full report with user attribution and data classification"

Related Topics
- MCP Security Guardrails (controls that generate the events we log)
- Tool Poisoning (detection requires logging tool metadata changes)
- Shadow AI and MCP Sprawl (discovery requires network and configuration monitoring)
- Incident Response (logs are primary evidence for investigation)
References
| Source | Type | URL |
|---|---|---|
| Datadog: MCP Client Monitoring | Product Documentation | Link |
| Datadog: MCP Detection Rules for Security | Security Guide | Link |
| MCP Manager: MCP Observability Guide | Product Guide | Link |
| MCP Manager: MCP Server Logging | Implementation Guide | Link |
| ByteBridge: Audit Logging and Retention in MCP | Implementation Guide | Link |
| Ithena: MCP Audit Trails | Enterprise Guide | Link |
| Glama: Observability and Governance for MCP | Technical Guide | Link |
| Stainless: Real-Time MCP Monitoring | Implementation Guide | Link |
| Speakeasy: Monitor Your MCP Server | Best Practices | Link |
| arXiv: Securing MCP Risks, Controls, Governance | Academic Paper | Link |
Report generated as part of the MCP Security Resources Project