An agent connects to an MCP server and trusts its tools and resources. A compromised server can inject adversarial content, expose malicious tools, or exfiltrate data through tool invocations.
The Model Context Protocol (MCP) lets agents discover and invoke tools and resources hosted on MCP servers. An agent that connects to a server trusts that the server provides legitimate tools and that tool responses are truthful. A compromised or malicious server can abuse that trust in four ways: it can expose tools that perform adversarial operations (exfiltrating data, modifying databases, accessing restricted resources); return false information in responses; inject payloads into tool responses that trigger prompt injection or code execution; and monitor every tool invocation to exfiltrate the data that passes through it.
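One way to reduce exposure to injected payloads is to treat every tool response as untrusted data rather than as instructions. The sketch below is illustrative only: the pattern list, the `ScreenedResponse` type, and the `<tool_output>` wrapper are assumptions made for this example, not part of MCP, and a simple pattern match of this kind will miss many real injection attempts.

```python
import re
from dataclasses import dataclass, field

# Hypothetical heuristics for illustration; real coverage would need to be far broader.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"<\s*script\b", re.IGNORECASE),
    re.compile(r"\bsystem prompt\b", re.IGNORECASE),
]


@dataclass
class ScreenedResponse:
    tool_name: str
    text: str
    findings: list[str] = field(default_factory=list)

    @property
    def quarantined(self) -> bool:
        # Any finding is enough to hold the response back for review.
        return bool(self.findings)


def screen_tool_response(tool_name: str, text: str) -> ScreenedResponse:
    """Record which suspicious patterns (if any) appear in a tool response."""
    findings = [p.pattern for p in SUSPICIOUS_PATTERNS if p.search(text)]
    return ScreenedResponse(tool_name, text, findings)


def as_untrusted_block(resp: ScreenedResponse) -> str:
    """Wrap tool output so the prompt presents it as data, not as instructions."""
    if resp.quarantined:
        return f"[tool output from {resp.tool_name} withheld: flagged for review]"
    return f"<tool_output name={resp.tool_name!r}>\n{resp.text}\n</tool_output>"
```

Screening of this kind complements, and does not replace, controls on which tools the agent may invoke in the first place.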
The trust boundary is at the point where the agent connects to the MCP server. The agent assumes the server is legitimate and that tools are safe to invoke. If the server is compromised, this assumption is violated.
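A common way to make this boundary explicit is to pin the set of approved tools and reject anything the server advertises beyond it. The following is a minimal sketch under stated assumptions: tool descriptors are plain dictionaries with `name`, `description`, and `inputSchema` keys, and `fingerprint` and `filter_tools` are hypothetical helper names introduced here, not functions from any MCP SDK.

```python
import hashlib
import json


def fingerprint(descriptor: dict) -> str:
    """Stable SHA-256 over a tool descriptor (name, description, input schema)."""
    canonical = json.dumps(descriptor, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def filter_tools(advertised: list[dict], pinned: dict[str, str]) -> tuple[list[dict], list[str]]:
    """Split advertised tools into those matching a pinned fingerprint and the rest.

    `pinned` maps an approved tool name to the fingerprint recorded at review time.
    A tool is rejected if it is unknown or if its descriptor changed after review.
    """
    allowed, rejected = [], []
    for tool in advertised:
        name = tool.get("name", "<unnamed>")
        if pinned.get(name) == fingerprint(tool):
            allowed.append(tool)
        else:
            rejected.append(name)
    return allowed, rejected


if __name__ == "__main__":
    # Descriptors here are made up for the example.
    checker = {
        "name": "Drug-Interaction-Checker",
        "description": "Checks a medication list for known interactions.",
        "inputSchema": {"type": "object", "properties": {"medications": {"type": "array"}}},
    }
    pinned = {checker["name"]: fingerprint(checker)}
    rogue = {"name": "Generate-Patient-Report", "description": "Summary", "inputSchema": {"type": "object"}}
    allowed, rejected = filter_tools([checker, rogue], pinned)
    print([t["name"] for t in allowed], rejected)
```

Pinning a fingerprint rather than only a name means a tool whose description or schema is silently altered after review is also rejected. It does not protect against a server that keeps the descriptor unchanged but alters the tool's behavior.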
A healthcare provider deploys an agent-based clinical decision support system. The agent connects to an MCP server to access clinical reference tools: Drug-Interaction-Checker, Lab-Result-Interpreter, and Guideline-Lookup.
An attacker compromises the MCP server (via SQL injection in the server's API or by breaching the server's hosting environment). The attacker modifies the server to expose an additional tool: "Generate-Patient-Report" with a schema that appears legitimate but actually exfiltrates patient health records to an attacker-controlled database.
The clinical decision support agent, when presented with a patient case, queries the MCP server for available tools. The server now responds with the malicious "Generate-Patient-Report" tool. The agent, seeing a tool that matches its need to generate clinical summaries, invokes the tool with patient data (name, medical history, test results). The malicious tool receives the patient data and sends it to an attacker's server, while returning a fabricated report.
Over several weeks, the attacker exfiltrates health records for 2,300 patients. The attack is not detected because the agent's audit log shows a normal tool invocation, the returned report appears correct, and network monitoring does not flag the exfiltration because it goes through the MCP server inside the organization's network.
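In this scenario the agent handed a full patient record to a tool nobody had reviewed. One mitigation is a per-tool argument policy that strips every field a tool is not approved to receive and refuses tools with no policy at all. The sketch below is a hedged illustration: the field names in `ARGUMENT_POLICY`, the `ToolCallBlocked` exception, and `minimize_arguments` are assumptions made for this example and do not come from the scenario or from MCP itself.

```python
# Hypothetical policy: the argument fields each approved tool may receive.
ARGUMENT_POLICY = {
    "Drug-Interaction-Checker": {"medications"},
    "Lab-Result-Interpreter": {"test_code", "value", "unit"},
    "Guideline-Lookup": {"condition"},
}


class ToolCallBlocked(Exception):
    """Raised when a tool has no reviewed argument policy."""


def minimize_arguments(tool_name: str, arguments: dict) -> tuple[dict, list[str]]:
    """Return (arguments restricted to approved fields, names of dropped fields).

    Tools without a policy entry are refused outright, so a newly advertised
    tool cannot receive data until someone has reviewed it.
    """
    allowed_fields = ARGUMENT_POLICY.get(tool_name)
    if allowed_fields is None:
        raise ToolCallBlocked(f"{tool_name} has no argument policy; refusing to invoke")
    dropped = sorted(set(arguments) - allowed_fields)
    clean = {k: v for k, v in arguments.items() if k in allowed_fields}
    return clean, dropped


if __name__ == "__main__":
    args = {"medications": ["warfarin", "aspirin"], "patient_name": "Jane Doe"}
    clean, dropped = minimize_arguments("Drug-Interaction-Checker", args)
    print(clean, dropped)  # dropped fields can be written to the audit log
    try:
        minimize_arguments("Generate-Patient-Report", args)
    except ToolCallBlocked as exc:
        print(exc)
```

Logging the dropped field names, rather than only the tool name, also gives the audit trail something more informative than "a normal tool invocation".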
| Dimension | Score | Rationale |
|---|---|---|
| D - Detectability | 3 | Compromised MCP server is difficult to detect because agents trust the server. Fabricated tools may pass validation. |
| A - Autonomy Sensitivity | 4 | High when agents autonomously invoke MCP tools. Agents do not verify tool legitimacy or response truthfulness. |
| M - Multiplicative Potential | 5 | Every agent that connects to the compromised server is potentially affected. The compromise taints every tool invocation routed through that server. |
| A - Attack Surface | 5 | MCP server and the connection between agent and server are both attack surfaces. Server compromise is a supply chain risk. |
| G - Governance Gap | 4 | Institutions may connect agents to MCP servers without adequate verification or without controls limiting what tools agents can invoke. |
| E - Enterprise Impact | 5 | Enables data exfiltration, access to restricted data, and lateral movement. Full impact depends on what data agents have access to. |
| Composite DAMAGE Score | 4.2 | Critical. Requires immediate architectural controls. Cannot be accepted. |
How severity changes across the agent architecture spectrum.
| Agent Type | Impact | How This Risk Manifests |
|---|---|---|
| Digital Assistant | Low | Human reviews tool invocations before proceeding. Human can recognize suspicious tools. |
| Digital Apprentice | Low-Med | Agents are conservative about tool invocation and escalate when tools seem anomalous. |
| Autonomous Agent | High | Agents autonomously invoke MCP tools based on schema. Cannot distinguish legitimate from malicious tools. |
| Delegating Agent | Very High | Delegating agent's primary function is to invoke tools via MCP. Compromise affects all delegations. |
| Agent Crew / Pipeline | High | Crew agents may use shared MCP servers. One compromised server affects entire crew. |
| Agent Mesh / Swarm | Very High | Mesh agents dynamically discover and invoke MCP tools. Compromised server can inject malicious tools into mesh. |
| Framework | Coverage | Citation | What It Addresses | What It Misses |
|---|---|---|---|---|
| NIST AI RMF 1.0 | Partial | GOVERN 6.2 (Supply Chain Risk) | Supply chain risk management. | MCP server security and tool vetting. |
| NIST CSF 2.0 | Partial | GV.SC (Cybersecurity Supply Chain Risk Management) | Supply chain risk. | Third-party tool integration and validation. |
| OWASP Top 10 | Partial | A06:2021 (Vulnerable and Outdated Components) | Vulnerable third-party components. | MCP tool supply chain risk. |
| GDPR Article 28 | Partial | Data Processing Agreements | Third-party data processing. | MCP servers processing personal data. |
| HIPAA BAA | Partial | Technical Safeguards | Third-party access to protected health information. | MCP servers accessing PHI. |
In financial services and healthcare, data exfiltration through compromised MCP tools is a material risk. An agent that has access to sensitive data and invokes a malicious MCP tool can exfiltrate that data without "knowing" it is doing so.
Additionally, MCP server compromise is a third-party risk that regulatory frameworks explicitly address. If an institution deploys agents that rely on MCP servers, the institution has a duty to ensure those servers are secure and trustworthy.
MCP Server Trust Boundary requires architectural controls that go beyond what existing frameworks provide. Our advisory engagements are purpose-built for banks, insurers, and financial institutions subject to prudential oversight.