Small errors accumulate across a multi-step reasoning chain; each step appears locally valid, but the chain's conclusion is wrong.
When an agent reasons through a complex problem in multiple steps, each step can appear locally sound while errors accumulate across the chain. Suppose Steps 1, 2, and 3 each have a 5% error rate. If these errors are roughly independent, the probability that all three steps are correct is 0.95^3 ≈ 0.86, meaning there is a 14% chance that at least one step is wrong. With 10 steps, the probability that the entire chain is correct drops to 0.95^10 ≈ 0.60, a 40% chance of at least one error.
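The arithmetic above can be sketched as a one-line function. Note the independence assumption is an idealization: in practice, errors at adjacent steps are often correlated, so this is a rough model, not a measured failure rate.

```python
# Sketch: probability that an n-step reasoning chain is fully correct,
# assuming independent, identical per-step error rates (an idealization).

def chain_success_probability(per_step_error: float, n_steps: int) -> float:
    """Probability that every step in the chain is correct."""
    return (1.0 - per_step_error) ** n_steps

# With a 5% error rate per step:
print(round(chain_success_probability(0.05, 3), 2))   # 0.86
print(round(chain_success_probability(0.05, 10), 2))  # 0.6
```

The takeaway is the shape of the curve: chain reliability decays geometrically in chain length, so doubling the number of steps more than doubles the chance of at least one corrupted step.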
This is fundamentally agentic because agents are designed to execute multi-step reasoning chains to solve complex problems. A traditional system that invokes a single operation per query has limited opportunity for chain corruption. An agent that reasons through 5, 10, or 20 steps to answer a question has significant corruption risk.
In regulated contexts, reasoning chain corruption is particularly dangerous because regulators expect that decisions are made using sound reasoning. If the reasoning is invalid, the decision is invalid, even if the agent reports the reasoning confidently.
A bank's anti-money laundering (AML) agent is designed to assess whether a customer transaction should be reported as a suspicious activity report (SAR). The agent's reasoning chain includes: (1) identify customer jurisdiction, (2) determine applicable AML regulations for that jurisdiction, (3) assess whether transaction characteristics match risk indicators for that jurisdiction, (4) cross-reference transaction amount against historical customer baselines, (5) check for any flags or alerts on the customer, and (6) synthesize all evidence to recommend SAR filing or no-filing.
Each step of the reasoning chain introduces potential error. The agent might misidentify the customer's jurisdiction (confusion between similar names, dual citizenship). It might incorrectly interpret the applicable regulations (EU vs. UK post-Brexit rules are subtle and frequently conflated by AI systems). It might misapply risk indicators. It might use outdated baseline data for the customer. It might miss a flag due to a typo in the customer's name.
Each of these errors, in isolation, might have a 5-10% probability. But they accumulate: by the time the agent reaches Step 6, the probability that all five prior steps were correct is at most 0.95^5 ≈ 0.77, meaning at least a 23% chance that some prior step was wrong.
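Under the same independence assumption, the 5-10% per-step range gives upper and lower bounds on the chance that all five prior steps are correct. The rates here are illustrative, not empirical:

```python
# Sketch: bounds on the probability that all five prior steps of the
# AML chain are correct, with per-step error rates between 5% and 10%
# (independence assumed; rates are illustrative, not measured).

N_PRIOR_STEPS = 5

best_case = (1 - 0.05) ** N_PRIOR_STEPS   # every step at the 5% error rate
worst_case = (1 - 0.10) ** N_PRIOR_STEPS  # every step at the 10% error rate

print(f"best case:  {best_case:.2f}")   # best case:  0.77
print(f"worst case: {worst_case:.2f}")  # worst case: 0.59
```

Even in the best case, roughly one in four assessments reaches the synthesis step with at least one corrupted premise; in the worst case, it is closer to two in five.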
The agent synthesizes its final recommendation: "No SAR filing required." This recommendation is accompanied by a reasoning chain that appears coherent (each step flows logically from the previous one) but is based on corrupted intermediate reasoning. The bank does not file a SAR, and the transaction goes undetected.
Weeks later, when a regulator audits the bank's SAR filing practices, the regulator asks about this specific transaction. The audit trail shows that the AML agent assessed the transaction and recommended no filing. The regulator asks: "Why did you not flag the customer's jurisdiction as high-risk?" The agent's Step 1 (jurisdiction identification) was wrong. "Why did you not check EU regulations about beneficial ownership disclosure?" The agent's Step 2 was wrong. The accumulated errors in the reasoning chain resulted in an incorrect AML decision.
The regulator issues a finding of unsafe AML controls and begins an investigation to determine if there are other SAR filing failures due to corrupted agent reasoning.
| Dimension | Score | Rationale |
|---|---|---|
| D - Detectability | 4 | Chain corruption is invisible if each step is audited locally; full chain validation is rare. |
| A - Autonomy Sensitivity | 5 | Agent executes entire chain autonomously without human review at intermediate points. |
| M - Multiplicative Potential | 5 | Impact grows exponentially with chain length; longer chains have higher corruption probability. |
| A - Attack Surface | 4 | Any multi-step agent reasoning process is vulnerable; chain length determines corruption risk. |
| G - Governance Gap | 5 | No standard framework specifies how to audit or validate agent reasoning chains or intermediate steps. |
| E - Enterprise Impact | 4 | SAR filing failures, regulatory finding, AML control failure, potential civil money penalties. |
| Composite DAMAGE Score | 4.5 | Critical. Requires immediate architectural controls. Cannot be accepted. |
How severity changes across the agent architecture spectrum.
| Agent Type | Impact | How This Risk Manifests |
|---|---|---|
| Digital Assistant | Low | Human reviews entire reasoning chain and can catch errors at intermediate points. |
| Digital Apprentice | Medium | Apprentice governance requires intermediate step validation; reasoning chains are audited. |
| Autonomous Agent | Critical | Agent executes chain without intermediate human validation; corruption is undetected. |
| Delegating Agent | High | Agent invokes tools sequentially; errors at each tool invocation accumulate. |
| Agent Crew / Pipeline | Critical | Multiple agents in sequence; reasoning corruption propagates through pipeline. |
| Agent Mesh / Swarm | Critical | Agents synthesize information from peers; corrupted information from peer reasoning chains propagates. |
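As a rough illustration of the intermediate validation that distinguishes the apprentice row from the autonomous row above, a chain executor can checkpoint each step's output and halt on the first failure rather than letting a corrupted result propagate. This is a minimal sketch; the step and validator names are hypothetical and not drawn from any specific framework:

```python
# Sketch: a reasoning-chain executor with per-step validation checkpoints.
# Each step's output is checked before the next step runs, so a corrupted
# intermediate result halts the chain instead of propagating silently.

from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Step:
    name: str
    run: Callable[[Any], Any]        # the step's reasoning/transformation
    validate: Callable[[Any], bool]  # checkpoint: inspects the step's output


class ChainCorruptionError(RuntimeError):
    """Raised when an intermediate step produces an invalid result."""


def execute_chain(steps: list[Step], state: Any) -> Any:
    for step in steps:
        state = step.run(state)
        if not step.validate(state):
            # Fail fast at the first invalid intermediate result.
            raise ChainCorruptionError(f"validation failed at step: {step.name}")
    return state
```

In an AML setting, the validators might check that an identified jurisdiction appears in a known list, or that a retrieved customer baseline is not stale, before later steps consume those values. The design choice is fail-fast: a halted chain escalates to a human, whereas a fully autonomous chain synthesizes a confident recommendation on top of the corrupted step.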
| Framework | Coverage | Citation | What It Addresses | What It Misses |
|---|---|---|---|---|
| SR 11-7 / MRM | Addressed | Enterprise-wide controls (Section 3) | Expects reliable, auditable decision-making processes. | Does not address multi-step agent reasoning. |
| NIST AI RMF 1.0 | Partial | MEASURE.1 | Recommends measuring AI system performance. | Does not specify chain-level validation. |
| GLBA Safeguards Rule | Partial | 16 CFR Part 314 | Requires safeguards for customer information security. | Does not address agent reasoning validation. |
| FinCEN AML/CFT Guidance | Partial | Various AML/CFT guidance documents | Expects sound SAR filing procedures. | Does not anticipate agent-mediated SAR decisions. |
| EU AI Act | Partial | Article 13 (Transparency for high-risk systems) | Requires documentation of system limitations. | Does not specify reasoning chain validation. |
Regulators expect that compliance decisions (like SAR filing) are made using sound reasoning based on accurate facts. When an agent's reasoning chain is corrupted, the decision is no longer trustworthy, even if the agent claims high confidence.
In AML compliance, for example, FinCEN expects banks to file SARs when suspicious activity is detected. If an agent's corrupted reasoning chain causes the bank to miss a suspicious transaction, the bank has failed in its regulatory obligation. Under SR 11-7, this is a control failure requiring corrective action.
More broadly, reasoning chain corruption is a type of opaque error: the agent appears to be reasoning correctly (the chain is logical), but the chain's foundation is invalid. This is harder to detect than an obvious error and more dangerous because it is less likely to be caught by oversight.
Reasoning Chain Corruption requires architectural controls that go beyond what existing frameworks provide. Our advisory engagements are purpose-built for banks, insurers, and financial institutions subject to prudential oversight.
Schedule a Briefing