Multiple agents produce divergent outputs for the same input, and no consensus mechanism exists, so the system cannot determine which output is correct without a high rate of human intervention.
When multiple agents are deployed to solve the same problem independently (for redundancy, accuracy, or risk mitigation), they may produce different outputs. Consensus Failure occurs when: (1) Agents are designed to be independent; (2) Agents are given the same input; (3) Agents produce different outputs; (4) No automated mechanism exists to determine which output is correct; and (5) Human arbitration is required for every disagreement.
This is distinct from adversarial dynamics (agents are not gaming each other) and conflicting objectives (agents are not optimized for different metrics). Rather, agents trained on the same task with the same objective still disagree. This is normal in ML (different training data, different initialization, different feature selection). The risk is that divergent outputs create bottlenecks where humans must arbitrate.
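The missing piece can be made concrete. A minimal sketch of an automated consensus mechanism, assuming categorical agent outputs and a hypothetical `resolve` helper (quorum voting is one of several possible designs):

```python
from collections import Counter

def resolve(outputs, quorum=2):
    """Return (consensus_value, needs_human): the plurality output if at
    least `quorum` independent agents agree on it, else escalate."""
    value, count = Counter(outputs).most_common(1)[0]
    if count >= quorum:
        return value, False   # automated consensus reached
    return None, True         # consensus failure: human arbitration

resolve(["approve", "approve", "deny"])  # → ("approve", False)
resolve(["approve", "reduce", "deny"])   # → (None, True)
```

The second call illustrates the failure mode this risk describes: three independent agents, no two in agreement, so every such case falls through to a human.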
An insurance company implements redundant claim adjudication agents for critical claim types (high-value claims above $500K). Claim-Agent-1, Claim-Agent-2, and Claim-Agent-3 are independently trained on different subsets of historical claims and are given identical claim data for a $750K commercial property insurance claim for fire loss in Miami.
Claim-Agent-1 produces: Approve $750K (full coverage applies). Claim-Agent-2 produces: Approve $675K (depreciation reduces payout). Claim-Agent-3 produces: Deny $0 (policy exclusion for "failure to maintain fire suppression system" applies). Three independent agents, three different outputs for the same claim.
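For numeric outputs like payout amounts, exact-match voting never succeeds, so a tolerance band around the median proposal is a common alternative. A sketch of that check against the three outputs above (the 10% tolerance and the `within_tolerance` helper are illustrative assumptions, not a recommended threshold):

```python
import statistics

def within_tolerance(payouts, rel_tol=0.10):
    """True if every proposed payout lies within rel_tol of the median
    proposal; False signals consensus failure requiring arbitration."""
    med = statistics.median(payouts)
    if med == 0:
        return all(p == 0 for p in payouts)
    return all(abs(p - med) / med <= rel_tol for p in payouts)

within_tolerance([750_000, 675_000, 0])  # → False: the $0 denial breaks any band
```

Note that the denial is not merely a low number: Claim-Agent-3 applied a categorical policy exclusion, and categorical disagreement cannot be reconciled numerically at all, which is why this claim needs a human regardless of tolerance settings.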
Over 6 months, 45 high-value claims are processed. 31 require human arbitration (69%) due to agent disagreement. Each adjuster spends 2-3 hours per week on agent arbitration. The insurer has not improved claim processing speed; it has added human overhead. Worse, human decisions on disputed claims are inconsistent, creating fairness and discrimination risk.
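The 69% figure is exactly the kind of metric that should be tracked continuously rather than discovered after six months. A monitoring sketch, with the alert threshold as a hypothetical policy parameter:

```python
def arbitration_rate(arbitrated: int, total: int) -> float:
    """Fraction of redundant-agent decisions escalated to humans."""
    return arbitrated / total if total else 0.0

rate = arbitration_rate(31, 45)   # the six-month figures above, ~0.69
ALERT_THRESHOLD = 0.20            # hypothetical policy limit

if rate > ALERT_THRESHOLD:
    print(f"Consensus failure rate {rate:.0%} exceeds policy limit "
          f"{ALERT_THRESHOLD:.0%}: review agent redundancy design")
```

A sustained rate above the threshold indicates the redundancy architecture is generating overhead rather than assurance, and should trigger a design review rather than more arbitration staffing.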
| Dimension | Score | Rationale |
|---|---|---|
| D - Detectability | 2 | Agent disagreement is easily observable (different outputs for same input). But high disagreement rates may be tolerated as "expected variance." |
| A - Autonomy Sensitivity | 2 | Affects all agent types. Disagreement is independent of autonomy level. |
| M - Multiplicative Potential | 3 | Affects every input where multiple agents are deployed. Bottleneck scales with volume and disagreement rate. |
| A - Attack Surface | 1 | Not exploitable as attack vector. |
| G - Governance Gap | 3 | Institutions may not have policies on acceptable agent disagreement rate or consensus mechanisms. |
| E - Enterprise Impact | 2 | Creates operational bottleneck (human arbitration overhead). Does not directly impact compliance or security. |
| Composite DAMAGE Score | 3.2 | High. Requires dedicated mitigation controls and monitoring. |
Severity changes across the agent architecture spectrum:
| Agent Type | Impact | How This Risk Manifests |
|---|---|---|
| Digital Assistant | Low | Multiple assistants can provide different perspectives; human chooses. |
| Digital Apprentice | Low | Agents may produce different outputs but defer to human. |
| Autonomous Agent | High | Multiple autonomous agents produce conflicting outputs; no consensus mechanism. |
| Delegating Agent | Medium | A single delegating agent has no redundancy itself, but if it assigns the same subtask to multiple sub-agents, their conflicting results raise the same arbitration problem. |
| Agent Crew / Pipeline | Medium | Sequential agents in pipeline do not produce redundant outputs unless pipeline has branching. |
| Agent Mesh / Swarm | Critical | Multiple agents in mesh may receive same query and produce different outputs. Consensus mechanism must exist. |
| Framework | Coverage | Citation | What It Addresses | What It Misses |
|---|---|---|---|---|
| NIST AI RMF 1.0 | Minimal | MEASURE 5.2 | Performance evaluation and measurement. | Consensus mechanisms for redundant AI systems. |
| MAS AIRG | Minimal | Governance Framework | System governance. | Redundancy and consensus in multi-agent systems. |
| OWASP Agentic Top 10 | Not Directly | — | Security-focused threat enumeration. | Consensus and disagreement in redundant systems. |
Regulated institutions often deploy redundant systems for safety. But redundancy without consensus mechanisms creates bottlenecks rather than safety. The institution must have a clear policy on how to handle agent disagreement.
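What such a policy might look like operationally, sketched with hypothetical tier names and an in-memory audit log (no framework above prescribes these tiers):

```python
from collections import Counter
from enum import Enum

class Tier(Enum):
    AUTO_ACCEPT = 1    # unanimous agreement: execute without review
    MAJORITY = 2       # quorum reached: execute, log dissent for audit
    HUMAN_REVIEW = 3   # no quorum: block the decision and escalate

audit_log = []

def apply_policy(case_id, outputs, quorum=2):
    """Classify a redundant-agent decision under a tiered disagreement policy."""
    top, count = Counter(outputs).most_common(1)[0]
    if count == len(outputs):
        tier = Tier.AUTO_ACCEPT
    elif count >= quorum:
        tier = Tier.MAJORITY
    else:
        tier = Tier.HUMAN_REVIEW
    # Every case is logged so human arbitration outcomes can later be
    # audited for consistency across similarly situated claims.
    audit_log.append((case_id, tuple(outputs), tier))
    return tier
```

The audit log is the point, not an afterthought: it is what lets the institution detect the inconsistent-arbitration pattern described below.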
Additionally, if human arbitration of agent disagreement is inconsistent, the institution creates fairness and discrimination risk. Two claims handled identically except that one involves disagreeing agents and the other does not may be resolved differently. This creates systemic bias.
Consensus Failure requires architectural controls that go beyond what existing frameworks provide. Our advisory engagements are purpose-built for banks, insurers, and financial institutions subject to prudential oversight.
Schedule a Briefing