R-MC-08 · Multi-Agent & Coordination · DAMAGE 3.2 / High

Consensus Failure

Multiple agents produce divergent outputs for the same input, no consensus mechanism exists, and the system cannot determine which output is correct without frequent human intervention.

The Risk

When multiple agents are deployed to solve the same problem independently (for redundancy, accuracy, or risk mitigation), they may produce different outputs. Consensus Failure occurs when: (1) Agents are designed to be independent; (2) Agents are given the same input; (3) Agents produce different outputs; (4) No automated mechanism exists to determine which output is correct; and (5) Human arbitration is required for every disagreement.

This is distinct from adversarial dynamics (agents are not gaming each other) and conflicting objectives (agents are not optimized for different metrics). Rather, agents trained on the same task with the same objective still disagree. This is normal in ML (different training data, different initialization, different feature selection). The risk is that divergent outputs create bottlenecks where humans must arbitrate.

How It Materializes

An insurance company implements redundant claim adjudication agents for critical claim types (high-value claims above $500K). Claim-Agent-1, Claim-Agent-2, and Claim-Agent-3 are independently trained on different subsets of historical claims and are given identical claim data for a $750K commercial property insurance claim for fire loss in Miami.

Claim-Agent-1 produces: Approve $750K (full coverage applies). Claim-Agent-2 produces: Approve $675K (depreciation reduces payout). Claim-Agent-3 produces: Deny $0 (policy exclusion for "failure to maintain fire suppression system" applies). Three independent agents, three different outputs for the same claim.
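The divergence in this scenario is mechanically easy to detect, even though it is hard to resolve. A minimal sketch, using a hypothetical `ClaimDecision` structure and the three outputs above, shows the check an institution would run before routing a claim to human arbitration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ClaimDecision:
    """One agent's adjudication of a claim (hypothetical structure)."""
    agent: str
    action: str      # "approve" or "deny"
    payout: float    # approved amount in USD

decisions = [
    ClaimDecision("Claim-Agent-1", "approve", 750_000),
    ClaimDecision("Claim-Agent-2", "approve", 675_000),
    ClaimDecision("Claim-Agent-3", "deny", 0),
]

def is_divergent(decisions, payout_tolerance=0.0):
    """True when redundant agents disagree on action or payout."""
    actions = {d.action for d in decisions}
    if len(actions) > 1:
        return True
    payouts = [d.payout for d in decisions]
    return max(payouts) - min(payouts) > payout_tolerance

print(is_divergent(decisions))  # True: outputs diverge, so a human must arbitrate
```

The `payout_tolerance` parameter is an assumption, not part of the scenario: an institution might treat small payout differences as agreement while always escalating action-level conflicts.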

Over 6 months, 45 high-value claims are processed. 31 require human arbitration (69%) due to agent disagreement. Each adjuster spends 2-3 hours per week on agent arbitration. The institution has not improved claim processing speed; it has added human overhead. Additionally, human decisions on disputed claims are not consistent, creating fairness and discrimination risk.

DAMAGE Score Breakdown

| Dimension | Score | Rationale |
| --- | --- | --- |
| D - Detectability | 2 | Agent disagreement is easily observable (different outputs for the same input), but high disagreement rates may be tolerated as "expected variance." |
| A - Autonomy Sensitivity | 2 | Affects all agent types. Disagreement is independent of autonomy level. |
| M - Multiplicative Potential | 3 | Affects every input where multiple agents are deployed. Bottleneck scales with volume and disagreement rate. |
| A - Attack Surface | 1 | Not exploitable as an attack vector. |
| G - Governance Gap | 3 | Institutions may not have policies on acceptable agent disagreement rates or consensus mechanisms. |
| E - Enterprise Impact | 2 | Creates an operational bottleneck (human arbitration overhead). Does not directly impact compliance or security. |
| Composite DAMAGE Score | 3.2 | High. Requires dedicated mitigation controls and monitoring. |

Agent Impact Profile

How severity changes across the agent architecture spectrum.

| Agent Type | Impact | How This Risk Manifests |
| --- | --- | --- |
| Digital Assistant | Low | Multiple assistants can provide different perspectives; the human chooses. |
| Digital Apprentice | Low | Agents may produce different outputs but defer to the human. |
| Autonomous Agent | High | Multiple autonomous agents produce conflicting outputs; no consensus mechanism. |
| Delegating Agent | Medium | A single delegating agent; no direct consensus issue. |
| Agent Crew / Pipeline | Medium | Sequential agents in a pipeline do not produce redundant outputs unless the pipeline has branching. |
| Agent Mesh / Swarm | Critical | Multiple agents in a mesh may receive the same query and produce different outputs. A consensus mechanism must exist. |

Regulatory Framework Mapping

| Framework | Coverage | Citation | What It Addresses | What It Misses |
| --- | --- | --- | --- | --- |
| NIST AI RMF 1.0 | Minimal | MEASURE 5.2 | Performance evaluation and measurement. | Consensus mechanisms for redundant AI systems. |
| MAS AIRG | Minimal | Governance Framework | System governance. | Redundancy and consensus in multi-agent systems. |
| OWASP Agentic Top 10 | Not Directly | — | Security-focused. | Consensus and disagreement in redundant systems. |

Why This Matters in Regulated Industries

Regulated institutions often deploy redundant systems for safety. But redundancy without consensus mechanisms creates bottlenecks rather than safety. The institution must have a clear policy on how to handle agent disagreement.

Additionally, if human arbitration of agent disagreement is inconsistent, the institution creates fairness and discrimination risk. Two otherwise identical claims may be resolved differently solely because one triggered agent disagreement and the other did not. This creates systemic bias.

Controls & Mitigations

Design-Time Controls

  • Before deploying redundant agents, define acceptable disagreement rate and consensus threshold.
  • Implement voting or ensemble mechanisms that combine redundant agent outputs. Use weighted averaging, majority voting, or ensemble decision trees.
  • Design agents with sufficient independence that disagreement adds value but not so much that they make fundamentally different decisions.
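The voting and ensemble mechanisms above can be sketched in a few lines. This is a minimal illustration, not a production design: `majority_vote` handles categorical decisions (approve/deny), while `weighted_average` combines numeric outputs such as payout amounts; the weights are assumed to come from an institution's own agent-reliability policy.

```python
from collections import Counter

def majority_vote(labels):
    """Return the consensus label if a strict majority agrees, else None (escalate to a human)."""
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None

def weighted_average(values, weights):
    """Combine numeric agent outputs (e.g. payout amounts) by per-agent weight."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total

# Action consensus: two of three agents say "approve"
print(majority_vote(["approve", "approve", "deny"]))     # approve
# Payout consensus among the approving agents, equal weights
print(weighted_average([750_000, 675_000], [1.0, 1.0]))  # 712500.0
```

Returning `None` on a tie makes the escalation path explicit: only inputs without a majority consume human arbitration time, which is the bottleneck this control is meant to shrink.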

Runtime Controls

  • Monitor agent disagreement rate. Track the percentage of inputs where agents produce divergent outputs.
  • Implement confidence scoring: agents should output confidence in their recommendation. When agents disagree, use confidence scores to weight the consensus.
  • Log disagreement patterns: which agent pairs disagree frequently? Which input types trigger disagreement?
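The three runtime controls above can be combined into a small monitoring sketch. The class and threshold below are hypothetical (the acceptable disagreement rate is a policy value each institution must set); the confidence-weighted chooser assumes agents emit a calibrated confidence score alongside each output.

```python
class DisagreementMonitor:
    """Tracks the fraction of inputs on which redundant agents diverge."""

    def __init__(self, alert_threshold=0.2):  # assumed policy value
        self.total = 0
        self.divergent = 0
        self.alert_threshold = alert_threshold

    def record(self, outputs):
        """Record one input's set of redundant agent outputs."""
        self.total += 1
        if len(set(outputs)) > 1:
            self.divergent += 1

    @property
    def rate(self):
        return self.divergent / self.total if self.total else 0.0

    def alert(self):
        return self.rate > self.alert_threshold

def confidence_weighted_choice(outputs, confidences):
    """When agents disagree, pick the output with the highest total confidence."""
    scores = {}
    for out, conf in zip(outputs, confidences):
        scores[out] = scores.get(out, 0.0) + conf
    return max(scores, key=scores.get)

m = DisagreementMonitor(alert_threshold=0.5)
m.record(["approve", "approve", "deny"])
m.record(["approve", "approve", "approve"])
print(f"{m.rate:.0%}", m.alert())  # 50% False
print(confidence_weighted_choice(["approve", "deny"], [0.55, 0.90]))  # deny
```

A real deployment would also log which agent pairs disagreed on each input, so the patterns flagged in the last bullet can be analyzed offline.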

Detection & Response

  • Analyze the outcomes of human arbitration when agents disagree. Track whether humans tend to favor one agent over the others; if they consistently do, retire or retrain the non-favored agents.
  • Conduct post-hoc audits of disputed claims to identify whether human arbitration was consistent.
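The arbitration-outcome analysis above could be sketched as follows. The log format and agent names are hypothetical: each record pairs the set of disagreeing agents with the agent whose output the human upheld.

```python
from collections import Counter

def favored_agents(arbitration_log):
    """For each agent, compute how often its output was upheld
    by the human arbiter when it was party to a disagreement."""
    involved = Counter()
    upheld = Counter()
    for agents, winner in arbitration_log:
        involved.update(agents)
        upheld[winner] += 1
    return {a: upheld[a] / involved[a] for a in involved}

# Hypothetical arbitration log for the claim agents in the scenario
log = [
    (("Claim-Agent-1", "Claim-Agent-3"), "Claim-Agent-1"),
    (("Claim-Agent-1", "Claim-Agent-2"), "Claim-Agent-1"),
    (("Claim-Agent-2", "Claim-Agent-3"), "Claim-Agent-2"),
]
rates = favored_agents(log)
# Claim-Agent-1 upheld in 2/2 disputes, Agent-2 in 1/2, Agent-3 in 0/2
print(rates)
```

A consistently low uphold rate (here, Claim-Agent-3's 0%) is the signal the first bullet describes: the agent adds arbitration overhead without adding decision value.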

Address This Risk in Your Institution

Consensus Failure requires architectural controls that go beyond what existing frameworks provide. Our advisory engagements are purpose-built for banks, insurers, and financial institutions subject to prudential oversight.

Schedule a Briefing