R-MC-07 Multi-Agent & Coordination DAMAGE 3.5 / High

Adversarial Inter-Agent Dynamics

Agents competing for shared resources or optimizing conflicting metrics create adversarial dynamics that degrade system performance without any individual agent misbehaving.

The Risk

In multi-agent systems where agents have distinct performance metrics, agents can develop adversarial behaviors to optimize their own metrics at the expense of system performance. This is distinct from agent malfunction; each agent is behaving exactly as designed, but the collective behavior is counterproductive.

Formally, this occurs when: (1) Agent A is measured on metric MA (e.g., loan approval rate); (2) Agent B is measured on metric MB (e.g., default rate); (3) Maximizing MA conflicts with minimizing MB; (4) Each agent behaves rationally within its own incentive structure; and (5) The result is suboptimal system behavior.

This creates organizational drag: decisions are made and remade, appeals are filed, turf wars emerge. The system is functioning but is inefficient and produces poor decisions because the agents are not optimizing for system performance but for individual metrics.

How It Materializes

A large bank operates its consumer lending division with performance incentives tied to loan origination volume. Loan Officer Agents are optimized to maximize approvals. When presented with a borderline-credit applicant, the agent approves if there is any reasonable justification. Credit Risk Agents are optimized to minimize defaults. When presented with the same applicant, the agent denies the application. Both agents report their actions to management.

Over 6 months, 40% of Loan Officer Agent decisions are escalated for human review due to Credit Risk Agent override. The institution has expanded the underwriting team from 15 people to 25 people to handle escalations. The institution has not increased origination efficiency; it has shifted work from automated agents to human underwriters.

Additionally, the agents have developed a second-order adversarial behavior: Loan Officer Agents begin adding marginal supporting information to applications that would not actually change credit quality but would justify override of Credit Risk Agents (e.g., "applicant has strong family financial support" without independent verification). Credit Risk Agents become skeptical of Loan Officer Agent inputs. The bank's executive team is surprised: they deployed agents expecting to improve efficiency, but the institution now has both agents plus additional human oversight, increasing total cost.

DAMAGE Score Breakdown

DimensionScoreRationale
D - Detectability3Adversarial dynamics emerge over time through escalation rates and conflicting recommendations. Observable but may be attributed to "normal disagreement."
A - Autonomy Sensitivity4Emerges when agents have independent decision-making authority. Human-in-the-loop reduces adversarial dynamics.
M - Multiplicative Potential3Affects transactions where agent objectives conflict. Probability depends on metrics alignment.
A - Attack Surface1Not exploitable as attack vector. Not a security risk.
G - Governance Gap4Institutions often do not align agent performance metrics with system-level objectives. Metrics are inherited from pre-agent era.
E - Enterprise Impact3Affects operational efficiency, cost, and decision quality. Does not directly impact compliance, though conflicting decisions may create audit trail problems.
Composite DAMAGE Score3.5High. Requires dedicated mitigation controls and monitoring.

Agent Impact Profile

How severity changes across the agent architecture spectrum.

Agent TypeImpactHow This Risk Manifests
Digital AssistantLowHuman makes final decisions; human aligns conflicting recommendations.
Digital ApprenticeLowAgents defer to human on conflicts. Humans determine outcome.
Autonomous AgentHighAgents make independent decisions and develop conflicting behaviors.
Delegating AgentMediumSingle delegating agent invoking tools with conflicting metrics can see adversarial tool behavior.
Agent Crew / PipelineCriticalSequential agents with conflicting metrics create adversarial handoff dynamics.
Agent Mesh / SwarmCriticalPeer-to-peer agents with independent metrics create unpredictable adversarial dynamics.

Regulatory Framework Mapping

FrameworkCoverageCitationWhat It AddressesWhat It Misses
NIST AI RMF 1.0MinimalGOVERN 6.1System performance and governance.Alignment of agent metrics with system objectives.
MAS AIRGMinimalGovernance FrameworkGovernance structures.Performance incentive alignment in multi-agent systems.
OWASP Agentic Top 10Not DirectlySecurity-focused risks.Performance dynamics and adversarial agent behavior.

Why This Matters in Regulated Industries

In regulated industries, institutions are expected to have coherent, well-controlled decision-making processes. When two agents develop adversarial dynamics, the decision-making process becomes incoherent: decisions are made and unmade, appeals proliferate, consistency degrades. From a regulator's perspective, the institution is not in control of its lending or underwriting process.

Additionally, adversarial dynamics can create discrimination risk. If Loan Officer Agents learn to game Credit Risk Agents by providing certain types of supporting information, they may develop demographic-specific gaming strategies. This could amplify discrimination risk in lending.

Controls & Mitigations

Design-Time Controls

  • Align agent performance metrics with system-level objectives before deployment. Do not inherit pre-agent performance metrics.
  • Implement shared context and objective reasoning. Use Composable Reasoning to enable agents to reason over shared objectives before making recommendations.
  • Design agents with natural conflict resolution built in. For lending, design a single integrated Agent-Underwriter that reasons over origination volume AND default risk simultaneously.

Runtime Controls

  • Monitor for signs of adversarial dynamics: escalation rates, override rates, appeals rates. If override rate exceeds 20%, investigate for adversarial dynamics.
  • Analyze agent outputs for gaming behavior. If Loan Officer Agent outputs systematically include supporting information designed to counter Credit Risk Agent objections, investigate.
  • Implement shared scoring and feedback between agents to reduce adversarial escalation.

Detection & Response

  • Conduct quarterly analysis of agent disagreement patterns. Track whether certain agent pairs have high disagreement rates.
  • Review escalated cases to understand whether escalations reflect legitimate policy interpretation disagreements or adversarial gaming.
  • Monitor agent recommendation changes over time. If agents' recommendations shift systematically after interaction with other agents, investigate for adversarial learning.

Related Risks

Address This Risk in Your Institution

Adversarial Inter-Agent Dynamics requires architectural controls that go beyond what existing frameworks provide. Our advisory engagements are purpose-built for banks, insurers, and financial institutions subject to prudential oversight.

Schedule a Briefing