R-QM-03 Quality & Measurement DAMAGE 3.8 / High

Agent Sigma Compounding

When agents operate in sequence, each introduces execution uncertainty, and those uncertainties compound multiplicatively: the quality of the chain can be far lower than the quality of any individual agent.

The Risk

When agents operate independently, their individual quality (sigma level) describes their performance. By Six Sigma convention (which includes the standard 1.5-sigma long-term shift), an agent at 4 sigma produces 6,210 defects per million opportunities (DPMO), roughly 99.38% accuracy. A decision made by a single such agent therefore carries 4-sigma quality.
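The sigma-to-DPMO relationship can be sketched with Python's standard library. This is a minimal illustration of the conventional conversion (including the 1.5-sigma shift), not code from any particular quality-management tool:

```python
from statistics import NormalDist

# Six Sigma convention: a process reported "at sigma S" yields the normal
# area up to (S - 1.5), reflecting the customary 1.5-sigma long-term shift.
SHIFT = 1.5

def dpmo_from_sigma(sigma: float) -> float:
    """Defects per million opportunities implied by a sigma level."""
    return (1.0 - NormalDist().cdf(sigma - SHIFT)) * 1_000_000

def sigma_from_dpmo(dpmo: float) -> float:
    """Sigma level implied by an observed defect rate."""
    return NormalDist().inv_cdf(1.0 - dpmo / 1_000_000) + SHIFT

# A 4-sigma agent: ~6,210 DPMO, i.e. ~99.38% accuracy.
print(round(dpmo_from_sigma(4.0)))      # 6210
print(round(sigma_from_dpmo(6210), 2))  # 4.0
```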

But when agents operate in sequence (Agent A does task X, Agent B does task Y based on output of Agent A, Agent C does task Z based on output of Agent B), the quality compounds multiplicatively, not additively. Each agent in the sequence has the potential to introduce errors. An error from Agent A becomes corrupted input to Agent B, which amplifies the error or creates new errors.

For example, if each agent operates at 4 sigma (99.38% accuracy) and three agents operate in sequence, the final output quality is not 4 sigma but approximately 3.6 sigma (99.38% × 99.38% × 99.38% ≈ 98.15% accuracy). The more agents in the sequence, the more quality degrades.

This compounding is particularly acute in agent crews or pipelines where agents hand off work to each other. If the pipeline has 10 agents, each at 4 sigma, the final output is approximately 3.1 sigma, and the defect rate has risen nearly tenfold, from 6,210 to roughly 60,000 DPMO. Regulators investigating such systems will ask: "You deployed 10 agents in sequence. Did you account for quality compounding? Is your final output actually 3.1 sigma instead of 4 sigma?"
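The compounding arithmetic above can be computed directly: per-agent yields multiply across handoffs, and the product converts back to a sigma level. A minimal sketch using Python's standard library (the 1.5-sigma shift is the usual Six Sigma convention):

```python
from statistics import NormalDist

SHIFT = 1.5  # conventional long-term sigma shift

def pipeline_sigma(agent_sigma: float, n_agents: int) -> float:
    """End-to-end sigma of n_agents chained in sequence, each at agent_sigma."""
    per_agent_yield = NormalDist().cdf(agent_sigma - SHIFT)
    end_to_end_yield = per_agent_yield ** n_agents  # yields multiply across handoffs
    return NormalDist().inv_cdf(end_to_end_yield) + SHIFT

# Three 4-sigma agents in sequence: ~98.15% accuracy, roughly 3.6 sigma.
print(round(pipeline_sigma(4.0, 3), 2))
# Ten 4-sigma agents: roughly 3.1 sigma.
print(round(pipeline_sigma(4.0, 10), 2))
```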

How It Materializes

A large insurance company implements a claims processing pipeline with multiple agentic stages. The pipeline works as follows: (1) Agent A receives the claim and extracts key information (claimant name, policy number, claim amount, claim type); (2) Agent B verifies the claim is valid for the policy type (policy active, claim type covered, amount within limits); (3) Agent C performs fraud risk assessment; (4) Agent D prepares claim documents and schedules payment; (5) Agent E monitors claim status and sends updates to the claimant.

Each agent is individually monitored and tuned to roughly 4.2 sigma performance (99.6% accuracy). The quality team is confident that each agent is performing well.

However, the quality team does not measure the end-to-end pipeline performance. When Agent A extracts "claim amount = $5,000" incorrectly (the actual claim is for $50,000, but Agent A extracted "5,000"), Agent B receives the incorrect amount and approves a claim that should have been routed for special review. Agent C performs fraud assessment on the understated claim and gives it a low fraud risk. Agent D schedules payment for $5,000 instead of $50,000. Agent E notifies the claimant that $5,000 will be paid.

When the quality team measures the end-to-end pipeline (from claim received to payment issued), they discover that the pipeline is operating at roughly 3.6 sigma (98% accuracy: 99.6% compounded over five stages is about 98%). This means that 1 in 50 claims has an error (20,000 DPMO), not roughly 1 in 250 (the 4,000 DPMO implied by 99.6% per agent). The company's operational costs for claims corrections have tripled.

Under insurance regulations, claims must be processed accurately. An error rate of 2% is unacceptable. Regulators investigating the claims processing pipeline will cite the poor end-to-end quality as evidence of inadequate process control.

DAMAGE Score Breakdown

Dimension Score Rationale
D - Detectability 4 Compounding is detectable through end-to-end process sigma monitoring, but many organizations only monitor individual agent performance, not pipeline performance.
A - Autonomy Sensitivity 3 Both autonomous and supervised agents contribute to compounding. But autonomous pipelines that do not have human review at each handoff are more likely to propagate errors.
M - Multiplicative Potential 5 Compounding is multiplicative by definition. A 10-agent pipeline compounds quality dramatically.
A - Attack Surface 4 Any multi-agent pipeline is exposed. As agent orchestration becomes common, the surface expands.
G - Governance Gap 5 Agent governance typically focuses on individual agent performance. End-to-end process sigma monitoring is rare. Organizations deploy pipelines without measuring end-to-end quality.
E - Enterprise Impact 4 Degraded end-to-end quality results in operational errors, customer impact, and regulatory violations.
Composite DAMAGE Score 3.8 High. Requires end-to-end pipeline sigma measurement and quality gates between agents.

Agent Impact Profile

How severity changes across the agent architecture spectrum.

Agent Type Impact How This Risk Manifests
Digital Assistant Low Humans review at each step, breaking the compounding chain.
Digital Apprentice Low Limited pipeline depth; compounding is bounded.
Autonomous Agent Medium Single autonomous agent; no compounding within agent. But if multiple agents coordinate, compounding is possible.
Delegating Agent High Delegation to sub-agents creates implicit handoff chains; errors compound across each delegation boundary even without a formal pipeline.
Agent Crew / Pipeline Critical Multiple agents in sequence; compounding is the defining characteristic.
Agent Mesh / Swarm Critical Dynamic delegation can create complex handoff chains with significant compounding.

Regulatory Framework Mapping

Framework Coverage Citation What It Addresses What It Misses
ISO 42001 Partial Section 8.5, Performance monitoring and optimization AI system performance. End-to-end process sigma and compounding in multi-agent pipelines.
NIST AI RMF 1.0 Partial Performance monitoring of AI systems Performance monitoring. Pipeline-level sigma measurement.
Dodd-Frank Section 165 Addressed Operational resilience and effective controls Operational controls and effectiveness. Process quality degradation in multi-agent systems.
FFIEC Business Continuity Partial System performance and operational effectiveness Performance and effectiveness. Multi-agent pipeline sigma degradation.

Why This Matters in Regulated Industries

In regulated industries, process quality is non-negotiable. Regulators expect institutions to measure and control the quality of their operational processes. When an institution deploys a multi-agent pipeline without measuring end-to-end quality, it is operating blind. If errors accumulate at a higher rate than expected, regulators will cite this as a control failure.

The challenge is that compounding is subtle. Each individual agent is performing well. But the pipeline is not. Regulators will ask: "Why did you not measure end-to-end quality? Did you not anticipate that quality would compound across agents?"

Controls & Mitigations

Design-Time Controls

  • Before deploying a multi-agent pipeline, calculate the expected end-to-end sigma based on individual agent sigmas. If the pipeline has N agents, each at sigma S, the pipeline sigma is approximately S - log10(N). (This rule of thumb holds roughly in the 3-4.5 sigma band, where DPMO rises about tenfold per sigma lost; compute the exact figure from compounded yields for design decisions.)
  • If end-to-end sigma would be unacceptable, redesign the pipeline: reduce the number of agents, improve individual agent quality, or add human review checkpoints between agents to break the compounding chain.
  • Implement "quality buffer" stages: between high-error-potential agents, add a human review or validation stage. This breaks the compounding chain and allows early error detection.
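The design-time check in the first bullet could be sketched as follows; the function names and the 3.5-sigma acceptance threshold are illustrative, not a standard:

```python
import math
from statistics import NormalDist

SHIFT = 1.5  # conventional Six Sigma long-term shift

def expected_pipeline_sigma(agent_sigma: float, n_agents: int) -> float:
    """Exact expected end-to-end sigma for n_agents, each at agent_sigma."""
    y = NormalDist().cdf(agent_sigma - SHIFT) ** n_agents
    return NormalDist().inv_cdf(y) + SHIFT

def passes_design_gate(agent_sigma: float, n_agents: int, min_sigma: float) -> bool:
    """Design-time check: is the projected pipeline sigma acceptable?"""
    return expected_pipeline_sigma(agent_sigma, n_agents) >= min_sigma

# Rule of thumb from the text: pipeline sigma drops by roughly log10(N).
def rule_of_thumb(agent_sigma: float, n_agents: int) -> float:
    return agent_sigma - math.log10(n_agents)

# Ten 4-sigma agents fall short of a 3.5-sigma target; two agents do not.
print(passes_design_gate(4.0, 10, min_sigma=3.5))  # False
print(passes_design_gate(4.0, 2, min_sigma=3.5))   # True
print(round(rule_of_thumb(4.0, 10), 1))            # 3.0
```

When the gate fails, the bullet's remedies apply: fewer agents, better per-agent sigma, or human checkpoints that break the chain.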

Runtime Controls

  • Deploy end-to-end process sigma monitoring for all pipelines. Measure the sigma of the final output by tracking errors detected at the end of the pipeline or by customers.
  • Implement per-agent contribution monitoring: when an error is detected in the final output, trace the error back through the pipeline to identify which agent(s) contributed.
  • Implement quality gates between agents: after each agent in the pipeline, validate the output against the expected data specification. If output is invalid, escalate for manual intervention rather than passing to the next agent.
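A quality gate of the kind described in the last bullet might look like the sketch below. The field names, the $25,000 auto-processing limit, and the escalation mechanism are all hypothetical, chosen to match the claims-pipeline scenario earlier in this section:

```python
# Illustrative inter-agent quality gate; field names and limits are hypothetical.
REQUIRED_FIELDS = {"claimant_name", "policy_number", "claim_amount", "claim_type"}
MAX_AUTO_AMOUNT = 25_000  # amounts above this are routed for special review

def quality_gate(claim: dict) -> tuple[bool, str]:
    """Validate one agent's output before handing it to the next agent.

    Returns (ok, reason). ok=False means: escalate for manual intervention,
    do not pass the output downstream.
    """
    missing = REQUIRED_FIELDS - claim.keys()
    if missing:
        return False, f"missing fields: {sorted(missing)}"
    amount = claim["claim_amount"]
    if not isinstance(amount, (int, float)) or amount <= 0:
        return False, "claim_amount must be a positive number"
    if amount > MAX_AUTO_AMOUNT:
        return False, "amount exceeds auto-processing limit; route for review"
    return True, "ok"

# A $50,000 claim is escalated instead of being auto-paid at the wrong amount.
ok, reason = quality_gate({"claimant_name": "A. Example", "policy_number": "P-123",
                           "claim_amount": 50_000, "claim_type": "property"})
print(ok, reason)
```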

Detection & Response

  • Maintain a pipeline sigma dashboard that tracks end-to-end quality for each multi-agent pipeline. Escalate if pipeline sigma drops below target.
  • When errors are detected in pipeline outputs, perform root cause analysis to identify which agent(s) in the pipeline introduced the error.
  • Establish a threshold for pipeline rework: if rework costs (correcting errors detected after the pipeline) exceed 10% of pipeline throughput, redesign the pipeline to reduce compounding or add quality gates.
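The dashboard escalation rule in the first bullet reduces to converting observed end-of-pipeline defect counts into a sigma level and comparing against target. A minimal sketch (the 4.0-sigma target is illustrative):

```python
from statistics import NormalDist

SHIFT = 1.5  # conventional Six Sigma long-term shift

def observed_sigma(defects: int, opportunities: int) -> float:
    """Sigma level implied by defects observed at the end of the pipeline."""
    dpmo = defects / opportunities * 1_000_000
    return NormalDist().inv_cdf(1.0 - dpmo / 1_000_000) + SHIFT

def should_escalate(defects: int, opportunities: int, target_sigma: float) -> bool:
    """Dashboard rule: escalate when end-to-end sigma falls below target."""
    return observed_sigma(defects, opportunities) < target_sigma

# 400 erroneous claims out of 20,000 processed -> 20,000 DPMO, ~3.55 sigma,
# below a 4.0-sigma target, so the pipeline is flagged.
print(should_escalate(400, 20_000, target_sigma=4.0))  # True
```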

Address This Risk in Your Institution

Agent Sigma Compounding requires end-to-end pipeline quality measurement and quality gates that existing agent frameworks do not mandate. Our advisory engagements are purpose-built for banks, insurers, and financial institutions subject to prudential oversight.
