R-QM-04 Quality & Measurement DAMAGE 3.6 / High

Measurement Absence

No sigma-level quality measurement exists for the agentic process. The organization cannot quantify whether the agent is operating at 2 sigma or 4 sigma.

The Risk

If you cannot measure quality, you cannot manage it. Traditional business processes are measured using Six Sigma metrics: first-pass yield, defects per million opportunities, cycle time, and more. These metrics allow organizations to understand the current state of the process, set targets, and track improvement.

Many organizations deploying agents do not measure agent quality using sigma methodology. They might measure "accuracy" (percentage of correct outputs) or "customer satisfaction," but these are not the same as a sigma level. An agent with 95% accuracy produces 50,000 defects per million opportunities (DPMO), which is approximately 3.1 sigma under the conventional 1.5-sigma shift. An organization that reports "our agent is 95% accurate" may not realize that 95% accuracy is a mediocre performance level by Six Sigma standards.
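The conversion from an accuracy figure to a sigma level can be sketched in a few lines. This is a minimal illustration, not a prescribed method: it assumes one defect opportunity per agent output and the conventional 1.5-sigma long-term shift, and the function names `dpmo` and `sigma_level` are hypothetical.

```python
from statistics import NormalDist

SIGMA_SHIFT = 1.5  # assumption: conventional long-term shift used in Six Sigma tables


def dpmo(defects: int, units: int, opportunities_per_unit: int = 1) -> float:
    """Defects per million opportunities."""
    return defects / (units * opportunities_per_unit) * 1_000_000


def sigma_level(dpmo_value: float, shift: float = SIGMA_SHIFT) -> float:
    """Convert DPMO to a short-term sigma level (normal quantile plus shift)."""
    if not 0 < dpmo_value < 1_000_000:
        raise ValueError("DPMO must be strictly between 0 and 1,000,000")
    process_yield = 1 - dpmo_value / 1_000_000
    return NormalDist().inv_cdf(process_yield) + shift


# Compare a few accuracy figures on the sigma scale.
for accuracy in (0.88, 0.95, 0.99379):
    d = (1 - accuracy) * 1_000_000
    print(f"{accuracy:.3%} accurate -> {d:,.0f} DPMO -> {sigma_level(d):.2f} sigma")
```

Reporting DPMO and sigma alongside accuracy puts the agent on the same scale as the organization's other measured processes.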

The governance gap is: "We do not have a structured, measurable quality target for the agent. We assume the agent is good enough because humans reviewed it and thought it was fine. But we cannot quantify how good is 'good enough.'"

How It Materializes

A payments company deploys an agentic dispute resolution system to process customer disputes about transactions. The agent is designed to: (1) receive a dispute, (2) gather evidence from transaction logs and chargeback networks, (3) assess the likelihood that the customer's claim is valid, and (4) recommend a resolution (credit the customer's account, deny the dispute, or escalate for manual review).

The company's quality team tests the agent on a sample of 100 disputes with known outcomes (disputes that were previously resolved by human specialists). The agent's recommendations match the specialist's resolutions 88 times out of 100. The quality team concludes: "The agent is 88% accurate. This is acceptable; we will deploy it."
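A sample of 100 disputes also leaves considerable statistical uncertainty around the 88% figure. As a hedged illustration, the sketch below computes a Wilson score interval for 88 matches out of 100; the choice of the Wilson method and the 95% confidence level are assumptions made here, not part of the company's process, and `wilson_interval` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, trials: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    return center - half_width, center + half_width


low, high = wilson_interval(88, 100)
print(f"88/100 matches: 95% interval {low:.1%} to {high:.1%}")
```

Under these assumptions the plausible range runs from roughly 80% to 93%, so a single 100-case test cannot by itself establish the agent's quality level.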

The company deploys the agent to production. The agent processes thousands of disputes per month. But the company has no ongoing quality measurement. The quality team does not track how often the agent's recommendations match the specialists' prior recommendations, or how often customers challenge the agent's resolutions.

After 6 months, a regulatory audit occurs. The regulator asks: "What is the sigma level of your dispute resolution process?" The company does not know. The regulator asks: "How many defects per million opportunities does your agent produce?" The company does not know. The regulator asks: "How do you know the agent is operating at an acceptable quality level?" The company responds: "We tested it and got 88% accuracy."

The regulator is not satisfied. Under consumer protection regulations, disputes must be resolved fairly and accurately. An 88% accuracy rate (approximately 2.7 sigma, or 120,000 DPMO) means that about 1 in 8 disputes may be resolved incorrectly. The regulator cites the company for inadequate quality measurement and inadequate controls.

The company is required to implement sigma-level quality measurement, establish a quality target, and measure against that target. Retroactive review of the agent's prior resolutions reveals that approximately 12% were potentially incorrect. The company must contact thousands of customers to inform them of potential errors and offer remediation.

DAMAGE Score Breakdown

Dimension | Score | Rationale
D - Detectability | 5 | If measurement does not exist, the gap is not detectable until a regulator or audit uncovers it.
A - Autonomy Sensitivity | 4 | Measurement absence is particularly acute for autonomous agents that operate without human oversight.
M - Multiplicative Potential | 4 | Without measurement, quality degradation can accumulate undetected.
A - Attack Surface | 5 | Any agent without explicit sigma measurement is exposed. Most agents fall into this category.
G - Governance Gap | 5 | This is the core governance gap: agent quality is not measured using the same rigor as business processes.
E - Enterprise Impact | 4 | Operating without measurement means the organization does not know whether the agent is operating acceptably, leading to regulatory violations and customer harm.
Composite DAMAGE Score | 3.6 | High. Requires immediate implementation of sigma-level quality measurement for all agent deployments.

Agent Impact Profile

How severity changes across the agent architecture spectrum.

Agent Type | Impact | How This Risk Manifests
Digital Assistant | Low | Humans review outputs and notice quality issues.
Digital Apprentice | Medium | Limited autonomy; quality issues are bounded.
Autonomous Agent | Critical | Autonomous decisions with no sigma measurement.
Delegating Agent | Critical | Dynamic invocation with no end-to-end sigma measurement.
Agent Crew / Pipeline | Critical | No measurement of pipeline sigma.
Agent Mesh / Swarm | Critical | Distributed operation with no coordinated quality measurement.
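One way to see why multi-step architectures rate Critical is rolled throughput yield, the product of the per-step yields. The sketch below is illustrative only: it assumes the steps fail independently, that a defect at any step makes the final output defective, and a hypothetical four-agent pipeline in which each step is 95% accurate.

```python
from math import prod
from statistics import NormalDist


def rolled_throughput_yield(step_yields: list[float]) -> float:
    """End-to-end yield when every step must succeed (independence assumed)."""
    return prod(step_yields)


def sigma_from_yield(process_yield: float, shift: float = 1.5) -> float:
    """Sigma level for a yield strictly between 0 and 1, with the 1.5-sigma shift."""
    return NormalDist().inv_cdf(process_yield) + shift


steps = [0.95, 0.95, 0.95, 0.95]  # hypothetical per-agent yields
rty = rolled_throughput_yield(steps)
print(f"End-to-end yield: {rty:.1%}, sigma level: {sigma_from_yield(rty):.2f}")
```

Four steps that each look acceptable at 95% combine to about 81% end to end, roughly 2.4 sigma, which is why measuring each agent in isolation understates pipeline risk.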

Regulatory Framework Mapping

Framework | Coverage | Citation | What It Addresses | What It Misses
NIST AI RMF 1.0 | Addressed | Performance monitoring and measurement of AI systems | AI system measurement and monitoring. | Specific requirement for sigma-level quality measurement.
ISO 42001 | Partial | Section 8.5, Performance monitoring and measurement | Performance measurement. | Sigma methodology and DPMO measurement.
Dodd-Frank Section 165 | Addressed | Effective risk management and controls | Risk management effectiveness. | Measurement of AI system quality.
GLBA Section 501 | Addressed | Safeguards and security of customer information and operations | Operational safeguards. | Measurement of automated decision quality.
GDPR Article 22 | Addressed | Right to explanation and oversight of automated decisions | Oversight and explanation. | Measurement of automated decision quality.

Why This Matters in Regulated Industries

Regulators increasingly expect institutions to apply the same governance rigor to AI systems as they do to other critical processes. If an institution measures its transaction processing against a 6 sigma target, it should also measure its AI systems in sigma terms. If it does not, regulators interpret this as inadequate governance.

The regulatory expectation is clear: any system that makes decisions affecting customers must have a defined quality target and measured performance against that target. Measurement absence means the institution is flying blind.

Controls & Mitigations

Design-Time Controls

  • Establish a quality target (sigma level) for every agent before deployment. Define the target based on the criticality of the agent's decisions and the risk tolerance of the organization. A minimum of 4 sigma is recommended for autonomous decisions in regulated contexts.
  • Implement measurement as a prerequisite for deployment: do not deploy an agent until you can measure its performance against the quality target.
  • Define success criteria for the deployment: e.g., "The agent must maintain 4 sigma quality for 90 days before we consider it production-ready."
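A deployment gate such as "4 sigma for 90 days" can be checked mechanically. The sketch below is one possible interpretation, not a prescribed rule: it pools the most recent 90 daily records and compares the pooled DPMO against the DPMO implied by the target sigma (with the 1.5-sigma shift). The `DailyQuality` record and the `passes_gate` function are hypothetical names.

```python
from dataclasses import dataclass
from statistics import NormalDist


@dataclass
class DailyQuality:
    decisions: int  # agent decisions reviewed that day
    defects: int    # decisions judged incorrect


def max_dpmo_for(target_sigma: float, shift: float = 1.5) -> float:
    """Highest DPMO that still meets the target sigma level."""
    return (1 - NormalDist().cdf(target_sigma - shift)) * 1_000_000


def passes_gate(history: list[DailyQuality], target_sigma: float = 4.0,
                required_days: int = 90) -> bool:
    """True when the last `required_days` of pooled data meet the target."""
    if len(history) < required_days:
        return False  # not enough evidence yet
    window = history[-required_days:]
    decisions = sum(day.decisions for day in window)
    defects = sum(day.defects for day in window)
    if decisions == 0:
        return False
    observed_dpmo = defects / decisions * 1_000_000
    return observed_dpmo <= max_dpmo_for(target_sigma)


history = [DailyQuality(decisions=1000, defects=5) for _ in range(90)]
print("Production-ready:", passes_gate(history))
```

Pooling the window avoids failing the gate on a single low-volume day; a stricter variant could require every day, or every rolling sub-window, to meet the target.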

Runtime Controls

  • Deploy quality measurement systems for every agent in production. Measure quality using DPMO, first-pass yield, accuracy, and other sigma metrics. Make these metrics visible to the agent governance team.
  • Establish automated alerts for quality degradation: if an agent's sigma level drops below the target, escalate immediately and reduce agent autonomy pending investigation.
  • Implement regular quality reviews (weekly or monthly): measure the sigma level of each agent, compare to target, and identify agents that are degrading.
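The alerting control above can be sketched as a rolling-window monitor. This is a minimal illustration under stated assumptions: each decision is eventually labeled as defective or not (for example by sampling and specialist review), the window size is arbitrary, and `SigmaMonitor` is a hypothetical class rather than an existing library.

```python
from collections import deque
from statistics import NormalDist


class SigmaMonitor:
    """Tracks the defect rate over the most recent `window` reviewed decisions."""

    def __init__(self, target_sigma: float = 4.0, window: int = 5000, shift: float = 1.5):
        # Defect rate implied by the target sigma level (1.5-sigma shift assumed).
        self.max_defect_rate = 1 - NormalDist().cdf(target_sigma - shift)
        self.outcomes: deque[bool] = deque(maxlen=window)
        self.defects = 0

    def record(self, is_defect: bool) -> bool:
        """Record one reviewed decision; return True when an alert should fire."""
        if len(self.outcomes) == self.outcomes.maxlen:
            self.defects -= self.outcomes[0]  # oldest outcome is about to be evicted
        self.outcomes.append(is_defect)
        self.defects += is_defect
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # wait for a full window before judging
        return self.defects / len(self.outcomes) > self.max_defect_rate


monitor = SigmaMonitor(target_sigma=4.0, window=100)
for _ in range(100):
    monitor.record(False)
if monitor.record(True):
    print("Alert: sigma level below target; escalate and reduce agent autonomy")
```

The monitor only signals; the escalation and autonomy reduction remain decisions for the calling governance workflow.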

Detection & Response

  • Maintain a quality dashboard visible to all stakeholders (operations, compliance, executive leadership). The dashboard shows each agent's sigma level and whether it meets the target.
  • When quality degrades, implement root cause analysis: has the agent's model drifted? Has input data quality degraded? Has the agent's reasoning become unstable?
  • Conduct quarterly quality audits: independently verify the sigma measurement methodology. Ensure that measurement is rigorous and not biased.
