R-QM-04 Quality & Measurement DAMAGE 3.6 / High

Measurement Absence

No sigma-level quality measurement exists for the agentic process. The organization cannot quantify whether the agent is operating at 2 sigma or 4 sigma.

The Risk

If you cannot measure quality, you cannot manage it. Traditional business processes are measured using Six Sigma metrics: first-pass yield, defects per million opportunities, cycle time, and more. These metrics allow organizations to understand the current state of the process, set targets, and track improvement.

Many organizations deploying agents do not measure agent quality using sigma methodology. They might measure "accuracy" (percentage of correct outputs) or "customer satisfaction," but neither is a sigma level. An agent with 95% accuracy produces 50,000 defects per million opportunities (DPMO), which is approximately 3.1 sigma under the conventional 1.5-sigma shift. An organization that reports "our agent is 95% accurate" may not realize that this is a mediocre performance level by Six Sigma standards.
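The conversion from accuracy to DPMO and sigma level can be sketched as follows. This is a minimal illustration, assuming one defect opportunity per agent output and the conventional 1.5-sigma long-term shift; the function names are hypothetical and not part of any standard library.

```python
from statistics import NormalDist

# Assumption: the conventional 1.5-sigma long-term shift used in Six Sigma tables.
SIGMA_SHIFT = 1.5


def dpmo_from_accuracy(accuracy: float) -> float:
    """Defects per million opportunities, assuming one opportunity per output."""
    return (1.0 - accuracy) * 1_000_000


def sigma_level(accuracy: float, shift: float = SIGMA_SHIFT) -> float:
    """Sigma level for a yield strictly between 0 and 1."""
    return NormalDist().inv_cdf(accuracy) + shift


for acc in (0.88, 0.95, 0.99379):
    print(f"{acc:.3%} accurate -> {dpmo_from_accuracy(acc):,.0f} DPMO, "
          f"{sigma_level(acc):.2f} sigma")
```

Under these assumptions, 95% accuracy corresponds to 50,000 DPMO, a little above 3 sigma, while 4 sigma requires roughly 99.38% accuracy.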

The governance gap is: "We do not have a structured, measurable quality target for the agent. We assume the agent is good enough because humans reviewed it and thought it was fine. But we cannot quantify how good is 'good enough.'"

How It Materializes

A payments company deploys an agentic dispute resolution system to process customer disputes about transactions. The agent is designed to: (1) receive a dispute, (2) gather evidence from transaction logs and chargeback networks, (3) assess the likelihood that the customer's claim is valid, and (4) recommend a resolution (credit the customer's account, deny the dispute, or escalate for manual review).

The company's quality team tests the agent on a sample of 100 disputes with known outcomes (disputes that were previously resolved by human specialists). The agent's recommendations match the specialists' resolutions 88 times out of 100. The quality team concludes: "The agent is 88% accurate. This is acceptable; we will deploy it."
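A 100-dispute sample also leaves considerable uncertainty around the 88% figure. The sketch below uses the Wilson score interval, a standard method for binomial proportions, to show how wide the plausible range is; the choice of interval and the 95% confidence level are assumptions for illustration, not something the quality team in this scenario computed.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, trials: int, confidence: float = 0.95):
    """Wilson score interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return centre - half, centre + half


# 88 matches out of 100 test disputes, as in the scenario above.
low, high = wilson_interval(88, 100)
print(f"Observed 88/100; 95% interval for true accuracy: {low:.1%} to {high:.1%}")
```

With 100 samples the interval spans roughly 80% to 93%, so a single pre-deployment test cannot establish the agent's quality level with any precision.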

The company deploys the agent to production. The agent processes thousands of disputes per month. But the company has no ongoing quality measurement. The quality team does not track how often the agent's recommendations match the specialists' prior recommendations, or how often customers challenge the agent's resolutions.

After 6 months, a regulatory audit occurs. The regulator asks: "What is the sigma level of your dispute resolution process?" The company does not know. The regulator asks: "How many defects per million opportunities does your agent produce?" The company does not know. The regulator asks: "How do you know the agent is operating at an acceptable quality level?" The company responds: "We tested it and got 88% accuracy."

The regulator is not satisfied. Under consumer protection regulations, disputes must be resolved fairly and accurately. An 88% accuracy rate (120,000 DPMO, approximately 2.7 sigma) means that roughly 1 in 8 disputes may be resolved incorrectly. The regulator cites the company for inadequate quality measurement and inadequate controls.

The company is required to implement sigma-level quality measurement, establish a quality target, and measure against that target. Retroactive review of the agent's prior resolutions reveals that approximately 12% were potentially incorrect. The company must contact thousands of customers to inform them of potential errors and offer remediation.

DAMAGE Score Breakdown

Dimension	Score	Rationale
D - Detectability	5	If measurement does not exist, the gap is not detectable until a regulator or audit uncovers it.
A - Autonomy Sensitivity	4	Measurement absence is particularly acute for autonomous agents that operate without human oversight.
M - Multiplicative Potential	4	Without measurement, quality degradation can accumulate undetected.
A - Attack Surface	5	Any agent without explicit sigma measurement is exposed. Most agents fall into this category.
G - Governance Gap	5	This is the core governance gap: agent quality is not measured using the same rigor as business processes.
E - Enterprise Impact	4	Operating without measurement means the organization does not know whether the agent is operating acceptably, leading to regulatory violations and customer harm.
Composite DAMAGE Score	3.6	High. Requires immediate implementation of sigma-level quality measurement for all agent deployments.

Agent Impact Profile

How severity changes across the agent architecture spectrum.

Agent Type	Impact	How This Risk Manifests
Digital Assistant	Low	Humans review outputs and notice quality issues.
Digital Apprentice	Medium	Limited autonomy; quality issues are bounded.
Autonomous Agent	Critical	Autonomous decisions with no sigma measurement.
Delegating Agent	Critical	Dynamic invocation with no end-to-end sigma measurement.
Agent Crew / Pipeline	Critical	No measurement of pipeline sigma.
Agent Mesh / Swarm	Critical	Distributed operation with no coordinated quality measurement.
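For pipelines and crews, end-to-end quality is typically lower than that of any single stage. One common way to express this is rolled throughput yield, the product of the stage yields. The sketch below assumes the stages fail independently; the stage names and yield values are hypothetical and chosen only for illustration.

```python
from math import prod
from statistics import NormalDist


def rolled_throughput_yield(stage_yields) -> float:
    """End-to-end yield, assuming stages fail independently."""
    return prod(stage_yields)


def sigma_level(yield_fraction: float, shift: float = 1.5) -> float:
    """Sigma level with the conventional 1.5-sigma shift (an assumption)."""
    return NormalDist().inv_cdf(yield_fraction) + shift


# Hypothetical per-stage yields for a four-step agent pipeline.
stages = {
    "intake": 0.99,
    "evidence_gathering": 0.97,
    "assessment": 0.95,
    "recommendation": 0.98,
}

rty = rolled_throughput_yield(stages.values())
print(f"Pipeline yield: {rty:.2%}, about {sigma_level(rty):.2f} sigma")
```

Even though every hypothetical stage is at least 95% accurate, the pipeline as a whole yields under 90%, which is why measuring each agent in isolation is not sufficient.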

Regulatory Framework Mapping

Framework	Coverage	Citation	What It Addresses	What It Misses
NIST AI RMF 1.0	Addressed	Performance monitoring and measurement of AI systems	AI system measurement and monitoring.	Specific requirement for sigma-level quality measurement.
ISO/IEC 42001	Partial	Clause 9.1, Monitoring, measurement, analysis and evaluation	Performance measurement.	Sigma methodology and DPMO measurement.
Dodd-Frank Section 165	Addressed	Effective risk management and controls	Risk management effectiveness.	Measurement of AI system quality.
GLBA Section 501	Addressed	Safeguards and security of customer information and operations	Operational safeguards.	Measurement of automated decision quality.
GDPR Article 22	Addressed	Right to explanation and oversight of automated decisions	Oversight and explanation.	Measurement of automated decision quality.

Why This Matters in Regulated Industries

Regulators increasingly expect institutions to apply the same governance rigor to AI systems as they do to other critical processes. If an institution measures its transaction processing against a 6 sigma target, it should also measure its AI systems in sigma terms. If it does not, regulators interpret this as inadequate governance.

The regulatory expectation is clear: any system that makes decisions affecting customers must have a defined quality target and measured performance against that target. Measurement absence means the institution is flying blind.

Controls & Mitigations

Design-Time Controls

Establish a quality target (sigma level) for every agent before deployment. Define the target based on the criticality of the agent's decisions and the risk tolerance of the organization. A minimum of 4 sigma (about 6,210 DPMO) is recommended for autonomous decisions in regulated contexts.
Implement measurement as a prerequisite for deployment: do not deploy an agent until you can measure its performance against the quality target.
Define success criteria for the deployment: e.g., "The agent must maintain 4 sigma quality for 90 days before we consider it production-ready."
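A success criterion such as the 90-day example above can be expressed as an explicit deployment gate. The sketch below is one possible reading, assuming daily counts of reviewed decisions and defects are available and that "maintaining 4 sigma" means the pooled DPMO over the trailing window stays at or below the target. The `DailyQuality` record and `production_ready` function are hypothetical names.

```python
from dataclasses import dataclass
from statistics import NormalDist


@dataclass(frozen=True)
class DailyQuality:
    opportunities: int  # decisions reviewed that day
    defects: int        # decisions judged incorrect


def target_dpmo(target_sigma: float, shift: float = 1.5) -> float:
    """Maximum DPMO allowed at a sigma level (1.5-sigma shift assumed)."""
    return (1.0 - NormalDist().cdf(target_sigma - shift)) * 1_000_000


def production_ready(history, target_sigma: float = 4.0, required_days: int = 90) -> bool:
    """True if the trailing window pools to at most the target DPMO."""
    if len(history) < required_days:
        return False  # not enough evidence yet
    window = history[-required_days:]
    opportunities = sum(day.opportunities for day in window)
    defects = sum(day.defects for day in window)
    if opportunities == 0:
        return False  # nothing was measured, so the gate cannot pass
    return defects / opportunities * 1_000_000 <= target_dpmo(target_sigma)


# Hypothetical history: 90 days, 500 reviewed decisions and 2 defects per day.
history = [DailyQuality(opportunities=500, defects=2) for _ in range(90)]
print(production_ready(history))
```

Pooling over the window is a design choice; a stricter gate could require every individual day to meet the target, at the cost of more noise on low-volume days.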

Runtime Controls

Deploy quality measurement systems for every agent in production. Measure quality using DPMO, first-pass yield, accuracy, and other sigma metrics. Make these metrics visible to the agent governance team.
Establish automated alerts for quality degradation: if an agent's sigma level drops below the target, escalate immediately and reduce agent autonomy pending investigation.
Implement regular quality reviews (weekly or monthly): measure the sigma level of each agent, compare to target, and identify agents that are degrading.
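The automated alert described above can be sketched as a rolling-window monitor. This is an illustrative sketch only: it assumes each agent decision is eventually labeled as a defect or not (for example by sampled human review), and the `SigmaMonitor` class, window size, and response actions are hypothetical.

```python
from collections import deque
from statistics import NormalDist


class SigmaMonitor:
    """Tracks the defect rate over the most recent `window` labeled decisions."""

    def __init__(self, target_sigma: float = 4.0, window: int = 1000, shift: float = 1.5):
        # Highest defect rate that still meets the target (1.5-sigma shift assumed).
        self.max_defect_rate = 1.0 - NormalDist().cdf(target_sigma - shift)
        self.outcomes = deque(maxlen=window)
        self.defects = 0

    def record(self, is_defect: bool) -> bool:
        """Record one outcome; return True if the full window breaches the target."""
        if len(self.outcomes) == self.outcomes.maxlen:
            self.defects -= self.outcomes[0]  # oldest outcome is about to be evicted
        self.outcomes.append(bool(is_defect))
        self.defects += bool(is_defect)
        if len(self.outcomes) < self.outcomes.maxlen:
            return False  # wait for a full window before alerting
        return self.defects / len(self.outcomes) > self.max_defect_rate


monitor = SigmaMonitor(target_sigma=4.0, window=1000)
for i in range(1000):
    breached = monitor.record(is_defect=(i % 100 == 0))  # hypothetical 1% defect rate
    if breached:
        # Hypothetical response: escalate and reduce agent autonomy pending review.
        print("Sigma target breached; escalate and reduce autonomy")
```

The monitor only reports the breach; how escalation and autonomy reduction are carried out depends on the organization's own tooling.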

Detection & Response

Maintain a quality dashboard visible to all stakeholders (operations, compliance, executive leadership). The dashboard shows each agent's sigma level and whether it meets the target.
When quality degrades, implement root cause analysis: has the agent's model drifted? Has input data quality degraded? Has the agent's reasoning become unstable?
Conduct quarterly quality audits: independently verify the sigma measurement methodology. Ensure that measurement is rigorous and not biased.

Address This Risk in Your Institution

Measurement Absence is a foundational governance gap that undermines all other quality controls. Our advisory engagements are purpose-built for banks, insurers, and financial institutions subject to prudential oversight.
