R-QM-02 Quality & Measurement DAMAGE 3.5 / High

Process Sigma Degradation

Agent's output repeatability falls below acceptable threshold without detection. Same inputs produce different outputs with increasing frequency.

The Risk

A process operating at 4 sigma is consistent and repeatable: given the same input, it produces the same output more than 99% of the time (roughly 99.4% yield under the conventional 1.5-sigma shift). When a process degrades (e.g., to 3 sigma), the same input increasingly produces different outputs. This degradation might indicate that the process has drifted, that its assumptions are no longer valid, or that external conditions have changed.

Agents, implemented with large language models or other stochastic components, have inherent process variation. A query to an agent might produce slightly different outputs each time (due to model sampling, temperature settings, etc.). This variation is controlled within a tolerance; an agent at 4 sigma produces the same output >99% of the time despite slight differences in reasoning.

Process sigma degradation occurs when the agent's output variation increases beyond the controlled tolerance. The same input produces different outputs with increasing frequency. This might indicate model drift, that the agent's reasoning is becoming unstable, that the agent's training or context is out of date, or that external conditions have changed. Detection requires explicit monitoring of process repeatability.
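The mapping between an observed consistency rate and a sigma level can be sketched with the standard normal quantile function. A minimal Python example, assuming the conventional 1.5-sigma long-term shift used in Six Sigma practice (the function name is illustrative, not part of any standard library):

```python
from statistics import NormalDist

def consistency_to_sigma(consistency: float, shift: float = 1.5) -> float:
    """Convert an observed consistency rate (0..1) to a process sigma level,
    using the conventional 1.5-sigma long-term shift."""
    if not 0.0 < consistency < 1.0:
        raise ValueError("consistency must be strictly between 0 and 1")
    return NormalDist().inv_cdf(consistency) + shift

# Under this convention, ~99.4% consistency maps to ~4 sigma,
# 97.2% to ~3.4 sigma, and 95% to ~3.1 sigma.
```

These values line up with the thresholds used in the scenario below: 97.2% consistency falls below the 4 sigma target, and a 95% alert line sits near 3 sigma.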

How It Materializes

A financial services firm's agentic credit recommendation system is designed to score loan applications consistently. The agent receives an application and returns a credit recommendation (approve, decline, or refer for manual review) with a numeric confidence score. The agent is trained to achieve 99% consistency: identical inputs produce the same output recommendation with >99% probability.

For the first six months of deployment, the agent achieves 99.2% consistency. The firm monitors the consistency metric quarterly. In the second quarterly review (month 6), consistency is still 99.1%. In the third quarterly review (month 9), consistency has dropped to 98.8%. By month 12, consistency is 97.2% (dropping below the 4 sigma threshold into 3.5 sigma territory).

The firm's quality team does not notice this degradation because the monitoring dashboard tracks only the percent of consistent outputs, which is still "very high." The team does not receive an alert until consistency drops below 95% (which would be 3 sigma).

During month 12, the agent recommends approval for a risky loan. An identical loan application submitted the next day receives a decline recommendation. Both applications had the same income, credit score, debt-to-income ratio, and employment history. The only difference was the time of day they were processed. The human underwriter catches this inconsistency during a quality review and investigates.

The investigation reveals that the agent's model has drifted due to data distribution shift (the portfolio is skewing toward lower-income applicants, which the model was not optimized for). Additionally, the agent's prompt has been modified slightly (an engineer added context about the current market conditions), which changed the agent's reasoning.

Under fair lending regulations, loan approval decisions must be based on consistent evaluation of applicants. If an agent makes inconsistent recommendations (approve vs. decline for equivalent applications), this could indicate fair lending violations if the inconsistency correlates with protected characteristics.

DAMAGE Score Breakdown

| Dimension | Score | Rationale |
| --- | --- | --- |
| D - Detectability | 4 | Process sigma degradation is detectable through monitoring of output consistency, but only if organizations explicitly track repeatability. Standard monitoring focuses on output correctness, not consistency. |
| A - Autonomy Sensitivity | 4 | The risk manifests in autonomous agents whose decisions are not human-reviewed for consistency. |
| M - Multiplicative Potential | 3 | Degradation compounds across process steps: if both the agent's reasoning and its recommendation step lose repeatability, the combined consistency is lower than either alone. |
| A - Attack Surface | 4 | Any agent using stochastic models or generative components is exposed. As LLM-based agents become common, the surface expands. |
| G - Governance Gap | 4 | Standard quality frameworks focus on accuracy (are outputs correct?) but not on consistency (are identical inputs producing identical outputs?). Agent governance often lacks process sigma monitoring. |
| E - Enterprise Impact | 4 | Degraded process consistency results in inconsistent decisions, which can violate fair lending laws and create customer disputes. |
| Composite DAMAGE Score | 3.5 | High. Requires explicit process sigma monitoring and consistency thresholds for all agent deployments. |

Agent Impact Profile

How severity changes across the agent architecture spectrum.

| Agent Type | Impact | How This Risk Manifests |
| --- | --- | --- |
| Digital Assistant | Low | Humans notice inconsistencies and compensate. |
| Digital Apprentice | Medium | Limited autonomy; inconsistent outputs are reviewed by humans. |
| Autonomous Agent | High | Autonomous decisions with degraded consistency go undetected. |
| Delegating Agent | High | Inconsistent tool invocations compound degradation. |
| Agent Crew / Pipeline | Critical | Inconsistency in one agent propagates through the pipeline. |
| Agent Mesh / Swarm | Critical | Peer-to-peer inconsistencies create unpredictable behavior. |

Regulatory Framework Mapping

| Framework | Coverage | Citation | What It Addresses | What It Misses |
| --- | --- | --- | --- | --- |
| Fair Lending Laws (FHA, ECOA) | Addressed | Consistent and fair evaluation of credit applicants | Consistency in credit decisions. | Agent consistency degradation and fair lending impact. |
| SR 11-7 | Addressed | Model validation and performance monitoring | Model performance; revalidation. | Agent process sigma and consistency monitoring. |
| NIST AI RMF 1.0 | Partial | Performance and monitoring of AI systems | Performance monitoring. | Process sigma and consistency metrics. |
| Dodd-Frank Section 1681 | Addressed | Consistent treatment of consumers in credit decisions | Consistency in decision-making. | Agent consistency degradation. |

Why This Matters in Regulated Industries

In regulated industries, consistency is a proxy for fairness and compliance. When an institution makes consistent decisions (even if they are not always correct), it demonstrates that it is applying a coherent policy. When decisions become inconsistent, regulators assume the institution has lost control of its decision-making process.

Fair lending regulations in particular emphasize consistency. If two applicants with identical qualifications receive different credit decisions, regulators ask: "How can the institution justify this difference?" If the institution cannot provide a consistent policy, it implies that the decision-making is arbitrary or discriminatory.

Controls & Mitigations

Design-Time Controls

  • Define a process sigma target for each agent and each decision type. Recommend minimum 4 sigma (99%+ consistency) for autonomous decisions, 3.5 sigma for human-supervised decisions.
  • Implement repeatability testing before agent deployment: run the agent multiple times on each input in a fixed test set (e.g., 100 representative inputs) and measure the percentage of runs whose outputs are identical.
  • Document the stochastic components of the agent (model temperature, sampling strategy, random initialization) and establish controlled values. Ensure that agent behavior is repeatable given controlled settings.
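The repeatability test described above can be sketched as a small harness. This is a hypothetical interface — it assumes the agent is exposed as a callable that maps an input to a hashable output:

```python
from collections import Counter

def repeatability_score(agent, inputs, runs: int = 5) -> float:
    """Run `agent` `runs` times on each fixed test input and return the
    fraction of runs that match the modal (most common) output for that
    input. 1.0 means perfectly repeatable."""
    consistent = 0
    total = 0
    for x in inputs:
        outputs = [agent(x) for _ in range(runs)]
        _, modal_count = Counter(outputs).most_common(1)[0]
        consistent += modal_count
        total += runs
    return consistent / total
```

A fully deterministic agent scores 1.0; the pre-deployment gate would compare this score against the sigma target (e.g., require ≥ 0.99 for autonomous decisions).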

Runtime Controls

  • Deploy process sigma monitoring: periodically (weekly), sample the agent's inputs and measure the consistency of outputs. Track the consistency trend and alert if consistency drops below the target threshold.
  • Implement a repeatability baseline: store a sample of agent outputs for high-volume decision types. Periodically re-process the same inputs and compare new outputs to baseline.
  • For agents using LLMs or generative models, implement temperature controls and seed management: use fixed seeds and low, controlled temperatures in production to improve reproducibility. Note that fixed seeds and temperature 0 reduce but do not eliminate nondeterminism in many LLM serving stacks, so repeatability must still be measured, not assumed.
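The baseline re-processing check above can be sketched as a simple comparison job. The `consistency_check` function and the dict-based baseline store are illustrative assumptions, not a prescribed interface:

```python
def consistency_check(agent, baseline: dict, threshold: float = 0.99):
    """Re-run `agent` on stored baseline inputs and compare against the
    previously recorded outputs. `baseline` maps input -> recorded output.
    Returns (consistency_rate, alert), where alert is True when the rate
    falls below the target threshold."""
    matches = sum(1 for x, recorded in baseline.items() if agent(x) == recorded)
    rate = matches / len(baseline)
    return rate, rate < threshold
```

Run on a schedule (e.g., weekly), the returned rate feeds the consistency trend, and the alert flag drives escalation when the rate crosses the sigma target.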

Detection & Response

  • Maintain a process sigma dashboard visible to the quality and agent governance teams. Escalate immediately if consistency drops below the target.
  • When process sigma degradation is detected, investigate the root cause: has the model drifted? Has the agent's prompt been modified? Have external conditions changed?
  • Implement a consistency revalidation requirement: any time the agent's code, model, or prompt is modified, repeat the repeatability test. Do not deploy changes if consistency degrades.
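The revalidation requirement above reduces to a small pre-deployment gate. The thresholds and return format here are illustrative assumptions, not firm guidance:

```python
def gate_deployment(baseline_consistency: float,
                    candidate_consistency: float,
                    target: float = 0.99):
    """Block a code/model/prompt change if the candidate's measured
    repeatability is below the target or regresses from the current
    baseline. Returns (allowed, reason)."""
    if candidate_consistency < target:
        return False, "below target"
    if candidate_consistency < baseline_consistency:
        return False, "regression vs baseline"
    return True, "ok"
```

Wired into CI, this makes the repeatability test a release criterion rather than an after-the-fact dashboard metric.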

Address This Risk in Your Institution

Process Sigma Degradation requires explicit consistency monitoring that goes beyond standard accuracy metrics. Our advisory engagements are purpose-built for banks, insurers, and financial institutions subject to prudential oversight.

Schedule a Briefing