An agent's output repeatability falls below an acceptable threshold without detection. The same inputs produce different outputs with increasing frequency.
A process operating at 4 sigma is consistent and repeatable. Given the same input, the process produces the same output >99% of the time. When a process degrades (e.g., to 3 sigma), the same input increasingly produces different outputs. This degradation might indicate that the process has drifted, that the process's assumptions are no longer valid, or that external conditions have changed.
Agents, implemented with large language models or other stochastic components, have inherent process variation. A query to an agent might produce slightly different outputs each time (due to model sampling, temperature settings, etc.). This variation is controlled within a tolerance; an agent at 4 sigma produces the same output >99% of the time despite slight differences in reasoning.
Process sigma degradation occurs when the agent's output variation increases beyond the controlled tolerance. The same input produces different outputs with increasing frequency. This might indicate model drift, that the agent's reasoning is becoming unstable, that the agent's training or context is out of date, or that external conditions have changed. Detection requires explicit monitoring of process repeatability.
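Explicit repeatability monitoring can be sketched as a simple probe: replay a fixed canonical input against the agent N times, measure how often the modal output recurs, and convert that rate to an approximate sigma level. A minimal sketch, assuming this probe design (function names are illustrative; the 1.5-sigma shift is the conventional Six Sigma adjustment, under which a 99.2% rate lands near 4 sigma and 97.2% near 3.5 sigma, consistent with the figures used below):

```python
from collections import Counter
from statistics import NormalDist


def consistency_rate(outputs):
    """Fraction of replayed runs matching the modal output.

    `outputs` is the list of outputs produced by replaying one
    canonical input N times against the agent.
    """
    modal_count = Counter(outputs).most_common(1)[0][1]
    return modal_count / len(outputs)


def sigma_level(rate):
    """Approximate process sigma from a consistency rate, using the
    conventional 1.5-sigma shift."""
    rate = min(rate, 1 - 1e-12)  # avoid an infinite z-score at a perfect rate
    return NormalDist().inv_cdf(rate) + 1.5
```

The probe must hold everything constant except the agent itself (same prompt, same tool versions), so that any drop in `consistency_rate` is attributable to the agent's own process variation.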
A financial services firm's agentic credit recommendation system is designed to score loan applications consistently. The agent receives an application and returns a credit recommendation (approve, decline, or refer for manual review) with a numeric confidence score. The agent is trained to achieve 99% consistency: identical inputs produce the same output recommendation with >99% probability.
For the first six months of deployment, the agent achieves 99.2% consistency. The firm monitors the consistency metric quarterly. In the second quarterly review (month 6), consistency is still 99.1%. In the third quarterly review (month 9), consistency has dropped to 98.8%. By month 12, consistency is 97.2% (dropping below the 4 sigma threshold into 3.5 sigma territory).
The firm's quality team does not notice this degradation because the monitoring dashboard tracks only the percentage of consistent outputs, which still looks "very high." The team receives no alert until consistency drops below 95% (roughly 3 sigma).
During month 12, the agent recommends approval for a risky loan. An identical loan application submitted the next day receives a decline recommendation. Both applications had the same income, credit score, debt-to-income ratio, and employment history. The only difference was the time of day they were processed. The human underwriter catches this inconsistency during a quality review and investigates.
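The underwriter's manual check can be automated: canonicalize the underwriting-relevant fields, key each application by a hash of them, and flag any key that receives more than one distinct recommendation. A minimal sketch, assuming hypothetical field names (`income`, `credit_score`, `dti`, `employment_years`):

```python
import hashlib
import json


def application_key(app, fields=("income", "credit_score", "dti", "employment_years")):
    """Canonical key over the underwriting-relevant fields (names are illustrative)."""
    canonical = json.dumps({f: app[f] for f in fields}, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()


def find_inconsistencies(decisions):
    """Given (application, recommendation) pairs, return the canonical keys
    that received more than one distinct recommendation."""
    seen = {}
    for app, recommendation in decisions:
        seen.setdefault(application_key(app), set()).add(recommendation)
    return {key: recs for key, recs in seen.items() if len(recs) > 1}
```

Run over a rolling window of decisions, this surfaces approve-vs-decline splits on equivalent applications without waiting for a human quality review to stumble on one.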
The investigation reveals that the agent's model has drifted due to data distribution shift (the portfolio is skewing toward lower-income applicants, which the model was not optimized for). Additionally, the agent's prompt has been modified slightly (an engineer added context about the current market conditions), which changed the agent's reasoning.
Under fair lending regulations, loan approval decisions must be based on consistent evaluation of applicants. If an agent makes inconsistent recommendations (approve vs. decline for equivalent applications), this could indicate fair lending violations if the inconsistency correlates with protected characteristics.
| Dimension | Score | Rationale |
|---|---|---|
| D - Detectability | 4 | Process sigma degradation is detectable through monitoring of output consistency, but only if organizations explicitly track repeatability. Standard monitoring focuses on output correctness, not consistency. |
| A - Autonomy Sensitivity | 4 | The risk manifests in autonomous agents whose decisions are not human-reviewed for consistency. |
| M - Multiplicative Potential | 3 | Degradation compounds across process steps: if both the agent's reasoning and its recommendation step drift, the combined inconsistency is worse than either alone. |
| A - Attack Surface | 4 | Any agent using stochastic models or generative components is exposed. As LLM-based agents become common, the surface expands. |
| G - Governance Gap | 4 | Standard quality frameworks focus on accuracy (are outputs correct?) but not on consistency (are identical inputs producing identical outputs?). Agent governance often lacks process sigma monitoring. |
| E - Enterprise Impact | 4 | Degraded process consistency results in inconsistent decisions, which can violate fair lending laws and create customer disputes. |
| Composite DAMAGE Score | 3.5 | High. Requires explicit process sigma monitoring and consistency thresholds for all agent deployments. |
How severity changes across the agent architecture spectrum.
| Agent Type | Impact | How This Risk Manifests |
|---|---|---|
| Digital Assistant | Low | Humans notice inconsistencies and compensate. |
| Digital Apprentice | Medium | Limited autonomy; inconsistent outputs are reviewed by humans. |
| Autonomous Agent | High | Autonomous decisions with degraded consistency go undetected. |
| Delegating Agent | High | Inconsistent tool invocations compound degradation. |
| Agent Crew / Pipeline | Critical | Inconsistency in one agent propagates through the pipeline. |
| Agent Mesh / Swarm | Critical | Peer-to-peer inconsistencies create unpredictable behavior. |
| Framework | Coverage | Scope | What It Addresses | What It Misses |
|---|---|---|---|---|
| Fair Lending Laws (FHA, ECOA) | Addressed | Consistent and fair evaluation of credit applicants | Consistency in credit decisions. | Agent consistency degradation and fair lending impact. |
| SR 11-7 | Addressed | Model validation and performance monitoring | Model performance; revalidation. | Agent process sigma and consistency monitoring. |
| NIST AI RMF 1.0 | Partial | Performance and monitoring of AI systems | Performance monitoring. | Process sigma and consistency metrics. |
| Dodd-Frank Section 1681 | Addressed | Consistent treatment of consumers in credit decisions | Consistency in decision-making. | Agent consistency degradation. |
In regulated industries, consistency is a proxy for fairness and compliance. When an institution makes consistent decisions (even if they are not always correct), it demonstrates that it is applying a coherent policy. When decisions become inconsistent, regulators assume the institution has lost control of its decision-making process.
Fair lending regulations in particular emphasize consistency. If two applicants with identical qualifications receive different credit decisions, regulators ask: "How can the institution justify this difference?" If the institution cannot provide a consistent policy, it implies that the decision-making is arbitrary or discriminatory.
Process Sigma Degradation requires explicit consistency monitoring that goes beyond standard accuracy metrics. Our advisory engagements are purpose-built for banks, insurers, and financial institutions subject to prudential oversight.
Schedule a Briefing