Different outputs for identical inputs. Regulatory expectations for consistent treatment assume deterministic processing.
Language models are non-deterministic: identical inputs can produce different outputs on different inference runs. Even with temperature set to 0 (nominally deterministic), some model implementations produce slightly different outputs due to floating-point arithmetic, hardware variations, or inference library randomness. With temperature above 0 (normal operation), non-determinism is pronounced: the same prompt produces meaningfully different outputs.
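The two sources of variance described above can be illustrated with a minimal sketch. It is not any vendor's decoding code: the `sample_token` helper and the three logit values are hypothetical, and the sampler is a plain softmax over a toy vocabulary. It shows that greedy selection at temperature 0 always picks the same token, that sampling at temperature 1 does not, and that floating-point addition depends on evaluation order, which is one reason temperature 0 may still vary across hardware.

```python
import math
import random

def sample_token(logits, temperature, rng):
    """Pick a token index from raw logits at the given temperature (toy sampler)."""
    if temperature == 0:
        # Greedy decoding: always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

logits = [2.0, 1.5, 0.3]  # hypothetical scores for three candidate tokens
rng = random.Random()     # unseeded, like an unpinned inference run

greedy = {sample_token(logits, 0, rng) for _ in range(200)}
sampled = {sample_token(logits, 1.0, rng) for _ in range(200)}
print("distinct tokens at temperature 0:", len(greedy))
print("distinct tokens at temperature 1:", len(sampled))

# Floating-point addition is not associative, so a different reduction
# order on different hardware can shift logits slightly.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False
```

In a real model the floating-point differences are tiny, but when two candidate tokens have nearly equal scores they can be enough to flip the greedy choice.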
This non-determinism is incompatible with regulatory expectations for consistent treatment. Financial regulation requires that identical situations be treated identically. Two customers with identical credit profiles should receive identical credit decisions. Regulators assume that if decisions are made consistently, bias is controlled. Non-determinism breaks this assumption: identical situations produce different decisions.
Non-determinism is also incompatible with audit and reproducibility expectations. Regulators expect to be able to examine a decision, understand the reasoning, and reproduce the decision for identical inputs. Without additional controls, non-determinism makes that reproduction impossible: re-running the model does not guarantee the same result.
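One way to support the examination step is to record each decision at the time it is made, so that an auditor retrieves the original output instead of regenerating it. The sketch below is an assumption about how such a record might look, not a regulatory requirement or an existing schema: the `DecisionRecord` fields, the `fingerprint` helper, and the sample profile values are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, asdict

def fingerprint(payload: dict) -> str:
    """Stable SHA-256 hash of a JSON-serializable payload (key order ignored)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

@dataclass(frozen=True)
class DecisionRecord:
    """Hypothetical audit record captured when the decision is made."""
    input_hash: str
    model_version: str
    temperature: float
    seed: int
    raw_output: str
    decision: str

def record_decision(inputs, model_version, temperature, seed, raw_output, decision):
    return DecisionRecord(
        fingerprint(inputs), model_version, temperature, seed, raw_output, decision
    )

# Illustrative values only.
profile = {"income": 85000, "debt": 12000, "history_months": 96}
rec = record_decision(profile, "scorer-v1", 0.0, 1234, "score=720", "approved")

# The same profile with keys in a different order maps to the same hash,
# so the stored decision can be looked up for any identical input.
reordered = {"debt": 12000, "history_months": 96, "income": 85000}
print(rec.input_hash == fingerprint(reordered))
print(json.dumps(asdict(rec), indent=2))
```

A record like this does not make the model deterministic. It only ensures that the decision actually issued, and the settings that produced it, remain available for review.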
A bank uses an agent to score customer credit applications. The agent's scoring process is non-deterministic because its temperature is set above 0. Two customers with identical credit profiles apply on the same day. Customer A is scored 720 (approved). Customer B is scored 690 (declined). The profiles are identical: same income, same debt, same credit history, same collateral.
Customer B requests an explanation for the decline. The bank reviews the decision and discovers that Customer B's profile, identical to Customer A's, was scored differently because of non-deterministic model output. The bank re-runs the agent's scoring for Customer B. This time, the score is 715 (approved), which differs from the first result.
Customer B escalates to the regulator, claiming discriminatory treatment. The regulator investigates and discovers that the bank's credit scoring is non-deterministic: identical inputs produce different outputs. Concerned that such decisions are unfair, the regulator issues a finding that the bank's credit process has inadequate consistency controls.
The bank must modify the agent to enforce determinism, for example by setting temperature to 0 and using deterministic (greedy) sampling. At temperature 0, however, the model's outputs become less diverse and sometimes less natural, so the bank must redesign the process to maintain fairness while preserving output quality.
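Beyond sampling settings, one possible process-level control is to store the first decision for each distinct profile and return it for every later identical profile. The sketch below assumes this design for illustration only. It is not the bank's actual remediation, `ConsistentScorer` and `noisy_score` are hypothetical names, and the in-memory dictionary stands in for a durable store.

```python
import hashlib
import json
import random

class ConsistentScorer:
    """Wraps a scorer so that identical profiles always get the same score."""

    def __init__(self, score_fn):
        self._score_fn = score_fn
        self._store = {}  # in-memory stand-in for a durable decision store

    def _key(self, profile):
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps(profile, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def score(self, profile):
        key = self._key(profile)
        if key not in self._store:
            # Only the first request for a profile reaches the model.
            self._store[key] = self._score_fn(profile)
        return self._store[key]

def noisy_score(profile):
    """Hypothetical stand-in for a non-deterministic scoring agent."""
    return 705 + random.randint(-15, 15)

scorer = ConsistentScorer(noisy_score)
customer_a = {"income": 85000, "debt": 12000, "history_months": 96}
customer_b = dict(customer_a)  # identical profile, separate application

print(scorer.score(customer_a) == scorer.score(customer_b))  # True
```

This approach only covers exactly identical inputs. Profiles that differ trivially would still reach the model separately, so it complements deterministic decoding and does not replace it.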
| Dimension | Score | Rationale |
|---|---|---|
| D - Detectability | 2 | Non-determinism is apparent through simple testing: run the same input twice and compare outputs. Easy to detect. |
| A - Autonomy Sensitivity | 1 | Non-determinism is inherent to LLM architecture; not dependent on autonomy. |
| M - Multiplicative Potential | 4 | Every decision the model makes is potentially non-deterministic. Affects all agent uses. |
| A - Attack Surface | 3 | Non-determinism is structural but could be exploited if an adversary can observe multiple inferences and reverse-engineer favorable outputs. |
| G - Governance Gap | 5 | Regulatory frameworks assume deterministic decision-making. Non-determinism violates fundamental regulatory expectations. |
| E - Enterprise Impact | 3 | Fairness concerns, regulatory findings, process redesign required, but typically resolvable through determinism enforcement. |
| Composite DAMAGE Score | 3.7 | High. Requires priority attention and dedicated controls. |
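
The detectability test in the table (run the same input repeatedly and compare outputs) can be sketched as a small harness. This is an illustrative assumption, not a prescribed validation procedure: `variance_report`, `noisy_score`, and `fixed_score` are hypothetical names, and the scorers are stand-ins for a real agent call.

```python
import random
from collections import Counter

def variance_report(score_fn, profile, runs=20):
    """Call the scorer repeatedly on one input and summarize the outputs."""
    outputs = [score_fn(profile) for _ in range(runs)]
    counts = Counter(outputs)
    return {
        "runs": runs,
        "distinct_outputs": len(counts),
        "spread": max(outputs) - min(outputs),  # numeric scores assumed
        "consistent": len(counts) == 1,
    }

def noisy_score(profile):
    """Hypothetical non-deterministic scorer."""
    return 705 + random.randint(-15, 15)

def fixed_score(profile):
    """Hypothetical deterministic scorer."""
    return 705

profile = {"income": 85000, "debt": 12000, "history_months": 96}
print(variance_report(noisy_score, profile, runs=50))
print(variance_report(fixed_score, profile, runs=50))
```

A consistent result over a finite number of runs does not prove determinism. It only fails to find variance, so the run count and the choice of test inputs matter.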
How severity changes across the agent architecture spectrum.
| Agent Type | Impact | How This Risk Manifests |
|---|---|---|
| Digital Assistant | Moderate | Human may notice inconsistent outputs and question reliability. |
| Digital Apprentice | Moderate | Agent produces inconsistent recommendations; apparent unreliability. |
| Autonomous Agent | High | Autonomous agent produces inconsistent decisions affecting business operations. |
| Delegating Agent | High | Agent's delegations produce inconsistent outcomes. Downstream systems receive inconsistent recommendations. |
| Agent Crew / Pipeline | Critical | Multiple agents with non-deterministic outputs compound inconsistency through the pipeline. |
| Agent Mesh / Swarm | Critical | Peer-to-peer agent network with non-deterministic behavior. Systemic inconsistency. |
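
The compounding effect in multi-agent architectures can be made concrete with a simplified model. The sketch assumes that each stage independently reproduces its previous output with some probability, and that any changed stage changes the final result. Both are simplifying assumptions, and the 0.95 figure is illustrative, not measured.

```python
import math

def pipeline_repeatability(stage_repeatability):
    """Probability that every stage reproduces its output, assuming
    stages vary independently (a simplifying assumption)."""
    return math.prod(stage_repeatability)

# Illustrative figure: each stage repeats its output 95% of the time.
for stages in (1, 3, 5, 10):
    p = pipeline_repeatability([0.95] * stages)
    print(f"{stages:>2} stages: {p:.3f}")
```

Under these assumptions a five-stage pipeline reproduces its end-to-end output only about 77% of the time, even though each individual stage looks highly repeatable.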
| Framework | Coverage | Citation | What It Addresses | What It Misses |
|---|---|---|---|---|
| ECOA | Partial | 15 U.S.C. 1691 | Requires consistent treatment in credit decisions. | Does not specifically address non-deterministic model outputs. |
| Fair Housing Act | Partial | 42 U.S.C. 3604 | Requires consistent treatment in housing-related decisions. | Does not address non-determinism. |
| GDPR Article 22 | Partial | Regulation (EU) 2016/679, Art. 22 | Restricts solely automated decisions with significant effects; related transparency provisions require meaningful information about the logic involved. | Does not address non-determinism or output variance. |
| FCA Handbook | Partial | COBS 2.2R | Requires fairness in customer treatment. | Does not address model non-determinism. |
| NIST AI RMF 1.0 | Partial | MAP 2.1 (Testing) | Recommends testing and validation. | Does not specifically address non-determinism testing. |
In credit, insurance, and employment decisions, consistent treatment is a legal requirement. Non-deterministic model outputs conflict with this requirement by design, because identical applicants can receive different outcomes. Regulators expect institutions to enforce determinism in consequential decisions. An institution that uses non-deterministic models for material decisions without adequate controls violates fair lending and fair treatment principles.
Additionally, non-determinism undermines institutional credibility. If an institution cannot produce consistent decisions, customers and regulators lose confidence in the institution's fairness and reliability. An institution that cannot defend its decisions to regulators because the decisions were non-deterministic faces enforcement action and reputational damage.
Non-Determinism and Output Variance requires architectural controls that go beyond what existing frameworks provide. Our advisory engagements are purpose-built for banks, insurers, and financial institutions subject to prudential oversight.