R-FM-06 Foundation Model & LLM DAMAGE 3.7 / High

Non-Determinism and Output Variance

Different outputs for identical inputs. Regulatory expectations for consistent treatment assume deterministic processing.

The Risk

Language models are non-deterministic by default: identical inputs can produce different outputs on different inference runs. Even with temperature set to 0 (nominally deterministic greedy decoding), some model implementations produce slightly different outputs because of floating-point arithmetic, hardware variations, or randomness in the inference library. With temperature above 0 (normal operation), the non-determinism is pronounced: the same prompt can produce meaningfully different outputs.
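
Both sources of variance can be illustrated with a minimal sketch in plain Python. This is not any vendor's inference code: `sample_token`, `softmax`, and the logit values are hypothetical and chosen only to show the mechanism.

```python
import math
import random

def softmax(logits, temperature):
    """Convert logits to probabilities at the given temperature (> 0)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature, rng):
    """Pick a token index: greedy at temperature 0, sampled otherwise."""
    if temperature == 0:
        # Greedy decoding: always the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Hypothetical logits for three candidate tokens (assumption for illustration).
logits = [2.0, 1.9, 0.5]

# Temperature 0: every run picks the same token.
greedy = {sample_token(logits, 0, random.Random()) for _ in range(50)}

# Temperature 1.0 with a different random state per run: the choice varies.
sampled = {sample_token(logits, 1.0, random.Random(seed)) for seed in range(50)}

# Floating-point addition is not associative, so the order in which a
# parallel kernel accumulates values can change the low-order bits.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

print("greedy choices:", greedy)
print("sampled choices:", sampled)
print("sums equal?", left == right)
```

When two candidates score almost the same, as the first two logits do here, a low-order difference of the kind shown in the last lines can be enough to change which token greedy decoding selects. That is one plausible way temperature 0 can still yield different outputs.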

This non-determinism is incompatible with regulatory expectations for consistent treatment. Financial regulation requires that identical situations be treated identically. Two customers with identical credit profiles should receive identical credit decisions. Regulators assume that if decisions are made consistently, bias is controlled. Non-determinism breaks this assumption: identical situations produce different decisions.

Non-determinism is also incompatible with audit and reproducibility expectations. Regulators expect to be able to examine a decision, understand the reasoning, and reproduce the decision from identical inputs. If a re-run can return a different result, the institution cannot show that the reproduced decision is the one the customer actually received.

How It Materializes

A bank uses an agent to score customer credit applications. The agent's scoring is non-deterministic because its temperature is set above 0. Two customers with identical credit profiles (same income, same debt, same credit history, same collateral) apply on the same day. Customer A is scored 720 (approved). Customer B is scored 690 (declined).

Customer B requests an explanation for the decline. The bank reviews the decision and discovers that Customer B's profile, identical to Customer A's, was scored differently because of non-deterministic model output. The bank re-runs the agent's scoring for Customer B. This time the score is 715 (approved), which differs from the first run.

Customer B escalates to the regulator, claiming discriminatory treatment. The regulator investigates and discovers that the bank's credit scoring is non-deterministic: identical inputs produce different outputs. Concerned that such decisions are unfair, the regulator issues a finding that the bank's credit process has inadequate consistency controls.

The bank must modify the agent to enforce determinism (set temperature to 0, use deterministic sampling) and then verify that outputs are in fact stable, because temperature 0 alone does not guarantee identical outputs on every implementation. With temperature 0, the model's outputs also become less diverse and sometimes less natural. The bank must redesign the process to maintain fairness while preserving output quality.

DAMAGE Score Breakdown

Dimension	Score	Rationale
D - Detectability	2	Non-determinism is apparent through simple testing: run the same input twice and compare outputs. Easy to detect.
A - Autonomy Sensitivity	1	Non-determinism is inherent to LLM architecture; not dependent on autonomy.
M - Multiplicative Potential	4	Every decision the model makes is potentially non-deterministic. Affects all agent uses.
A - Attack Surface	3	Non-determinism is structural but could be exploited if an adversary can observe multiple inferences and reverse-engineer favorable outputs.
G - Governance Gap	5	Regulatory frameworks assume deterministic decision-making. Non-determinism violates fundamental regulatory expectations.
E - Enterprise Impact	3	Fairness concerns, regulatory findings, process redesign required, but typically resolvable through determinism enforcement.
Composite DAMAGE Score	3.7	High. Requires priority attention and dedicated controls.

Agent Impact Profile

How severity changes across the agent architecture spectrum.

Agent Type	Impact	How This Risk Manifests
Digital Assistant	Moderate	Human may notice inconsistent outputs and question reliability.
Digital Apprentice	Moderate	Agent produces inconsistent recommendations; apparent unreliability.
Autonomous Agent	High	Autonomous agent produces inconsistent decisions affecting business operations.
Delegating Agent	High	Agent's delegations produce inconsistent outcomes. Downstream systems receive inconsistent recommendations.
Agent Crew / Pipeline	Critical	Multiple agents with non-deterministic outputs compound inconsistency through the pipeline.
Agent Mesh / Swarm	Critical	Peer-to-peer agent network with non-deterministic behavior. Systemic inconsistency.
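
The compounding effect in multi-agent pipelines can be approximated with a simple calculation. The sketch below assumes each stage reproduces its earlier output with the same probability and that stages vary independently. Both are simplifying assumptions, and the 0.95 figure is illustrative rather than measured.

```python
def pipeline_reproducibility(per_stage: float, stages: int) -> float:
    """Probability that every stage reproduces its earlier output,
    assuming independent stages with equal per-stage reproducibility."""
    if not 0.0 <= per_stage <= 1.0:
        raise ValueError("per_stage must be a probability between 0 and 1")
    if stages < 0:
        raise ValueError("stages must be non-negative")
    return per_stage ** stages

# Illustrative per-stage value (assumption, not a measurement).
for n in (1, 3, 5, 10):
    print(n, round(pipeline_reproducibility(0.95, n), 3))
```

Under these assumptions, a stage that is individually 95% reproducible yields an end-to-end result that reproduces only about 60% of the time across ten stages, which is why the pipeline and mesh rows rate higher than the single-agent rows.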

Regulatory Framework Mapping

Framework	Coverage	Citation	What It Addresses	What It Misses
ECOA	Partial	15 U.S.C. 1691	Requires consistent treatment in credit decisions.	Does not specifically address non-deterministic model outputs.
Fair Housing Act	Partial	42 U.S.C. 3604	Requires consistent treatment in housing-related decisions.	Does not address non-determinism.
GDPR Article 22	Partial	Right to Explanation	Requires meaningful information about logic of automated decisions.	Does not address non-determinism or output variance.
FCA Handbook	Partial	COBS 2.2R	Requires fairness in customer treatment.	Does not address model non-determinism.
NIST AI RMF 1.0	Partial	MAP 2.1 (Testing)	Recommends testing and validation.	Does not specifically address non-determinism testing.

Why This Matters in Regulated Industries

In credit, insurance, and employment decisions, consistent treatment is a legal requirement. Non-deterministic model outputs violate these requirements by design. Regulators expect institutions to enforce determinism in consequential decisions. An institution that uses non-deterministic models for material decisions without adequate controls violates fair lending/treatment principles.

Additionally, non-determinism undermines institutional credibility. If an institution cannot produce consistent decisions, customers and regulators lose confidence in the institution's fairness and reliability. An institution that cannot defend its decisions to regulators because the decisions were non-deterministic faces enforcement action and reputational damage.

Controls & Mitigations

Design-Time Controls

For any agent making consequential decisions, enforce determinism: set model temperature to 0, use deterministic sampling, set random seeds explicitly.
Document determinism requirements for each agent: specify whether determinism is required, what temperature or sampling strategy is used, and why.
Test determinism: for each agent, run identical inputs 100 times, verify outputs are identical. Document determinism validation.
For agents where non-determinism is intentionally desired, implement determinism controls for critical decision points.
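
The repeated-run test described above could be sketched as follows. `check_determinism` is a hypothetical helper, and `stable_agent` and `flaky_agent` are stand-ins for a real agent call; in practice the callable would wrap whatever inference client the institution uses.

```python
import itertools
from collections import Counter
from typing import Callable

def check_determinism(agent: Callable[[str], str], prompt: str, runs: int = 100) -> dict:
    """Call the agent `runs` times with one prompt and summarize the outputs."""
    counts = Counter(agent(prompt) for _ in range(runs))
    return {
        "runs": runs,
        "distinct_outputs": len(counts),
        "deterministic": len(counts) == 1,
        "most_common": counts.most_common(3),  # evidence for the validation record
    }

# Stand-in agents for demonstration only (assumptions, not real models).
def stable_agent(prompt: str) -> str:
    return f"score=700 for {prompt}"

_alternating = itertools.cycle(["approve", "decline"])

def flaky_agent(prompt: str) -> str:
    return next(_alternating)

print(check_determinism(stable_agent, "applicant-123"))
print(check_determinism(flaky_agent, "applicant-123", runs=10))
```

The returned summary can be stored as the documented determinism validation. A passing result shows stability only for the inputs tested, so the test set should cover the decision types that matter.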

Runtime Controls

Seed random number generators explicitly at agent startup. Use the same seed for all inference runs.
Log all inputs and outputs: maintain complete records of all agent interactions so decisions can be replayed and verified for determinism.
Use Component 2 (Cryptographic Identity) to sign outputs: for identical inputs, outputs must produce identical signatures. Monitor for signature mismatches.
Use Component 10 (Kill Switch) to halt agents whose outputs are non-deterministic for what should be deterministic decisions.
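
One way to combine the logging and signature-comparison controls is sketched below. It uses Python's standard `hashlib` and `hmac` modules as a stand-in and is not the implementation of Component 2 or Component 10. `DecisionLog`, `fingerprint`, `sign_output`, and the demo key are hypothetical names introduced for illustration.

```python
import hashlib
import hmac
import json

# Hypothetical demo key; a real deployment would use a managed signing key.
SIGNING_KEY = b"demo-key-not-for-production"

def fingerprint(inputs: dict) -> str:
    """Hash a canonical JSON encoding so equal inputs hash equally."""
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def sign_output(output: str) -> str:
    """Keyed digest of the output; identical outputs give identical signatures."""
    return hmac.new(SIGNING_KEY, output.encode("utf-8"), hashlib.sha256).hexdigest()

class DecisionLog:
    """Keeps every interaction and flags outputs that differ for the same input."""

    def __init__(self):
        self.records = []
        self._first_signature = {}

    def record(self, inputs: dict, output: str) -> bool:
        """Append a record; return True if it matches the first output seen for this input."""
        key = fingerprint(inputs)
        signature = sign_output(output)
        expected = self._first_signature.setdefault(key, signature)
        consistent = hmac.compare_digest(expected, signature)
        self.records.append({
            "input_hash": key,
            "inputs": inputs,
            "output": output,
            "signature": signature,
            "consistent": consistent,
        })
        return consistent

log = DecisionLog()
print(log.record({"income": 85000, "debt": 12000}, "approved"))  # first sighting
print(log.record({"debt": 12000, "income": 85000}, "approved"))  # same input, same output
print(log.record({"income": 85000, "debt": 12000}, "declined"))  # same input, different output
```

A `False` return is the signature mismatch the monitoring control looks for, and it is a natural trigger for halting the agent. Canonicalizing the input before hashing matters: without sorted keys, two logically identical inputs could hash differently and the mismatch would go unnoticed.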

Detection & Response

Conduct quarterly determinism testing: for each agent, run standard test inputs 100 times, verify identical outputs every time.
Monitor for inconsistent decisions: track decisions made by agents, identify cases where identical inputs produced different decisions.
Audit consistency: periodically compare decisions for identical customers or transactions. Flag inconsistencies.
Establish incident response for detected non-determinism: assess scope of inconsistent decisions, determine fairness impact, implement determinism controls.
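
The consistency audit could start from something as simple as grouping logged decisions by their input fields and flagging groups that received more than one outcome. The sketch below assumes decisions are available as dictionaries. The field names and profile values are hypothetical, with scores echoing the scenario described earlier.

```python
from collections import defaultdict

def find_inconsistent_groups(decisions, key_fields, outcome_field="decision"):
    """Group decision records by their input fields and return the groups
    whose members did not all receive the same outcome."""
    groups = defaultdict(list)
    for record in decisions:
        key = tuple(record[field] for field in key_fields)
        groups[key].append(record)
    return {
        key: records
        for key, records in groups.items()
        if len({r[outcome_field] for r in records}) > 1
    }

# Hypothetical log entries; income and debt values are invented for illustration.
decision_log = [
    {"customer": "A", "income": 85000, "debt": 12000, "score": 720, "decision": "approved"},
    {"customer": "B", "income": 85000, "debt": 12000, "score": 690, "decision": "declined"},
    {"customer": "C", "income": 40000, "debt": 30000, "score": 610, "decision": "declined"},
]

flagged = find_inconsistent_groups(decision_log, key_fields=("income", "debt"))
for key, records in flagged.items():
    print(key, [(r["customer"], r["decision"]) for r in records])
```

Each flagged group is a candidate for the incident-response step: it identifies which customers were affected and gives a starting point for assessing fairness impact.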
