R-FM-06 Foundation Model & LLM DAMAGE 3.7 / High

Non-Determinism and Output Variance

The same input can yield different outputs, while regulatory expectations for consistent treatment assume deterministic processing.

The Risk

Language models are non-deterministic: identical inputs can produce different outputs on different inference runs. Even with temperature set to 0 (nominally deterministic), some model implementations produce slightly different outputs because of floating-point rounding, hardware differences, or randomness in the inference library. With temperature above 0 (normal operation), the variance is pronounced: the same prompt can produce meaningfully different outputs.
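The two sources of variance described above can be illustrated with a minimal sketch. It is not any vendor's decoding code: the three logits are hypothetical, and `softmax` and `pick_token` are illustrative helpers written for this example. It shows that greedy selection at temperature 0 is repeatable for fixed logits, that sampling at temperature 1.0 is not, and that floating-point addition depends on evaluation order, which is one way fixed logits can stop being fixed across hardware.

```python
import math
import random

def softmax(logits, temperature):
    """Convert logits to probabilities; lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def pick_token(logits, temperature, rng):
    """Greedy argmax at temperature 0, otherwise sample from the softmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Hypothetical logits for three candidate tokens (illustrative values only).
logits = [2.0, 1.9, 0.5]
rng = random.Random()  # unseeded, so the sampled sequence differs between runs

# Collect the distinct tokens chosen over many repeated calls.
greedy = {pick_token(logits, 0, rng) for _ in range(1000)}
sampled = {pick_token(logits, 1.0, rng) for _ in range(1000)}
print("distinct tokens at temperature 0:", len(greedy))
print("distinct tokens at temperature 1:", len(sampled))

# Floating-point addition is not associative, so the order in which
# hardware accumulates a sum can change the low-order bits of a logit.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print("same sum, different order, equal?", left == right)
```

When two logits are nearly tied, a rounding difference of this size can be enough to change which token the argmax selects, which is why temperature 0 alone is not a guarantee.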

This non-determinism is incompatible with regulatory expectations for consistent treatment. Financial regulation requires that identical situations be treated identically: two customers with identical credit profiles should receive identical credit decisions. Regulators treat consistent decision-making as a baseline control against bias. Non-determinism removes that baseline, because identical situations can produce different decisions.

Non-determinism is also incompatible with audit and reproducibility expectations. Regulators expect to be able to examine a decision, understand the reasoning, and reproduce the decision from identical inputs. With a non-deterministic model, re-running the same inputs does not reliably reproduce the original decision.

How It Materializes

A bank uses an agent to score customer credit applications. The agent's scoring is non-deterministic because its temperature is set above 0. Two customers with identical credit profiles apply on the same day. Customer A is scored 720 (approved). Customer B is scored 690 (declined). The two profiles are identical: same income, same debt, same credit history, same collateral.

Customer B requests an explanation for the decline. The bank reviews the decision and discovers that Customer B's profile, identical to Customer A's, was scored differently because of non-deterministic model output. The bank re-runs the agent's scoring for Customer B. This time the score is 715 (approved), so the bank cannot reproduce its own original decision.

Customer B escalates to the regulator, claiming discriminatory treatment. The regulator investigates and finds that the bank's credit scoring is non-deterministic: identical inputs produce different outputs. Concerned that such decisions are unfair, the regulator issues a finding that the bank's credit process has inadequate consistency controls.

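The bank must modify the agent to enforce determinism (set temperature to 0, use deterministic sampling) and confirm through repeated runs that outputs actually match, since temperature 0 alone does not guarantee identical outputs on every implementation. At temperature 0, however, the model's outputs become less diverse and sometimes less natural, so the bank must redesign the process to maintain fairness while preserving output quality.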

DAMAGE Score Breakdown

Dimension | Score | Rationale
D - Detectability | 2 | Non-determinism is apparent through simple testing: run same input twice, compare outputs. Easy to detect.
A - Autonomy Sensitivity | 1 | Non-determinism is inherent to LLM architecture; not dependent on autonomy.
M - Multiplicative Potential | 4 | Every decision the model makes is potentially non-deterministic. Affects all agent uses.
A - Attack Surface | 3 | Non-determinism is structural but could be exploited if adversary can observe multiple inferences and reverse-engineer favorable outputs.
G - Governance Gap | 5 | Regulatory frameworks assume deterministic decision-making. Non-determinism violates fundamental regulatory expectations.
E - Enterprise Impact | 3 | Fairness concerns, regulatory findings, process redesign required, but typically resolvable through determinism enforcement.
Composite DAMAGE Score | 3.7 | High. Requires priority attention and dedicated controls.

Agent Impact Profile

How severity changes across the agent architecture spectrum.

Agent Type | Impact | How This Risk Manifests
Digital Assistant | Moderate | Human may notice inconsistent outputs and question reliability.
Digital Apprentice | Moderate | Agent produces inconsistent recommendations; apparent unreliability.
Autonomous Agent | High | Autonomous agent produces inconsistent decisions affecting business operations.
Delegating Agent | High | Agent's delegations produce inconsistent outcomes. Downstream systems receive inconsistent recommendations.
Agent Crew / Pipeline | Critical | Multiple agents with non-deterministic outputs compound inconsistency through pipeline.
Agent Mesh / Swarm | Critical | Peer-to-peer agent network with non-deterministic behavior. Systemic inconsistency.
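The compounding effect in multi-agent pipelines can be made concrete with a back-of-the-envelope sketch. It assumes, purely for illustration, that each stage independently repeats its earlier output with the same probability; the 0.95 figure and the function name `end_to_end_reproducibility` are hypothetical and not measurements from any real system.

```python
def end_to_end_reproducibility(per_stage: float, stages: int) -> float:
    """Probability that every stage repeats its earlier output, assuming independence."""
    if not 0.0 <= per_stage <= 1.0:
        raise ValueError("per_stage must be a probability")
    if stages < 1:
        raise ValueError("stages must be at least 1")
    return per_stage ** stages

# Illustrative per-stage reproducibility of 0.95 across pipelines of growing length.
for n in (1, 3, 5, 10):
    print(n, round(end_to_end_reproducibility(0.95, n), 3))
```

Under these assumptions reproducibility falls geometrically with pipeline length, which is consistent with the Critical rating for crews and meshes. Real stages are unlikely to be independent, so the figure is a rough intuition rather than a prediction.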

Regulatory Framework Mapping

Framework | Coverage | Citation | What It Addresses | What It Misses
ECOA | Partial | 15 U.S.C. 1691 | Requires consistent treatment in credit decisions. | Does not specifically address non-deterministic model outputs.
Fair Housing Act | Partial | 42 U.S.C. 3604 | Requires consistent treatment in housing-related decisions. | Does not address non-determinism.
GDPR Article 22 | Partial | Right to Explanation | Requires meaningful information about logic of automated decisions. | Does not address non-determinism or output variance.
FCA Handbook | Partial | COBS 2.2R | Requires fairness in customer treatment. | Does not address model non-determinism.
NIST AI RMF 1.0 | Partial | MAP 2.1 (Testing) | Recommends testing and validation. | Does not specifically address non-determinism testing.

Why This Matters in Regulated Industries

In credit, insurance, and employment decisions, consistent treatment is a legal requirement. Uncontrolled non-deterministic model outputs conflict with this requirement, because the same applicant can receive different outcomes on different runs. Regulators expect institutions to enforce determinism in consequential decisions. An institution that uses non-deterministic models for material decisions without adequate controls risks violating fair lending and fair treatment principles.

Additionally, non-determinism undermines institutional credibility. If an institution cannot produce consistent decisions, customers and regulators lose confidence in the institution's fairness and reliability. An institution that cannot defend its decisions to regulators because the decisions were non-deterministic faces enforcement action and reputational damage.

Controls & Mitigations

Design-Time Controls

  • For any agent making consequential decisions, enforce determinism: set model temperature to 0, use deterministic sampling, set random seeds explicitly.
  • Document determinism requirements for each agent: specify whether determinism is required, what temperature or sampling strategy is used, and why.
  • Test determinism: for each agent, run identical inputs 100 times, verify outputs are identical. Document determinism validation.
  • For agents where non-determinism is intentionally desired, implement determinism controls for critical decision points.
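The determinism test described in the list above can be sketched as a small harness. This is a minimal illustration, not a reference implementation: `check_determinism` and `stub_agent` are hypothetical names, and the stub stands in for whatever call reaches the deployed model.

```python
import hashlib
from collections import Counter
from typing import Callable

def check_determinism(agent: Callable[[str], str], prompt: str, runs: int = 100) -> dict:
    """Call the agent repeatedly with one prompt and count distinct outputs."""
    digests = Counter(
        hashlib.sha256(agent(prompt).encode("utf-8")).hexdigest()
        for _ in range(runs)
    )
    return {
        "runs": runs,
        "distinct_outputs": len(digests),
        "deterministic": len(digests) == 1,
    }

# Stand-in agent for illustration only; a real harness would call the deployed model.
def stub_agent(prompt: str) -> str:
    return f"score=700 for {prompt}"

report = check_determinism(stub_agent, "applicant-profile-001")
print(report)
```

Hashing each output keeps the comparison exact while avoiding the need to hold 100 full responses in memory, and the returned dictionary can be stored as the documented validation record.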

Runtime Controls

  • Seed random number generators explicitly at agent startup. Use the same seed for all inference runs.
  • Log all inputs and outputs: maintain complete records of all agent interactions so decisions can be replayed and verified for determinism.
  • Use Component 2 (Cryptographic Identity) to sign outputs: for identical inputs, the signed output digests must match (compare digests rather than raw signatures if the signature scheme is randomized). Monitor for mismatches.
  • Use Component 10 (Kill Switch) to halt agents whose outputs are non-deterministic for what should be deterministic decisions.
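The logging, mismatch monitoring, and halt hook in the list above could fit together roughly as follows. This sketch compares SHA-256 digests of outputs rather than cryptographic signatures, which is a simplification of the Component 2 control. `ConsistencyGuard`, `digest`, and the `on_mismatch` callback (standing in for the kill switch) are hypothetical names, and the applicant fields are invented for illustration.

```python
import hashlib
import json
from typing import Callable

def digest(obj) -> str:
    """Stable SHA-256 digest of a JSON-serialisable value (key order ignored)."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

class ConsistencyGuard:
    """Records the first output digest per input and flags later mismatches."""

    def __init__(self, on_mismatch: Callable[[str], None]):
        self.seen = {}        # input digest -> first output digest observed
        self.audit_log = []   # append-only record kept for replay
        self.on_mismatch = on_mismatch

    def record(self, inputs: dict, output: str) -> bool:
        key, out = digest(inputs), digest(output)
        self.audit_log.append({"input": key, "output": out, "raw_output": output})
        expected = self.seen.setdefault(key, out)
        if expected != out:
            self.on_mismatch(key)  # e.g. ask the kill switch to halt the agent
            return False
        return True

# Illustrative usage with made-up applicant fields.
halted = []
guard = ConsistencyGuard(on_mismatch=halted.append)
first = guard.record({"income": 50000, "debt": 1000}, "approve")
second = guard.record({"debt": 1000, "income": 50000}, "decline")
print(first, second, len(halted))
```

Canonicalising the input with sorted keys means the same profile is recognised even when fields arrive in a different order. The in-memory dictionary is a stand-in for a durable store.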

Detection & Response

  • Conduct quarterly determinism testing: for each agent, run standard test inputs 100 times, verify identical outputs every time.
  • Monitor for inconsistent decisions: track decisions made by agents, identify cases where identical inputs produced different decisions.
  • Audit consistency: periodically compare decisions for identical customers or transactions. Flag inconsistencies.
  • Establish incident response for detected non-determinism: assess scope of inconsistent decisions, determine fairness impact, implement determinism controls.
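The consistency audit above amounts to grouping logged decisions by input and flagging any group with more than one outcome. A minimal sketch follows, assuming a hypothetical log format in which each record has an `inputs` dictionary and a `decision` string; the sample records are invented.

```python
import json
from collections import defaultdict

def find_inconsistencies(decisions):
    """Group logged decisions by input profile; return profiles with more than one outcome."""
    outcomes = defaultdict(set)
    for record in decisions:
        # Sorting keys makes field order irrelevant when matching profiles.
        key = json.dumps(record["inputs"], sort_keys=True)
        outcomes[key].add(record["decision"])
    return {k: sorted(v) for k, v in outcomes.items() if len(v) > 1}

# Invented sample log: the first two records share a profile but not an outcome.
log = [
    {"inputs": {"income": 85000, "debt": 12000}, "decision": "approved"},
    {"inputs": {"debt": 12000, "income": 85000}, "decision": "declined"},
    {"inputs": {"income": 40000, "debt": 5000}, "decision": "declined"},
]
flagged = find_inconsistencies(log)
print(flagged)
```

Each flagged profile is a starting point for the incident response step: it identifies which customers received different outcomes for the same inputs and therefore where the fairness impact needs to be assessed.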
