R-MP-06 Model & Pipeline Interaction DAMAGE 4.1 / Critical

Model Output as Ground Truth

Agents consume model outputs as inputs and treat them with the same confidence as system-of-record data. The model's confidence intervals evaporate in the agent's prose.

The Risk

Machine learning models produce predictions with inherent uncertainty. A credit scoring model outputs a score of 650, but the model's uncertainty interval is +/- 30 points (due to variation in the training data, the stochastic nature of training, and similar factors). A fraud detection model outputs a probability of 0.72, but that figure reflects the model's calibration on a validation set, not absolute certainty.

Agents consuming model outputs often fail to preserve uncertainty information. The agent receives a credit score of 650 and an uncertainty interval of +/- 30, but when the agent constructs natural language explanations or recommendations, it says "The customer's credit score is 650," omitting the uncertainty. The downstream human reviewer reads the agent's output and interprets the score as certain, not as a range from 620 to 680.

Additionally, agents may treat model outputs as ground truth (the "true" value), rather than as predictions (estimates with uncertainty). An agent might receive a fraud probability from a model and immediately escalate the transaction as "fraudulent" without recognizing that a 72% fraud probability is still a 28% chance of being legitimate. The agent's prose might say "The transaction is fraudulent" instead of "The transaction has a 72% likelihood of being fraudulent based on model X."

This loss of uncertainty information is problematic because downstream decisions are made with less information than the model actually provides. It also creates audit trail problems: if a human reviewer later asks "Why was this transaction escalated?", the agent's explanation says "It is fraudulent" rather than "The model estimated an elevated fraud probability, within a stated uncertainty range."
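The distinction above can be enforced structurally rather than left to agent prose. A minimal sketch, assuming a hypothetical ModelPrediction type (not from any particular library), keeps the interval attached to the point estimate so the agent cannot render the score without the uncertainty:

```python
from dataclasses import dataclass

# Hypothetical sketch: a prediction type that keeps the model's
# uncertainty attached to the point estimate, so an agent cannot
# render the score without also rendering the interval.
@dataclass(frozen=True)
class ModelPrediction:
    model_id: str
    point: float     # e.g. fraud probability 0.72
    ci_low: float    # lower bound of the 95% interval
    ci_high: float   # upper bound of the 95% interval

    def describe(self) -> str:
        # Hedged phrasing: "probability X", never "is fraudulent".
        return (f"model {self.model_id} estimates a "
                f"{self.point:.0%} probability "
                f"(95% CI {self.ci_low:.0%}-{self.ci_high:.0%})")

pred = ModelPrediction("fraud-v3", 0.72, 0.55, 0.89)
print(pred.describe())
# model fraud-v3 estimates a 72% probability (95% CI 55%-89%)
```

Because describe() is the only rendering path, the interval travels with the score by construction instead of by agent discipline.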

How It Materializes

A payment processor deploys an agent to monitor transactions for fraud in real-time. The agent is authorized to: (1) score each transaction with the fraud detection model, (2) retrieve the model's confidence interval, (3) assess whether the score warrants escalation, and (4) escalate high-risk transactions to the manual review queue.

The model provides a fraud probability and a 95% confidence interval. A transaction receives a fraud probability of 0.72 with a confidence interval of [0.55, 0.89]. This means the model is 95% confident that the true fraud probability is between 55% and 89%. The transaction is moderately risky, but far from certain.

The agent retrieves this output and constructs a message for the manual review team: "High-risk transaction detected: Transaction ID 12345 is fraudulent (model score 0.72). Recommend denial." The agent has omitted the confidence interval and has phrased the model's prediction as a definitive fact.

The manual reviewer receives the agent's message and quickly reviews the transaction. The message says "is fraudulent," so the reviewer flags the transaction as denied without further investigation. The transaction is blocked.

Hours later, the cardholder calls the payment processor complaining about a declined transaction. The processor investigates and discovers that the transaction was legitimate (the cardholder confirms they made the purchase). The fraud model's high score (0.72) was a false positive; the model's confidence interval, which extended as low as 0.55, was far less certain than the agent's language suggested.

Under consumer protection regulations, the processor has wrongly denied a legitimate transaction. The cardholder can dispute the decision and demand remediation. The processor's defense is: "The model flagged it as high-risk," but the regulator asks: "What was the model's uncertainty? Did the agent communicate uncertainty to the reviewer? Did the reviewer understand that the model's probability was not a certainty?"

If the processor did not communicate uncertainty, regulators cite this as inadequate consumer protection and customer notification. The processor may be required to compensate the customer for the wrongful denial.

DAMAGE Score Breakdown

  • D - Detectability (2): Loss of uncertainty information is not detectable in the agent's output alone. Detection requires comparing the agent's explanation to the original model output and checking whether uncertainty was preserved.
  • A - Autonomy Sensitivity (3): Both autonomous and supervised agents can drop uncertainty, though autonomous agents are more likely to act on the incomplete information.
  • M - Multiplicative Potential (3): Each agent output that drops uncertainty compounds the risk, but the impact is usually localized to the agent's recommendation.
  • A - Attack Surface (5): Nearly all agents that consume model outputs are exposed. Preserving uncertainty information requires explicit design.
  • G - Governance Gap (5): Agent governance does not mandate that agents preserve and communicate model uncertainty, and model governance does not mandate that models communicate uncertainty to downstream agents.
  • E - Enterprise Impact (3): False positives (flagged high-risk when actually low-risk) result in customer impact and regulatory exposure, but the scope is usually bounded to individual decisions.
  • Composite DAMAGE Score: 4.1 (Critical). Requires mandatory uncertainty preservation in all agent outputs that consume model predictions.

Agent Impact Profile

How severity changes across the agent architecture spectrum.

  • Digital Assistant (Low): A human expert recognizes when the agent is overconfident.
  • Digital Apprentice (Medium): Limited autonomy; decisions are human-reviewed and uncertainty can be restored.
  • Autonomous Agent (High): Makes autonomous decisions on uncertain model outputs without explicit uncertainty communication.
  • Delegating Agent (High): Invokes models via APIs; uncertainty information may not be propagated through function calls.
  • Agent Crew / Pipeline (High): Uncertainty is lost at each agent handoff.
  • Agent Mesh / Swarm (High): Peer-to-peer model consumption without uncertainty preservation.

Regulatory Framework Mapping

  • GDPR Article 14 (Partial): Addresses the right to explanation of automated decisions. Misses model uncertainty in the explanations provided to data subjects.
  • FCRA (Partial): Addresses credit decision accuracy and consumer notification. Misses model uncertainty in credit decisions.
  • EFTA / Regulation E (Partial): Addresses dispute resolution and error correction for electronic funds transfers. Misses uncertainty in fraud detection decisions.
  • SR 11-7 (Partial): Addresses model governance, validation, and performance monitoring. Misses agent preservation of model uncertainty.
  • NIST AI RMF 1.0 (Partial): Addresses AI governance (Govern and Manage functions). Misses agent preservation of model uncertainty in outputs.

Why This Matters in Regulated Industries

In regulated industries, decision transparency is critical. When an institution makes a decision that affects a customer (decline a loan, deny a transaction, flag for fraud investigation), the institution must be able to explain the decision. If the explanation is based on a model output, the explanation should preserve the model's uncertainty.

Consumers have rights to explanation (GDPR Article 14, FCRA, etc.). If a consumer asks "Why was my transaction declined?" and the institution responds "The fraud detection model flagged it," the consumer might ask "What is the model's confidence? Could it be wrong?" If the institution cannot answer with confidence intervals or uncertainty ranges, it suggests the institution did not understand the model's limitations.

Additionally, if the agent's language ("The transaction is fraudulent") is stronger than the model's actual confidence ("72% probability"), and a consumer is later harmed by the decision, regulators will investigate whether the institution overconfidently applied the model.

Controls & Mitigations

Design-Time Controls

  • Mandate that all models used by agents output confidence intervals, uncertainty estimates, or calibration statistics. Agents are not authorized to consume a model unless the model provides uncertainty information.
  • Implement a transparency requirement for agents: when an agent makes a recommendation based on a model, the output must include the model's uncertainty. The agent must say "probability X" not "is Y."
  • Design agent output templates that explicitly include uncertainty language. Instead of "The customer is creditworthy," templates say "The customer's credit risk is [score] with 95% confidence interval [lower, upper]."
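The first design-time control can be sketched as a registry gate. In this illustration the registry contents, field names, and helper function are all hypothetical; the point is that consumption is refused unless the model declares an uncertainty output:

```python
# Hypothetical design-time gate: an agent may only consume models
# whose registry entry declares some form of uncertainty output.
MODEL_REGISTRY = {
    "fraud-v3": {"outputs": ["probability", "confidence_interval"]},
    "churn-v1": {"outputs": ["probability"]},  # no uncertainty -> blocked
}

UNCERTAINTY_OUTPUTS = ("confidence_interval", "uncertainty", "calibration")

def authorize_model(model_id: str) -> bool:
    """Return True only if the model publishes uncertainty information."""
    outputs = MODEL_REGISTRY.get(model_id, {}).get("outputs", [])
    return any(o in outputs for o in UNCERTAINTY_OUTPUTS)

assert authorize_model("fraud-v3") is True   # CI declared: consumable
assert authorize_model("churn-v1") is False  # point estimate only: blocked
```

Wiring this check into the agent's tool-invocation layer makes "not authorized to consume" an enforced property rather than a policy statement.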

Runtime Controls

  • Validate that agent outputs preserve model uncertainty: use NLP analysis to check whether agent explanations include uncertainty language (probability, range, confidence interval). Flag explanations that omit uncertainty for human review.
  • Implement model output passthrough: when an agent makes a decision based on a model, append the full model output (including confidence interval) to the decision record, even if the agent's explanation is brief.
  • Monitor human reviewer decisions after agent recommendations: if reviewers systematically override agent decisions, investigate whether the agent's explanations were overconfident or lacked uncertainty.
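The first runtime control above might look like the following sketch, with a simple keyword heuristic standing in for fuller NLP analysis (the term list is illustrative, not exhaustive):

```python
import re

# Hypothetical runtime lint: flag agent explanations that state a
# model prediction as fact, with no uncertainty language at all.
# The keyword pattern is a stand-in for a richer NLP classifier.
UNCERTAINTY_TERMS = re.compile(
    r"probabilit|likelihood|confidence interval|estimated|approximately|\d+\s?%",
    re.IGNORECASE,
)

def needs_human_review(explanation: str) -> bool:
    """True when the explanation contains no uncertainty language."""
    return UNCERTAINTY_TERMS.search(explanation) is None

# Overconfident phrasing is flagged for review:
assert needs_human_review("Transaction 12345 is fraudulent. Recommend denial.")
# Hedged phrasing that preserves the model's interval passes:
assert not needs_human_review(
    "Transaction 12345 has a 72% fraud likelihood (95% CI 55%-89%).")
```

A production version would also verify that the quoted numbers match the appended model output record, not just that uncertainty words appear.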

Detection & Response

  • Audit a sample of agent explanations quarterly. Check whether uncertainty information is present and accurate. If uncertainty information is missing or misrepresented, retraining is required.
  • When a customer disputes a decision (e.g., "I shouldn't have been declined"), investigate whether the agent's explanation preserved model uncertainty. If not, this is evidence of inadequate agent governance.
  • Track false positive and false negative rates for agent decisions by model confidence level. If decisions escalated at a 72% fraud probability show a far higher false positive rate than that probability implies, the agent is not respecting the model's calibration.
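The calibration check in the last bullet can be approximated by bucketing decisions by stated probability and comparing each bucket's observed outcome rate. This is a simplified sketch; records is a hypothetical list of (predicted_probability, confirmed_fraud) pairs from the decision log:

```python
from collections import defaultdict

# Simplified calibration audit (hypothetical data shape): group agent
# decisions by the model's stated fraud probability, then compare the
# observed fraud rate in each bucket to what the probability implies.
def calibration_by_bucket(records, width=0.1):
    buckets = defaultdict(lambda: [0, 0])  # bucket index -> [fraud, total]
    for prob, is_fraud in records:
        b = min(int(prob / width), int(round(1 / width)) - 1)
        buckets[b][0] += int(is_fraud)
        buckets[b][1] += 1
    # Bucket 7 covers stated probabilities in [0.7, 0.8).
    return {b: fraud / total for b, (fraud, total) in sorted(buckets.items())}

# Four escalations scored ~0.71-0.78, only two confirmed fraudulent:
records = [(0.72, True), (0.75, False), (0.71, True), (0.78, False)]
observed = calibration_by_bucket(records)
# observed[7] == 0.5: a 50% hit rate in the "~72%" bucket is a signal
# that these scores must not be phrased as certainties downstream.
```

In practice the comparison would use far larger samples and a statistical test, but the bucketing structure is the same.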

Address This Risk in Your Institution

Model Output as Ground Truth requires mandatory uncertainty preservation in all agent outputs. Our advisory engagements are purpose-built for banks, insurers, and financial institutions subject to prudential oversight.
