R-FM-09 Foundation Model & LLM DAMAGE 3.8 / High

Persistent Memory Degradation

Agent memory stores grow through normal operation without expiration, validation, or reconciliation against sources of record. The memory grows but its accuracy decays.

The Risk

Some agents maintain long-term memory: previous conversations, learned facts, accumulated knowledge. This memory is typically held as embeddings in vector databases or other semantic stores. Over time, it accumulates errors through normal operation. An agent makes a mistake in one conversation, and the mistake is stored in memory. In a subsequent conversation, the agent retrieves the mistake and reasons from it as if it were fact. New agents trained or fine-tuned on the memory data inherit the errors.

This creates a memory degradation risk: the institution stores what it believes to be accurate historical data in memory stores, but the memory stores are actually a mixture of ground truth and accumulated errors. The institution cannot distinguish between them because the memory is opaque. As the institution relies more on memory (because retrieving from memory is cheaper than re-reasoning), the reliance on error-contaminated memory increases.

This is particularly dangerous in systems that rely on human-in-the-loop memory management, where the presence of a reviewer suggests more assurance than review can deliver. A human may correct an obvious error in memory, but subtle errors (statements that are plausible but slightly incorrect) accumulate undetected. The human cannot realistically audit all memory; the volume is too large.

How It Materializes

A financial services institution uses agents to maintain customer relationship histories. Each customer has a memory store: prior conversations, known preferences, historical context, business relationship data. The memory is maintained using embeddings and stored in a vector database.

Over six months, the agent for customer ABC accumulates the following memory:

Customer ABC is a manufacturing company (TRUE).
Customer ABC has annual revenue of approximately $50M (TRUE, as of last known update).
Customer ABC is planning expansion into Asia (STATED in a prior conversation but later cancelled).
Customer ABC experienced supply chain disruption in Q2 (TRUE).
Customer ABC's CFO is John Smith (FALSE: John Smith left and was replaced by Sarah Chen).
Customer ABC's priority is cost reduction (INFERRED from one conversation; actually a secondary priority).

The memory contains a mixture of current facts, outdated facts, never-confirmed inferences, and errors. The agent does not distinguish between these categories: when it interacts with customer ABC, it retrieves the memory and treats every entry as equally reliable. Believing the CFO is John Smith, it causes confusion when the actual contact is Sarah Chen. Believing the Asian expansion is still planned, it proposes solutions for Asian operations.

The institution discovers the memory degradation only when an inaccuracy causes a business impact. By that time, no one can tell which memories are original facts and which are accumulated errors.

DAMAGE Score Breakdown

Dimension	Score	Rationale
D - Detectability	3	Memory degradation is gradual; discovery occurs through business impact or explicit memory audit.
A - Autonomy Sensitivity	4	More autonomous agents accumulate memory longer without human correction.
M - Multiplicative Potential	4	Each error added to memory can be retrieved and propagated. Compounds over months/years.
A - Attack Surface	3	Adversary could intentionally contaminate memory to mislead agents. But degradation occurs naturally.
G - Governance Gap	5	Data governance frameworks assume stored data is accurate. Persistent memory violates this assumption.
E - Enterprise Impact	2	Memory contamination produces degraded decisions, but impact is localized to specific agent/customer. Not systemic.
Composite DAMAGE Score	3.8	High. Requires priority attention and dedicated controls.

Agent Impact Profile

How severity changes across the agent architecture spectrum.

Agent Type	Impact	How This Risk Manifests
Digital Assistant	Moderate	Human may notice incorrect memory and correct it, but correction must be explicit.
Digital Apprentice	Moderate	Agent accumulates memory; human may spot-check but cannot audit all memory.
Autonomous Agent	High	Fully autonomous agent accumulates memory without human verification.
Delegating Agent	High	Agent's memory of prior delegations influences future delegations. Incorrect memory affects recommendations.
Agent Crew / Pipeline	Critical	Multiple agents share memory stores. Errors propagate across agents.
Agent Mesh / Swarm	Critical	Peer-to-peer agents share memory across mesh. Errors propagate systematically.

Regulatory Framework Mapping

Framework	Coverage	Citation	What It Addresses	What It Misses
BCBS 239	Partial	Principle 3 (Accuracy and Integrity)	Requires data quality and accuracy.	Does not address degradation of persistent agent memory.
GDPR Article 5	Partial	Article 5(1)(d) (Accuracy)	Requires personal data to be accurate and, where necessary, kept up to date.	Does not address agent memory accuracy over time.
NIST AI RMF 1.0	Partial	MAP 2.2 (Data Quality)	Recommends data quality assessment.	Does not address persistent memory degradation.
EU AI Act	Minimal	General governance	General governance principles.	Does not address persistent memory accuracy.
MAS AIRG	Minimal	General governance	General governance principles.	Does not address memory degradation.

Why This Matters in Regulated Industries

In banking and insurance, customer relationship data is critical for risk assessment, compliance, and service delivery. If agents are maintaining degraded customer memory, risk assessments become unreliable, compliance decisions are made on inaccurate data, and customer service is degraded. Regulators expect institutions to maintain accurate customer records. If agent memory stores are contaminated with errors, the institution is maintaining inaccurate records.

Additionally, customer memory accuracy affects fairness. If an agent's memory of a customer is inaccurate, the agent may make unfair decisions based on false historical context. A customer may be disadvantaged by an agent's incorrect recollection of prior events.

Controls & Mitigations

Design-Time Controls

Implement "memory expiration": explicitly design memory stores to expire old facts unless they are refreshed/verified. Example: customer demographic data expires after 1 year unless refreshed.
Separate memory types: distinguish between "verified facts" (system-of-record data), "inferred facts" (agent inferences), and "customer-stated facts." Apply different confidence levels and expiration policies.
Implement memory reconciliation: periodically reconcile agent memory against system-of-record data. Flag inaccuracies and update memory.
Require explicit memory curation: for high-impact customers or decisions, require human review and curation of agent memory before use.
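
The first three controls above can be sketched together. This is a minimal illustration under stated assumptions, not a reference implementation: the `MemoryKind` categories mirror the three memory types named above, while the `TTL` values, the `MemoryRecord` fields, and the `reconcile` function are hypothetical names and policy choices introduced for this example. It also assumes memory entries carry a key that can be matched against a system of record.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class MemoryKind(Enum):
    VERIFIED = "verified"                # taken from a system of record
    CUSTOMER_STATED = "customer_stated"  # said by the customer, not confirmed
    INFERRED = "inferred"                # agent inference


# Hypothetical time-to-live per memory kind; real values are a policy decision.
TTL = {
    MemoryKind.VERIFIED: timedelta(days=365),
    MemoryKind.CUSTOMER_STATED: timedelta(days=180),
    MemoryKind.INFERRED: timedelta(days=30),
}


@dataclass
class MemoryRecord:
    key: str
    value: str
    kind: MemoryKind
    last_verified: datetime

    def is_expired(self, now: datetime) -> bool:
        """A record expires once its kind's TTL has passed without refresh."""
        return now - self.last_verified > TTL[self.kind]


def reconcile(records, system_of_record, now):
    """Reconcile memory against a system of record (a dict of key -> value).

    Returns (kept, flagged). Records covered by the system of record are
    refreshed from it; conflicting values are flagged and overwritten.
    Records it does not cover are dropped and flagged once expired.
    """
    kept, flagged = [], []
    for rec in records:
        truth = system_of_record.get(rec.key)
        if truth is not None:
            if truth != rec.value:
                flagged.append((rec, f"conflicts with system of record: {truth!r}"))
            kept.append(MemoryRecord(rec.key, truth, MemoryKind.VERIFIED, now))
        elif rec.is_expired(now):
            flagged.append((rec, "expired without re-verification"))
        else:
            kept.append(rec)
    return kept, flagged
```

Applied to the customer ABC example, a stored CFO of John Smith would be flagged and overwritten with Sarah Chen from the system of record, and the "cost reduction" inference would lapse once it passed its time-to-live without confirmation.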

Runtime Controls

Implement confidence scoring on memories: each stored fact carries a confidence score indicating how likely it is to be accurate. Retrieve high-confidence facts preferentially.
Monitor memory quality: periodically sample memories, compare them to the system of record, and compute memory accuracy rates.
Implement memory source tracking: for each stored fact, track its source. Use the source in the confidence assessment.
Use Component 10 (Kill Switch) to halt agents whose memory quality falls below acceptable thresholds.
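
Confidence scoring and source tracking can be combined at retrieval time. The sketch below is one possible shape, with several assumptions: the `SOURCE_CONFIDENCE` priors, the exponential decay with a `half_life_days` parameter, and the `retrieve` function are all hypothetical and chosen only to illustrate the idea that source and age jointly determine how much a memory is trusted.

```python
from dataclasses import dataclass

# Hypothetical prior confidence per source; real values need calibration.
SOURCE_CONFIDENCE = {
    "system_of_record": 0.95,
    "customer_stated": 0.75,
    "agent_inference": 0.40,
}


@dataclass(frozen=True)
class ScoredMemory:
    text: str
    source: str    # where the fact came from (source tracking)
    age_days: int  # days since the fact was last verified

    def confidence(self, half_life_days: int = 180) -> float:
        """Source prior, halved for every half_life_days without verification."""
        base = SOURCE_CONFIDENCE.get(self.source, 0.2)  # unknown source: low prior
        return base * 0.5 ** (self.age_days / half_life_days)


def retrieve(memories, min_confidence=0.5, limit=5):
    """Return the most trusted memories, excluding those below the threshold."""
    usable = [m for m in memories if m.confidence() >= min_confidence]
    return sorted(usable, key=lambda m: m.confidence(), reverse=True)[:limit]
```

With these assumed priors, a fresh agent inference (0.40) never clears the default threshold of 0.5 on its own, so it reaches the agent's context only after it has been confirmed by a stronger source.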

Detection & Response

Conduct quarterly memory audits: sample agent memories for high-impact customers, compare them to the system of record, and document accuracy rates.
Monitor for memory-based decision errors: track decisions where agent memory was used, and compare outcomes to decisions based on system-of-record data.
Implement memory reconciliation processes: periodically reconcile all agent memory against the system of record.
Establish incident response for memory degradation: audit affected agent memories, reconcile against ground truth, update contaminated memories, assess impact on prior decisions.
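
The sampling audit described above reduces to a small calculation. The following sketch assumes that both agent memory and the system of record can be viewed as key-to-value mappings; the function names `audit_accuracy` and `should_halt`, and the 0.9 threshold, are hypothetical. How a halt decision is passed to the kill switch component is outside the scope of the example.

```python
import random


def audit_accuracy(memory, system_of_record, sample_size, seed=0):
    """Sample memory entries that the system of record can verify.

    Returns (accuracy, mismatched_keys). Accuracy is None when no
    entry is checkable, so callers do not mistake "no data" for 100%.
    """
    checkable = sorted(k for k in memory if k in system_of_record)
    if not checkable:
        return None, []
    rng = random.Random(seed)  # fixed seed keeps the audit reproducible
    sample = rng.sample(checkable, min(sample_size, len(checkable)))
    mismatches = sorted(k for k in sample if memory[k] != system_of_record[k])
    return 1 - len(mismatches) / len(sample), mismatches


def should_halt(accuracy, threshold=0.9):
    """True when measured accuracy falls below the acceptable threshold."""
    return accuracy is not None and accuracy < threshold
```

Entries that the system of record cannot verify, such as agent inferences, are excluded from the accuracy rate here. They need a separate review path, because a high accuracy rate on verifiable facts says nothing about them.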
