R-DG-07 · Data Governance & Integrity · DAMAGE 3.6 / High

Derived Data Accountability Gap

Agent-derived data enters workflows without metadata distinguishing it from system-of-record data. Existing ownership models cannot assign accountability for agent-generated outputs.

The Risk

Data governance frameworks distinguish between system-of-record (authoritative source, owned and maintained by specific teams) and derived data (computed from system-of-record, owned by the team that computed it). This distinction enables accountability: if system-of-record data is incorrect, the system owner is responsible. If derived data is incorrect, the team that created the derivation is responsible. The distinction breaks down with agents because derived data generated by agents is often indistinguishable from system-of-record data, and ownership responsibility is ambiguous.

An agent generates a risk score. The score is derived from system-of-record customer data, but who owns it? Is it owned by the agent's developers? The team that deployed the agent? The data engineering team that maintains the vector store the agent uses? The accountability chain is broken. When the risk score causes a bad decision (customer declined for credit based on incorrect agent-generated score), there is no clear owner responsible for the derivation error. Each team can claim the error was not their responsibility; the accountability gap allows the error to fall through.

This accountability gap becomes systemic when agent-derived data enters shared data stores without clear ownership metadata. Downstream teams use the derived data, but the metadata does not indicate who created it or how to contact the responsible team for questions or corrections.

How It Materializes

A regional bank's compliance team uses agents to identify suspicious customers for enhanced due diligence. The agents score customers based on transaction patterns, network analysis, and regulatory watch-list matching, outputting a score from 1 to 100, where a score above 70 indicates enhanced due diligence is required. Scores are stored in the bank's customer risk database alongside risk assessments performed by compliance officers, and no metadata distinguishes agent-derived scores from officer-derived scores.

A compliance officer reviews customer accounts with scores above 70. Half the accounts in the queue carry agent-generated scores and half carry officer assessments, and the officer cannot tell which is which. When an officer reviews an agent-generated score and disagrees with it, the officer updates the score to reflect their professional judgment. The original agent score is overwritten and the agent's assessment is lost.

Later, the bank discovers that certain agent scores were systematically over-flagging customers from particular geographic regions due to bias in the training data. The bank needs to identify and recalculate all affected scores, but it cannot determine which scores in the database were generated by the agent because the scores lack ownership metadata. The bank manually audits 5,000 customer risk scores trying to determine which came from the agent and which from officers; the audit is expensive and error-prone. The bank also discovers it has no clear ownership path for requesting corrections: should it contact the agent development team or the compliance team? Neither team accepts responsibility for correcting the scores because ownership is ambiguous.
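The audit problem above follows directly from the storage design. A minimal sketch (field names and IDs are illustrative, not from the bank's actual schema) of a score record that carries provenance, where officer overrides append rather than overwrite:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class RiskScoreRecord:
    """One customer risk score with provenance metadata attached."""
    customer_id: str
    score: int        # 1-100; scores above 70 trigger enhanced due diligence
    source: str       # "agent" or "officer"
    source_id: str    # agent ID or officer ID
    recorded_at: str  # ISO-8601 timestamp

def override(history: list, customer_id: str, new_score: int, officer_id: str) -> list:
    """Record an officer's override as a new entry instead of overwriting,
    so the original agent score survives for later audits."""
    entry = RiskScoreRecord(customer_id, new_score, "officer", officer_id,
                            datetime.now(timezone.utc).isoformat())
    return history + [entry]

def scores_from_agent(history: list, agent_id: str) -> list:
    """The query the bank could not run: every score a given agent produced."""
    return [r for r in history if r.source == "agent" and r.source_id == agent_id]
```

With this shape, recalculating the scores of a biased agent becomes a filter on `source_id` rather than a manual audit of 5,000 records.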

DAMAGE Score Breakdown

| Dimension | Score | Rationale |
| --- | --- | --- |
| D – Detectability | 3 | Ownership ambiguity is often undetectable until an incident occurs or an ownership dispute arises. |
| A – Autonomy Sensitivity | 3 | Occurs at all autonomy levels. Ownership ambiguity is a governance design issue, not a function of autonomy. |
| M – Multiplicative Potential | 3 | Each agent-derived dataset lacking ownership metadata compounds the problem. Multiple agents multiply ownership confusion across shared data stores. |
| A – Attack Surface | 2 | Not easily weaponized externally; primarily a governance design issue. |
| G – Governance Gap | 5 | Data governance frameworks assume clear ownership for all datasets. Agent-derived data breaks the ownership model. |
| E – Enterprise Impact | 3 | Inability to correct errors efficiently, unclear responsibility for accuracy, potential regulatory findings on data governance. |
| Composite DAMAGE Score | 3.6 | High. Requires priority attention with dedicated controls and monitoring. |

Agent Impact Profile

How severity changes across the agent architecture spectrum.

| Agent Type | Impact | How This Risk Manifests |
| --- | --- | --- |
| Digital Assistant | Low-Moderate | A human reviewer may implicitly own the output, but system metadata is still ambiguous. |
| Digital Apprentice | Moderate | Ownership is progressively transferred from the development team to the agent team, but the transition is not explicitly documented. |
| Autonomous Agent | High | No human reviewer to own the output; ownership defaults to no clear party. |
| Delegating Agent | High | The agent determines which tools to invoke; ownership of tool outputs is unclear: does the agent team own them, or the tool team? |
| Agent Crew / Pipeline | High | Multiple agents; ownership of intermediate and final outputs is distributed and unclear. |
| Agent Mesh / Swarm | Critical | Peer-to-peer agent network; output ownership is entirely unclear, with no single responsible party. |

Regulatory Framework Mapping

| Framework | Coverage | Citation | What It Addresses | What It Misses |
| --- | --- | --- | --- | --- |
| BCBS 239 | Partial | Principle 1 (Governance) | Requires clear responsibility assignment for data governance. | Does not address derived-data ownership in agent systems. |
| EU AI Act | Partial | Article 24 (Documentation) | Requires documentation of AI system ownership and responsibility. | Does not specify metadata requirements for derived-data ownership. |
| NIST AI RMF 1.0 | Partial | GOVERN 1.1 | Recommends clear roles and responsibilities. | Does not address derived-data ownership in agent systems. |
| MAS AIRG | Partial | Appendix 2 (Governance) | Requires clear accountability and governance structures. | Does not address agent-derived data ownership. |
| ISO 42001 | Partial | Section 5.1 | Requires organizational commitment to roles and responsibilities. | Does not address derived-data ownership in AI systems. |
| SOX 404 | Partial | IT Controls | Requires control and oversight of financial systems. | Does not address AI-derived data ownership. |

Why This Matters in Regulated Industries

In banking, credit decisions are made based on risk scores. If a score is wrong and a customer is denied credit, someone must be accountable for the error. If the score was agent-derived but the agent team claims no responsibility (saying it was the deployment team's responsibility), and the deployment team claims no responsibility (saying it was the development team's), then no one is accountable. The customer has no clear path to challenge the error or to understand why the decision was made. Regulators expect institutions to maintain clear accountability for all data used in decisions.

In insurance, underwriting decisions are based on risk assessments. If an assessment is agent-derived, someone must own the methodology and accuracy. If ownership is ambiguous, the insurance company cannot defend the underwriting decision to regulators or to customers. The institution loses credibility in its ability to explain and defend its decisions.

Controls & Mitigations

Design-Time Controls

  • Implement mandatory metadata standards for all derived data: every agent-generated data element must include source agent ID, generation timestamp, methodology summary, confidence score, and owner contact information.
  • Establish clear ownership assignment at agent design time: designate a specific team or individual responsible for all outputs of each agent. Document ownership in Component 1 (Agent Registry).
  • Implement data lineage metadata: track the complete lineage of agent-derived data, including agent ID, input data elements, reasoning summary, and ownership at each step.
  • Require derived data to be stored in separate schema tables (not commingled with system-of-record data). Enforce schema separation through data governance tools.
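The mandatory metadata standard in the first control can be sketched as a simple envelope plus a completeness check. Field names here are illustrative; a production standard would be enforced in the data platform's schema registry or governance tooling rather than in application code:

```python
from dataclasses import dataclass

@dataclass
class DerivedDataEnvelope:
    """Wraps an agent-generated value with the mandatory ownership metadata."""
    value: object
    agent_id: str          # source agent identifier (from the Agent Registry)
    generated_at: str      # ISO-8601 generation timestamp
    methodology: str       # short summary of how the value was derived
    confidence: float      # agent-reported confidence, 0.0-1.0
    owner_contact: str     # team or individual accountable for this output

def missing_metadata(env: DerivedDataEnvelope) -> list:
    """Return the names of mandatory metadata fields that are empty.
    A write path would reject any envelope where this list is non-empty."""
    required = ("agent_id", "generated_at", "methodology", "owner_contact")
    return [name for name in required if not getattr(env, name)]
```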

Runtime Controls

  • Attach cryptographic identity (Component 2) to all agent-derived data: include agent signature, generation timestamp, and ownership information in every output. Make this metadata immutable.
  • Implement ownership lookup services: query the Agent Registry to determine responsible owner for any agent-derived dataset. Provide automated notifications to owners when their derived data is accessed.
  • Require agent outputs to include explicit disclaimers about ownership and methodology. Propagate these disclaimers to downstream systems consuming agent outputs.
  • Use Component 10 (Kill Switch) to halt any agent whose outputs lack complete ownership and methodology metadata.
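The first runtime control, attaching verifiable identity to every output, can be sketched with an HMAC over a canonical JSON encoding. This uses a shared symmetric key for brevity; an actual Component 2 implementation would more likely use per-agent asymmetric keys issued by the institution's PKI, and the field names are illustrative:

```python
import hashlib
import hmac
import json

def sign_output(payload: dict, agent_id: str, owner_contact: str, key: bytes) -> dict:
    """Attach agent identity, ownership metadata, and an HMAC signature
    over a canonical JSON encoding of the record."""
    record = dict(payload, agent_id=agent_id, owner_contact=owner_contact)
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    record["signature"] = hmac.new(key, canonical, hashlib.sha256).hexdigest()
    return record

def verify_output(record: dict, key: bytes) -> bool:
    """Recompute the HMAC over the record minus its signature; any tampering
    with the payload or the ownership metadata invalidates the signature."""
    record = dict(record)
    sig = record.pop("signature", "")
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    return hmac.compare_digest(sig, hmac.new(key, canonical, hashlib.sha256).hexdigest())
```

Because the signature covers the ownership fields, downstream consumers cannot silently strip or alter the accountability metadata without detection.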

Detection & Response

  • Conduct quarterly audits of derived data ownership: sample derived datasets, verify ownership metadata is complete and accurate, contact listed owners to confirm responsibility.
  • Implement workflows for handling corrections: when errors in agent-derived data are discovered, automatically notify the listed owner, track correction status, and require sign-off.
  • Monitor for ownership ambiguity: detect when multiple teams claim responsibility for the same derived dataset or when no team claims responsibility. Escalate to data governance.
  • Establish derived data correction incident response: audit all decisions based on incorrect derived data, notify affected customers or counterparties, implement corrective actions.
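The ownership-ambiguity monitor described above reduces to a classification over stored records. A minimal sketch, assuming a hypothetical record shape with `dataset` and `owner_contact` fields:

```python
from collections import defaultdict

def classify_ownership(records: list) -> dict:
    """Classify each derived dataset as 'ok' (exactly one listed owner),
    'contested' (multiple teams listed), or 'unowned' (no owner metadata).
    'contested' and 'unowned' findings are escalated to data governance."""
    owners = defaultdict(set)
    for rec in records:
        owners[rec["dataset"]]  # register the dataset even if no owner listed
        if rec.get("owner_contact"):
            owners[rec["dataset"]].add(rec["owner_contact"])
    return {
        ds: "unowned" if not names else "contested" if len(names) > 1 else "ok"
        for ds, names in owners.items()
    }
```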

Address This Risk in Your Institution

Derived Data Accountability Gap requires architectural controls that go beyond what existing frameworks provide. Our advisory engagements are purpose-built for banks, insurers, and financial institutions subject to prudential oversight.
