Operational Resilience Risks

8 Risks

This category covers risks from agents interacting with production infrastructure: transaction systems, workflow engines, APIs, data stores, and compute resources. It documents how autonomous systems create failure modes that existing operational resilience frameworks were not designed to detect.

Positioning Note

DORA, MAS TRM Guidelines, FFIEC IT Handbook, and institutional business continuity programs address operational resilience for technology systems. These risks document the specific ways autonomous agents create failure modes in production infrastructure that existing resilience controls (circuit breakers, rate limits, transaction monitoring, workflow governance) were not designed to detect or contain. Each risk names the infrastructure the agent interacts with and the resilience control it bypasses.

Category Overview

What makes these risks specifically agentic is the agent's ability to interact with production systems through natural-language-mediated tool calls that do not follow the structured protocols those systems expect. An agent can initiate a transaction and lose context mid-process, advance a workflow past required approvals, or create recursive loops that consume infrastructure resources. These are not the failure modes that circuit breakers and rate limits were calibrated to detect.

Who should care

Infrastructure teams, operational risk managers, business continuity planners, and regulatory compliance teams responsible for DORA, MAS TRM, or FFIEC requirements.

Aggregate DAMAGE Profile

Average DAMAGE Score: 3.7
Highest: R-OR-03 Approval Chain Bypass (4.3)
Critical-Tier Risks: 3
Risks by tier: Critical 3, High 4, Moderate 1, Low 0

All Operational Resilience Risks

R-OR-01 (DAMAGE 4.1)
Transaction Integrity Failure

An agent can initiate a transaction, lose context mid-process, and fail to complete it, roll it back, or confirm it. The resulting state is invisible to transaction monitoring.
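One way to make this failure mode visible is a periodic sweep for transactions that were opened but never reached a terminal outcome. The following is a minimal sketch, not a reference implementation: the `TxRecord` shape, the `TxState` values, and the five-minute idle threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class TxState(Enum):
    INITIATED = "initiated"      # opened, no terminal outcome yet
    COMMITTED = "committed"
    ROLLED_BACK = "rolled_back"


@dataclass
class TxRecord:
    # Hypothetical record shape; a real ledger will differ.
    tx_id: str
    agent_id: str
    state: TxState
    last_update: datetime


def find_stranded(records, now, max_idle=timedelta(minutes=5)):
    """Return transactions still INITIATED after more than max_idle of inactivity."""
    return [
        r for r in records
        if r.state is TxState.INITIATED and now - r.last_update > max_idle
    ]
```

The point of the sketch is that the check keys on the absence of a terminal state rather than on an error event, since an agent that loses context emits no error at all.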

R-OR-02 (DAMAGE 3.7)
Workflow State Corruption

An agent can advance a case past a required approval step, create parallel branches, or leave workflow instances in states that have no defined transition.
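A common guard against this class of corruption is to make the workflow engine, not the caller, decide which transitions exist. The sketch below assumes a hypothetical five-state case workflow; the state names and the `ALLOWED` table are illustrative only.

```python
# Hypothetical transition table: each state maps to the states it may move to.
ALLOWED = {
    "submitted": {"under_review"},
    "under_review": {"approved", "rejected"},
    "approved": {"closed"},
    "rejected": {"closed"},
    "closed": set(),
}


class InvalidTransition(Exception):
    """Raised when a caller requests a transition the workflow does not define."""


def advance(current, target):
    """Return the new state, or raise if current -> target is not defined."""
    if target not in ALLOWED.get(current, set()):
        raise InvalidTransition(f"{current} -> {target} is not a defined transition")
    return target
```

Under this table, an agent that tries to move a case from `submitted` directly to `approved` is rejected, because the review step cannot be skipped by construction.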

R-OR-03 (DAMAGE 4.3)
Approval Chain Bypass

Agents operating with delegated authority can submit requests and satisfy approval requirements using the same authority. The approval chain is functionally collapsed.
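The collapse described here can be illustrated by comparing the authority behind each action rather than the identity that performed it. This is a sketch under stated assumptions: the `Actor` type and its `delegated_by` field are hypothetical, and a real identity system would need to resolve longer delegation chains.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Actor:
    actor_id: str
    # Hypothetical field: the principal whose authority an agent is exercising.
    delegated_by: Optional[str] = None


def effective_principal(actor):
    """Resolve an actor to the principal whose authority it is using."""
    return actor.delegated_by or actor.actor_id


def approval_is_independent(submitter, approver):
    """True only if the request and the approval rest on different authority."""
    return effective_principal(submitter) != effective_principal(approver)
```

Two distinct agents acting for the same principal would pass a naive "submitter is not approver" check on `actor_id`, yet fail this one, which is the situation the risk describes.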

R-OR-04 (DAMAGE 3.8)
Resource Exhaustion and Runaway Loops

Agents can create recursive loops. Each invocation appears as a legitimate, independent request. The loop consumes compute until infrastructure fails.
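Because each invocation looks independent, a per-request rate limit has nothing to key on. One possible mitigation is to carry a depth counter along the call chain so the chain as a whole can be bounded. The following is a minimal sketch; `invoke`, `call_next`, and the depth limit of 5 are assumptions for illustration, not part of any particular agent framework.

```python
class HopBudgetExceeded(Exception):
    """Raised when a chain of nested tool invocations exceeds its depth budget."""


def invoke(tool, payload, depth=0, max_depth=5):
    """Run a tool, passing it a callback that tracks how deep the chain is."""
    if depth >= max_depth:
        raise HopBudgetExceeded(f"call chain reached depth {depth}")

    def call_next(next_tool, next_payload):
        # Nested invocations inherit the chain's depth instead of starting at 0.
        return invoke(next_tool, next_payload, depth + 1, max_depth)

    return tool(payload, call_next)


def looping_tool(payload, call_next):
    """A hypothetical tool that always re-invokes itself."""
    return call_next(looping_tool, payload)


def echo_tool(payload, call_next):
    """A hypothetical tool that returns without further calls."""
    return payload
```

With this wrapper, `invoke(looping_tool, {})` stops with `HopBudgetExceeded` instead of consuming compute indefinitely, while `invoke(echo_tool, "x")` is unaffected.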

R-OR-05 (DAMAGE 3.5)
API Dependency Failure and Silent Degradation

When an external API degrades subtly, the agent continues operating on degraded inputs. Circuit breakers trip on errors, not on semantic degradation.
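The gap between error-based and semantic health checks can be shown with a response that succeeds at the transport level but carries unusable content. The sketch below assumes a hypothetical pricing API whose responses are dictionaries with `price`, `currency`, and `as_of` fields; the field names and the 30-second freshness threshold are illustrative.

```python
from datetime import datetime, timedelta

# Hypothetical contract for a pricing response.
REQUIRED_FIELDS = {"price", "currency", "as_of"}


def semantic_problems(response, now, max_age=timedelta(seconds=30)):
    """List content-level problems in a response that returned without error."""
    problems = []
    missing = REQUIRED_FIELDS - response.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    as_of = response.get("as_of")
    if as_of is not None and now - as_of > max_age:
        problems.append("stale data")
    return problems
```

A conventional circuit breaker would count none of these cases as failures, since no error was raised. Feeding a non-empty result from a check like this into the breaker's failure count is one way to let it react to degraded inputs.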

R-OR-06 (DAMAGE 4.2)
Cascading Infrastructure Failure

Agent delegation paths create cascading failures through systems that have no documented dependency relationship. The blast radius exceeds what disaster recovery (DR) planning accounts for.

R-OR-07 (DAMAGE 3.4)
Tool Misuse and Unintended Side Effects

Agents invoke tools through natural language interfaces that may expose operations the agent was never intended to use.

R-OR-08 (DAMAGE 2.6)
Operational Waste Accumulation

Agents generate operational waste: unnecessary data movement, excess permissions, unused capabilities, repeated failed actions without root cause analysis.

Address Operational Resilience Risks

Agent-specific failure modes require controls that DORA, MAS TRM, and FFIEC were not designed to address. Our advisory engagements help institutions build resilience frameworks for autonomous systems in production infrastructure.
