CRP-SPEC-012: Multi-Agent Safety Protocol
Document: CRP-SPEC-012
Title: Context Relay Protocol (CRP) — Multi-Agent Safety Protocol
Version: 3.0.0
Status: Draft
Author: Constantinos Vidiniotis, AutoCyber AI Pty Ltd
Contact: contact@crprotocol.io
Date: 2026-05-25
License: CC BY 4.0
Prerequisites: CRP-SPEC-001, CRP-SPEC-002, CRP-SPEC-004, CRP-SPEC-005, CRP-SPEC-006, CRP-SPEC-008
Abstract
This document specifies the safety protocol for multi-agent CRP deployments: scenarios where orchestrator agents delegate to specialist agents, which may further delegate to sub-agents, forming hierarchical chains of AI calls. It defines the Safety Budget depletion model, header propagation rules across agent hops, policy inheritance and tightening, circuit breaker semantics, oversight escalation triggers, and the provenance chain across multi-agent boundaries. To the author's knowledge, no existing agent framework provides an equivalent of the Safety Budget mechanism specified here, that is, a session-scoped, header-observable risk accumulation signal.
1. The Multi-Agent Safety Problem
1.1 Risk Accumulation
In a single AI call, risk is bounded — the DPE classifies it and the Safety Policy gates it. In a multi-agent chain, risk accumulates invisibly:
- Agent A makes 3 calls, each LOW risk → cumulative risk appears negligible
- Agent B makes 2 calls, each MEDIUM risk → cumulative risk is moderate
- Agent C makes 1 call that produces HIGH risk → but the chain's total exposure is already significant
Without a mechanism to track cumulative risk across the chain, each agent evaluates risk in isolation. The orchestrator has no signal that the aggregate session risk is approaching dangerous levels.
1.2 The Circuit Breaker Analogy
Distributed systems solved this with circuit breakers (Netflix Hystrix, 2012): when failure rate exceeds a threshold, the circuit opens and requests are rejected to prevent cascade failure.
CRP's Safety Budget is the AI equivalent: when cumulative risk consumption exceeds a threshold, the budget depletes, oversight is escalated, and eventually the session halts — preventing cascading risk accumulation across agent chains.
2. Safety Budget Specification
2.1 Initialisation
Every CRP session starts with a safety budget of 1.0.
The budget is stored in the session token (sb field, CRP-SPEC-007 §2.2) and emitted as CRP-Agent-Safety-Budget on every response.
2.2 Depletion Rules
After each DPE analysis, the budget is decremented based on the risk classification:
| Risk Level | Default Decrement | Configurable Range |
|---|---|---|
| LOW | 0.00 | 0.00 – 0.05 |
| MEDIUM | 0.05 | 0.02 – 0.10 |
| HIGH | 0.15 | 0.10 – 0.25 |
| CRITICAL | 0.35 | 0.25 – 0.50 |
Decrement values are configurable per gateway deployment but MUST fall within the specified ranges to maintain interoperability across gateways.
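The depletion rule can be sketched as follows. This is a minimal illustration, not a normative implementation: the names `DEFAULT_DECREMENTS`, `ALLOWED_RANGES`, `validate_decrements`, and `apply_decrement` are hypothetical, and flooring the budget at 0 is an assumption. `Decimal` is used only to keep the arithmetic exact.

```python
from decimal import Decimal

# Default decrements and permitted ranges, copied from the table above.
DEFAULT_DECREMENTS = {
    "LOW": Decimal("0.00"),
    "MEDIUM": Decimal("0.05"),
    "HIGH": Decimal("0.15"),
    "CRITICAL": Decimal("0.35"),
}
ALLOWED_RANGES = {
    "LOW": (Decimal("0.00"), Decimal("0.05")),
    "MEDIUM": (Decimal("0.02"), Decimal("0.10")),
    "HIGH": (Decimal("0.10"), Decimal("0.25")),
    "CRITICAL": (Decimal("0.25"), Decimal("0.50")),
}


def validate_decrements(decrements):
    """Raise ValueError if any configured decrement is outside its range."""
    for level, (low, high) in ALLOWED_RANGES.items():
        value = decrements[level]
        if not low <= value <= high:
            raise ValueError(f"{level} decrement {value} outside {low}..{high}")


def apply_decrement(budget, risk_level, decrements=DEFAULT_DECREMENTS):
    """Return the budget after one DPE analysis (floored at 0, an assumption)."""
    return max(Decimal("0"), budget - decrements[risk_level])


validate_decrements(DEFAULT_DECREMENTS)
budget = Decimal("1.0")
for risk in ["LOW", "MEDIUM", "MEDIUM", "HIGH", "CRITICAL"]:
    budget = apply_decrement(budget, risk)
print(budget)  # 1.0 - 0.05 - 0.05 - 0.15 - 0.35 = 0.40
```

A deployment that overrides the defaults would call `validate_decrements` on its own mapping at start-up, so that an out-of-range value is rejected before any session is served.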
2.3 Budget Thresholds and Actions
| Budget Level | Threshold | Automatic Action |
|---|---|---|
| Healthy | > 0.50 | No action; normal operation |
| Caution | 0.25 – 0.50 | Gateway emits CRP-Safety-Budget-Warning: caution header |
| Low | > 0.10 and < 0.25 | Gateway upgrades CRP-Safety-Oversight-Mode to human-review regardless of Safety Policy. Gateway emits CRP-Safety-Budget-Warning: low |
| Depleted | > 0.00 and ≤ 0.10 | Gateway halts session with HTTP 451. Safety budget depletion is a hard stop: no override except an explicit human oversight token |
| Exhausted | ≤ 0.00 | Session terminated. No further calls accepted. Audit trail closed |
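The threshold bands can be expressed as a small classifier. The sketch below is illustrative only: `budget_level` and `automatic_action` are hypothetical names, boundary values are resolved so that exactly 0.10 counts as Depleted and exactly 0.25 as Caution (an interpretation consistent with the circuit breaker in §5.1), and returning 451 for the Exhausted level is an assumption because the table does not name a status for it.

```python
def budget_level(budget: float) -> str:
    """Map a remaining budget to the level names used in the table above."""
    if budget <= 0.0:
        return "Exhausted"
    if budget <= 0.10:
        return "Depleted"
    if budget < 0.25:
        return "Low"
    if budget <= 0.50:
        return "Caution"
    return "Healthy"


def automatic_action(budget: float):
    """Return (http_status, headers); http_status is None when the call proceeds."""
    level = budget_level(budget)
    if level == "Caution":
        return None, {"CRP-Safety-Budget-Warning": "caution"}
    if level == "Low":
        return None, {
            "CRP-Safety-Oversight-Mode": "human-review",
            "CRP-Safety-Budget-Warning": "low",
        }
    if level in ("Depleted", "Exhausted"):
        # 451 is specified for Depleted; reusing it for Exhausted is an assumption.
        return 451, {}
    return None, {}


for value in (0.75, 0.40, 0.20, 0.10):
    print(value, budget_level(value), automatic_action(value))
```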
2.4 Budget Recovery
Safety budget does NOT recover within a session. Once consumed, it is permanently reduced. This is intentional — cumulative risk within a session should compound, not reset.
A new session starts with a fresh budget of 1.0.
2.5 Re-Dispatch Budget Accounting
When the DPE triggers a re-dispatch (CRP-SPEC-005 §19), the re-dispatch does NOT decrement the budget — only the final, delivered response's risk level decrements the budget. This prevents the remediation mechanism from itself depleting the budget.
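As a small illustration of this accounting rule, the sketch below charges the budget once per window, using only the risk level of the last (delivered) attempt. The function name `settle_window` and the list-of-attempts representation are assumptions made for the example; the decrements are the defaults from §2.2.

```python
DECREMENTS = {"LOW": 0.00, "MEDIUM": 0.05, "HIGH": 0.15, "CRITICAL": 0.35}


def settle_window(budget, attempt_risks, decrements=DECREMENTS):
    """Charge the budget for one window.

    attempt_risks lists the risk level of every dispatch attempt in order;
    only the final entry, the delivered response, is charged.
    """
    delivered_risk = attempt_risks[-1]
    return round(max(0.0, budget - decrements[delivered_risk]), 2)


# First attempt was HIGH and triggered a re-dispatch; the delivered retry was MEDIUM.
print(settle_window(1.0, ["HIGH", "MEDIUM"]))  # 0.95, not 0.80
```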
3. Header Propagation Across Agent Hops
3.1 Headers That Propagate Downstream (Orchestrator → Sub-Agent)
| Header | Propagation Rule | Purpose |
|---|---|---|
| CRP-Agent-Safety-Budget | MUST propagate: sub-agent inherits budget ceiling | Risk budget inheritance |
| CRP-Safety-Policy | MUST propagate: sub-agent MUST NOT relax | Policy inheritance |
| CRP-Agent-Session-Parent | MUST propagate: set to orchestrator's session ID | DAG ancestry tracking |
| CRP-Agent-Loop-Depth | MUST propagate: incremented by 1 | Recursion depth control |
| CRP-Safety-Mode | SHOULD propagate: sub-agent inherits safety mode | Consistency |
| CRP-Compliance-Data-Residency | MUST propagate: data residency cannot be relaxed | GDPR jurisdiction |
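A minimal sketch of how an orchestrator-side gateway might build the downstream header set from the rules above. The function name `downstream_headers`, the default loop depth of 0 when the header is absent, and all example header values are assumptions made for illustration.

```python
def downstream_headers(parent: dict, parent_session_id: str) -> dict:
    """Build the headers sent to a sub-agent from the orchestrator's own headers."""
    child = {
        # Budget and policy are inherited; the child may tighten but not relax.
        "CRP-Agent-Safety-Budget": parent["CRP-Agent-Safety-Budget"],
        "CRP-Safety-Policy": parent["CRP-Safety-Policy"],
        # Ancestry: point at the orchestrator's session.
        "CRP-Agent-Session-Parent": parent_session_id,
        # Depth grows by one per hop (starting from 0 is an assumption).
        "CRP-Agent-Loop-Depth": str(int(parent.get("CRP-Agent-Loop-Depth", "0")) + 1),
    }
    # Copied when present on the parent request.
    for name in ("CRP-Safety-Mode", "CRP-Compliance-Data-Residency"):
        if name in parent:
            child[name] = parent[name]
    return child


parent = {
    "CRP-Agent-Safety-Budget": "0.85",
    "CRP-Safety-Policy": "halt-on CRITICAL; require-grounding 0.75",
    "CRP-Agent-Loop-Depth": "1",
    "CRP-Compliance-Data-Residency": "EU",  # illustrative value
    "CRP-Session-Token": "<token>",         # per-session, never forwarded
}
print(downstream_headers(parent, "sess-orch-1"))
```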
3.2 Headers That Propagate Upstream (Sub-Agent → Orchestrator)
| Header | Propagation Rule | Purpose |
|---|---|---|
| CRP-Agent-Safety-Budget | MUST propagate: orchestrator reads remaining budget | Budget visibility |
| CRP-Safety-Hallucination-Risk | MUST propagate: orchestrator sees per-agent risk | Risk aggregation |
| CRP-Provenance-HMAC | MUST propagate: chain extends across agent boundary | Provenance continuity |
| CRP-Provenance-Chain-Integrity | MUST propagate: orchestrator needs chain status | Integrity signal |
| CRP-Compliance-Audit-Trail-URI | MUST propagate: evidence chain spans all agents | Compliance continuity |
| CRP-Quality-Score | SHOULD propagate: orchestrator assesses sub-agent quality | Quality visibility |
3.3 Headers That Do NOT Propagate
| Header | Reason |
|---|---|
| CRP-Session-Token | Session tokens are per-session; sub-agents have their own sessions |
| CRP-Context-ETag | Each agent has its own CKF state |
| CRP-Context-Quality-Tier | Quality tier is per-envelope, not per-chain |
| CRP-Set-Session | Sub-agent issues its own session tokens |
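The upstream rules in §3.2 and the exclusions in §3.3 amount to an allow-list filter on the sub-agent's response headers. The sketch below is one hypothetical way to apply it: the function name, the choice to raise on a missing MUST header, exact-case header matching, and the placeholder values are all assumptions.

```python
UPSTREAM_MUST = (
    "CRP-Agent-Safety-Budget",
    "CRP-Safety-Hallucination-Risk",
    "CRP-Provenance-HMAC",
    "CRP-Provenance-Chain-Integrity",
    "CRP-Compliance-Audit-Trail-URI",
)
UPSTREAM_SHOULD = ("CRP-Quality-Score",)


def upstream_headers(sub_agent_headers: dict) -> dict:
    """Keep only the headers that propagate from sub-agent to orchestrator."""
    missing = [name for name in UPSTREAM_MUST if name not in sub_agent_headers]
    if missing:
        # Rejecting the response is an assumption; the spec only says MUST propagate.
        raise ValueError("missing required upstream headers: " + ", ".join(missing))
    allowed = set(UPSTREAM_MUST) | set(UPSTREAM_SHOULD)
    return {name: value for name, value in sub_agent_headers.items() if name in allowed}


response = {
    "CRP-Agent-Safety-Budget": "0.80",
    "CRP-Safety-Hallucination-Risk": "LOW",
    "CRP-Provenance-HMAC": "<hmac>",
    "CRP-Provenance-Chain-Integrity": "<status>",
    "CRP-Compliance-Audit-Trail-URI": "<uri>",
    "CRP-Session-Token": "<token>",   # listed in 3.3, dropped
    "CRP-Context-ETag": "<etag>",     # listed in 3.3, dropped
}
print(sorted(upstream_headers(response)))
```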
4. Policy Inheritance and Tightening
4.1 The Tightening Rule
A sub-agent's Safety Policy MUST be equal to or more restrictive than its parent's on every directive:
Parent: halt-on CRITICAL; require-grounding 0.75; warn-on HIGH
Child: halt-on HIGH; require-grounding 0.80; warn-on MEDIUM ← VALID (tighter)
Child: warn-on CRITICAL; require-grounding 0.60 ← INVALID (relaxed)
4.2 Directive-Level Comparison
| Directive | More Restrictive Means |
|---|---|
| halt-on | Lower risk level (MEDIUM > HIGH > CRITICAL) |
| warn-on | Lower risk level |
| require-grounding | Higher threshold |
| require-entailment | Higher threshold |
| require-quality | Fewer accepted tiers |
| require-flow | Higher threshold |
| require-completeness | Higher threshold |
| max-repetition | Lower level (NONE > MINOR > SIGNIFICANT) |
| block-* | Present is more restrictive than absent |
| oversight | halt > human-review > auto > log-only |
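The directive comparison can be sketched as a function that lists every directive a child policy relaxes. This is an illustration under several assumptions: the policy string is parsed as `name value` pairs separated by semicolons (matching the examples in §4.1), a directive present in the parent but absent from the child counts as a relaxation, the `oversight` directive is assumed to take its mode as a single value, and `require-quality` is not modelled because its tier syntax is not given here. All function and constant names are hypothetical.

```python
RISK_ORDER = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]
REPETITION_ORDER = ["NONE", "MINOR", "SIGNIFICANT"]
OVERSIGHT_ORDER = ["log-only", "auto", "human-review", "halt"]  # least to most restrictive
THRESHOLD_DIRECTIVES = {
    "require-grounding", "require-entailment", "require-flow", "require-completeness",
}


def parse_policy(text):
    """Parse 'halt-on CRITICAL; block-fabrication' into {'halt-on': 'CRITICAL', ...}."""
    policy = {}
    for part in text.split(";"):
        tokens = part.split()
        if tokens:
            policy[tokens[0]] = tokens[1] if len(tokens) > 1 else None
    return policy


def is_relaxed(name, parent_value, child_value):
    """True if the child's value for one directive is weaker than the parent's."""
    if name in ("halt-on", "warn-on"):
        return RISK_ORDER.index(child_value) > RISK_ORDER.index(parent_value)
    if name in THRESHOLD_DIRECTIVES:
        return float(child_value) < float(parent_value)
    if name == "max-repetition":
        return REPETITION_ORDER.index(child_value) > REPETITION_ORDER.index(parent_value)
    if name == "oversight":
        return OVERSIGHT_ORDER.index(child_value) < OVERSIGHT_ORDER.index(parent_value)
    return False  # block-* only needs to be present, which the caller checks


def relaxed_directives(parent_text, child_text):
    """Return the parent directives that the child omits or weakens."""
    parent, child = parse_policy(parent_text), parse_policy(child_text)
    return [
        name for name, parent_value in parent.items()
        if name not in child or is_relaxed(name, parent_value, child[name])
    ]


parent = "halt-on CRITICAL; require-grounding 0.75; warn-on HIGH"
print(relaxed_directives(parent, "halt-on HIGH; require-grounding 0.80; warn-on MEDIUM"))
print(relaxed_directives(parent, "warn-on CRITICAL; require-grounding 0.60"))
```

An empty list corresponds to the VALID child in §4.1; a non-empty list names the directives that would cause the request to be rejected.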
4.3 Enforcement
When a sub-agent request arrives at a CRP gateway:
- Gateway extracts CRP-Agent-Session-Parent
- Gateway retrieves the parent session's Safety Policy (from the parent session token or the session store)
- Gateway compares each directive in the child's CRP-Safety-Policy against the parent's
- Any relaxation → HTTP 403
4.4 Policy Elevation
When a sub-agent does not specify CRP-Safety-Policy, it inherits the parent's policy verbatim. This is the default and recommended behaviour — explicit policy is only needed when the sub-agent wants to TIGHTEN.
5. Circuit Breaker Pattern
5.1 Definition
The CRP circuit breaker is a session-scoped safety mechanism that transitions through three states based on the safety budget:
CLOSED ──(budget > 0.50)──→ Normal operation
│
└── Risk events decrement budget
│
HALF-OPEN ──(0.10 < budget ≤ 0.50)──→ Cautious operation
│ Oversight mode: human-review
│ Strategy: forced to reflexive
│ New agent delegations: blocked unless explicitly approved
│
└── Further risk events decrement budget
│
OPEN ──(budget ≤ 0.10)──→ Session halted
HTTP 451 returned
No further calls accepted
Requires new session with fresh budget
5.2 State Transitions
| From | To | Trigger | Headers Emitted |
|---|---|---|---|
| CLOSED | HALF-OPEN | Budget drops to ≤ 0.50 | CRP-Safety-Budget-Warning: caution |
| HALF-OPEN | HALF-OPEN | Budget remains above 0.10 and at or below 0.50 | CRP-Safety-Oversight-Mode: human-review (forced) |
| HALF-OPEN | OPEN | Budget drops to ≤ 0.10 | HTTP 451, CRP-Safety-Retry-After: new-session-required |
| OPEN | (session ends) | — | SESSION_TERMINATED audit event |
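The transitions can be sketched as a pure function of the budget before and after a risk event. The names `breaker_state` and `on_budget_change` are hypothetical, the state boundaries follow §5.1 (CLOSED above 0.50, OPEN at or below 0.10), and returning a `(state, http_status, headers)` tuple is a design choice made for the example.

```python
def breaker_state(budget: float) -> str:
    """Derive the circuit breaker state from the remaining budget."""
    if budget > 0.50:
        return "CLOSED"
    if budget > 0.10:
        return "HALF-OPEN"
    return "OPEN"


def on_budget_change(old_budget: float, new_budget: float):
    """Return (new_state, http_status, headers) after a budget decrement."""
    old_state, new_state = breaker_state(old_budget), breaker_state(new_budget)
    status, headers = None, {}
    if new_state == "HALF-OPEN":
        # Oversight is forced for as long as the session stays HALF-OPEN.
        headers["CRP-Safety-Oversight-Mode"] = "human-review"
        if old_state == "CLOSED":
            headers["CRP-Safety-Budget-Warning"] = "caution"
    elif new_state == "OPEN":
        status = 451
        headers["CRP-Safety-Retry-After"] = "new-session-required"
    return new_state, status, headers


print(on_budget_change(0.55, 0.40))  # CLOSED -> HALF-OPEN
print(on_budget_change(0.40, 0.25))  # stays HALF-OPEN
print(on_budget_change(0.25, 0.10))  # HALF-OPEN -> OPEN
```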
5.3 Circuit Breaker in Multi-Agent Context
When an orchestrator queries a sub-agent and receives a response with CRP-Agent-Safety-Budget: 0.08:
1. The orchestrator's gateway reads this value
2. The orchestrator's own budget is updated to min(orchestrator_budget, sub_agent_returned_budget)
3. If the orchestrator's budget transitions to HALF-OPEN or OPEN, the corresponding actions trigger
This means a single sub-agent's budget depletion can cascade upward to halt the entire agent chain. This is the correct behaviour — it prevents orchestrators from ignoring downstream risk.
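Step 2 is a single `min` over the two budgets. A minimal sketch, assuming the header value is a plain decimal string and using the hypothetical name `absorb_sub_agent_budget`:

```python
def absorb_sub_agent_budget(orchestrator_budget: float, response_headers: dict) -> float:
    """Lower the orchestrator's budget to the sub-agent's returned budget if that is smaller."""
    returned = float(response_headers["CRP-Agent-Safety-Budget"])
    return min(orchestrator_budget, returned)


# The example from the text: the sub-agent reports 0.08.
print(absorb_sub_agent_budget(0.72, {"CRP-Agent-Safety-Budget": "0.08"}))  # 0.08
# A healthier sub-agent never raises the orchestrator's budget.
print(absorb_sub_agent_budget(0.30, {"CRP-Agent-Safety-Budget": "0.90"}))  # 0.3
```

With an orchestrator budget of 0.72 (an illustrative starting value), the result 0.08 is at or below 0.10, so the orchestrator's circuit moves straight to OPEN.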
6. Oversight Escalation in Hierarchical Agents
6.1 Escalation Path
Sub-agent CRITICAL risk detected
│
▼
Sub-agent gateway halts (HTTP 451)
│
▼
Orchestrator receives 451 from sub-agent
│
▼
Orchestrator logs CRITICAL event in its own audit trail
Orchestrator's safety budget decremented by 0.35
│
▼
If orchestrator budget ≤ 0.50:
Orchestrator forced to HALF-OPEN (human-review mode)
│
▼
Orchestrator surfaces to client:
CRP-Safety-Hallucination-Risk: CRITICAL (from sub-agent)
CRP-Agent-Safety-Budget: 0.28 (reduced by the 0.35 decrement)
CRP-Safety-Oversight-Mode: human-review
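The arithmetic behind this path can be made explicit. In the sketch below the orchestrator is assumed to start at 0.63, which is the value that yields the 0.28 shown above after a 0.35 decrement; the function name, the audit event shape, and the use of the §5.1 boundary (at or below 0.50) for forcing human review are all assumptions.

```python
CRITICAL_DECREMENT = 0.35  # default CRITICAL decrement


def on_sub_agent_halt(orchestrator_budget: float, audit_trail: list):
    """Handle an HTTP 451 from a sub-agent; return (new_budget, headers for the client)."""
    # Hypothetical event shape; the real audit schema is defined elsewhere.
    audit_trail.append({"event": "CRITICAL", "source": "sub-agent", "http_status": 451})
    new_budget = round(max(0.0, orchestrator_budget - CRITICAL_DECREMENT), 2)
    headers = {"CRP-Agent-Safety-Budget": f"{new_budget:.2f}"}
    if new_budget <= 0.50:  # HALF-OPEN range per the circuit breaker definition
        headers["CRP-Safety-Oversight-Mode"] = "human-review"
    return new_budget, headers


trail = []
print(on_sub_agent_halt(0.63, trail))  # budget 0.28, human-review forced
```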
6.2 Oversight Token Flow
When a human reviewer approves a response held for oversight:
Human Reviewer → CRP Comply/Visualise UI → Approves response
Oversight Token generated:
CRP-Oversight-Token: approved:sha256:<reviewer_sig>
reviewer_id: reviewer@company.com
approval_scope: session_id + window_id
approval_timestamp: ISO 8601
Client retries with:
CRP-Oversight-Token: approved:sha256:<reviewer_sig>
Gateway validates token → releases halted response
Safety budget is NOT replenished — the risk event is logged but the human has accepted it
7. Agent Identity and Trust
7.1 Agent Registration
In multi-agent deployments, each agent type SHOULD be registered in the CRP gateway configuration:
agents:
  orchestrator:
    api_key: crp_gw_prod_orch_...
    max_loop_depth: 3
    max_delegations: 5
    allowed_strategies: [push, reflexive, fan-out, fan-in]
    safety_policy: "halt-on CRITICAL; require-grounding 0.80"
  specialist_legal:
    api_key: crp_gw_prod_legal_...
    max_loop_depth: 1
    max_delegations: 0 # cannot delegate further
    allowed_strategies: [push, reflexive]
    safety_policy: "halt-on HIGH; require-grounding 0.90; block-fabrication"
7.2 Delegation Control
Agents SHOULD have configured delegation limits:
- max_delegations: Maximum number of sub-agents this agent can create
- max_loop_depth: Maximum nesting depth below this agent
- allowed_strategies: Strategies this agent is permitted to use
Exceeding these limits → HTTP 403.
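A sketch of how a gateway might check these limits before accepting a delegation. The limit values mirror the registration example in §7.1; the `AgentLimits` type, the function names, and the way the current delegation count and nesting depth are passed in are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentLimits:
    max_loop_depth: int
    max_delegations: int
    allowed_strategies: frozenset


ORCHESTRATOR = AgentLimits(3, 5, frozenset({"push", "reflexive", "fan-out", "fan-in"}))
SPECIALIST_LEGAL = AgentLimits(1, 0, frozenset({"push", "reflexive"}))


def delegation_allowed(limits, delegations_so_far, depth_below, strategy) -> bool:
    """Check one new delegation against the agent's configured limits.

    delegations_so_far: sub-agents this agent has already created.
    depth_below: nesting depth below this agent once the delegation is made.
    """
    return (
        delegations_so_far < limits.max_delegations
        and depth_below <= limits.max_loop_depth
        and strategy in limits.allowed_strategies
    )


def gateway_status(limits, delegations_so_far, depth_below, strategy) -> Optional[int]:
    """Return 403 when a limit is exceeded, otherwise None (request proceeds)."""
    allowed = delegation_allowed(limits, delegations_so_far, depth_below, strategy)
    return None if allowed else 403


print(gateway_status(ORCHESTRATOR, 4, 1, "fan-out"))   # None: fifth delegation is allowed
print(gateway_status(ORCHESTRATOR, 5, 1, "fan-out"))   # 403: sixth delegation
print(gateway_status(SPECIALIST_LEGAL, 0, 1, "push"))  # 403: cannot delegate at all
```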
8. Provenance Chain Across Agent Boundaries
8.1 Cross-Agent HMAC Linking
When a sub-agent session completes and its result is consumed by the orchestrator:
- The sub-agent's final window HMAC is recorded as a SUB_AGENT_RESULT event in the orchestrator's audit trail
- The orchestrator's next window HMAC incorporates the sub-agent's chain tip:
orchestrator_window_hmac = HMAC-SHA256(
    ... || sub_agent_chain_tip || ...,
    orchestrator_session_hmac_key
)
This creates a cryptographic link between the two sessions' provenance chains without merging them into a single chain.
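The linking step can be illustrated with Python's standard `hmac` module. The formula above elides the other inputs to the HMAC, so the sketch uses a single opaque `other_inputs` byte string as a stand-in for them; that parameter, the plain concatenation, and the function name are assumptions and do not describe the actual message layout.

```python
import hashlib
import hmac


def orchestrator_window_hmac(session_key: bytes, other_inputs: bytes,
                             sub_agent_chain_tip: bytes) -> str:
    """Compute a window HMAC that commits to the sub-agent's chain tip.

    other_inputs stands in for the fields the spec elides with '...'.
    """
    message = other_inputs + sub_agent_chain_tip
    return hmac.new(session_key, message, hashlib.sha256).hexdigest()


key = b"orchestrator-session-key"  # illustrative key
linked = orchestrator_window_hmac(key, b"window-3", b"tip-from-legal-agent")
altered = orchestrator_window_hmac(key, b"window-3", b"tip-from-other-agent")
print(linked != altered)  # True: a different chain tip yields a different window HMAC
```

Because the tip is an input to the orchestrator's HMAC, substituting a different sub-agent result after the fact changes the orchestrator's window HMAC and is detectable during verification.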
8.2 Auditor Traversal
An auditor verifying a multi-agent session:
1. Starts at the orchestrator's root
2. Encounters SUB_AGENT_RESULT events containing sub_agent_session_id and sub_agent_chain_tip
3. Requests the sub-agent's audit trail from CRP Comply
4. Verifies the sub-agent's chain independently
5. Confirms the sub-agent's chain tip matches the value recorded in the orchestrator's event
If any link fails → the multi-agent provenance is broken.
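The traversal can be sketched as a recursive walk. Everything about the data model here is hypothetical: audit trails are modelled as a dictionary from session ID to a list of event dictionaries, and `verify_chain` is a caller-supplied function that returns a chain's tip when the chain verifies and `None` when it does not. The toy verifier at the bottom exists only to make the example runnable.

```python
def verified_tip(session_id, trails, verify_chain):
    """Return the verified chain tip for a session, or None if any link fails."""
    events = trails.get(session_id)
    if events is None:
        return None  # audit trail not available
    tip = verify_chain(events)  # verify this session's own chain
    if tip is None:
        return None
    for event in events:
        if event.get("type") != "SUB_AGENT_RESULT":
            continue
        # Verify the sub-agent's chain independently, then compare chain tips.
        sub_tip = verified_tip(event["sub_agent_session_id"], trails, verify_chain)
        if sub_tip is None or sub_tip != event["sub_agent_chain_tip"]:
            return None
    return tip


def toy_verify(events):
    """Stand-in verifier: treats the last event's 'hmac' as the chain tip."""
    return events[-1]["hmac"] if events else None


trails = {
    "orch": [
        {"type": "WINDOW", "hmac": "a1"},
        {"type": "SUB_AGENT_RESULT", "sub_agent_session_id": "legal",
         "sub_agent_chain_tip": "b2", "hmac": "a2"},
    ],
    "legal": [{"type": "WINDOW", "hmac": "b1"}, {"type": "WINDOW", "hmac": "b2"}],
}
print(verified_tip("orch", trails, toy_verify))  # a2
```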
9. Multi-Agent Quality Assurance
9.1 Cross-Agent Coherence
When an orchestrator synthesises results from multiple sub-agents (fan-in), DPE Stage 6 (Cross-Window Coherence, CRP-SPEC-005 §8) runs across the sub-agent responses:
- Sub-Agent A says "revenue grew 15%"
- Sub-Agent B says "revenue declined 3%"
- Cross-agent contradiction detected → flagged in the synthesis window's DPE report
9.2 Cross-Agent Completeness
DPE Stage 8 (Completeness, CRP-SPEC-005 §10) verifies that the aggregate response from all sub-agents covers all sub-queries the orchestrator decomposed:
- Orchestrator decomposed into: [legal analysis, financial analysis, technical analysis]
- Sub-Agent A returned: legal analysis (complete)
- Sub-Agent B returned: financial analysis (partial)
- Sub-Agent C returned: technical analysis (complete)
- Completeness score: 83% → CRP-Quality-Completeness: 0.83; uncovered=financial-detail
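One way to arrive at the 0.83 in this example is to weight every sub-query equally and count a partial answer as half covered. That weighting is an assumption made here for illustration; the actual scoring is defined by DPE Stage 8 in CRP-SPEC-005 and may differ.

```python
# Assumed coverage weights; not taken from the specification.
COVERAGE = {"complete": 1.0, "partial": 0.5, "missing": 0.0}


def completeness_score(results: dict) -> float:
    """Average coverage across the sub-queries the orchestrator decomposed."""
    total = sum(COVERAGE[status] for status in results.values())
    return round(total / len(results), 2)


results = {
    "legal analysis": "complete",
    "financial analysis": "partial",
    "technical analysis": "complete",
}
print(completeness_score(results))  # (1.0 + 0.5 + 1.0) / 3 = 0.83
```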
9.3 Cross-Agent Flow
Flow analysis (DPE Stage 9) is applied when the orchestrator stitches sub-agent results into a single response for the end user. The flow score measures whether the stitched output reads as a coherent document or as separate reports pasted together.
10. Security Considerations
10.1 Budget Inflation Attack
A malicious sub-agent could attempt to report a higher safety budget than its actual budget (i.e., lying about its budget to avoid triggering the circuit breaker). Mitigation:
- The gateway computes the budget decrement server-side; the sub-agent cannot set its own budget
- The budget in the session token is HMAC-signed; tampering breaks the signature
- Budget is decremented by the gateway, not by the agent application code
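The effect of HMAC-signing the budget can be demonstrated with a toy token. The real session token format is defined in CRP-SPEC-007; the JSON body with a trailing hex signature used below, the `sid` field, and the function names are assumptions chosen only to show that an edited `sb` value fails verification.

```python
import hashlib
import hmac
import json


def sign_token(payload: dict, key: bytes) -> str:
    """Serialise the payload and append an HMAC-SHA256 signature (toy format)."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    sig = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return body + "." + sig


def read_budget(token: str, key: bytes) -> float:
    """Return the 'sb' field, but only if the signature matches."""
    body, _, sig = token.rpartition(".")
    expected = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("session token signature mismatch")
    return json.loads(body)["sb"]


gateway_key = b"gateway-secret"  # held by the gateway, never by the agent
token = sign_token({"sid": "s-1", "sb": 0.42}, gateway_key)
print(read_budget(token, gateway_key))  # 0.42

forged = token.replace('"sb":0.42', '"sb":0.9')  # agent inflates its budget
try:
    read_budget(forged, gateway_key)
except ValueError as err:
    print("rejected:", err)
```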
10.2 Policy Bypass via New Session
A sub-agent could attempt to start a new session (fresh budget, fresh policy) instead of continuing under the parent's policy. Mitigation:
- The orchestrator sets CRP-Agent-Session-Parent — the sub-agent's gateway checks if a parent session exists
- If a parent session is referenced, the gateway enforces policy inheritance
- If the sub-agent starts a completely independent session (no parent reference), it is not part of the orchestrator's provenance chain — the orchestrator cannot use its results without provenance linkage
10.3 Infinite Delegation
Agent A delegates to Agent B, which delegates to Agent C, and so on → unbounded chain. Mitigation:
- CRP-Agent-Loop-Depth is incremented on every hop
- Gateway enforces max_loop_depth (default: 5) — exceeding returns HTTP 403
- max_delegations per agent type limits fan-out width
- max_dag_nodes per session (default: 50) limits total complexity
11. References
- CRP-SPEC-001 — Core Protocol Specification
- CRP-SPEC-004 — Window Continuation & DAG
- CRP-SPEC-005 — Decision Provenance Engine
- CRP-SPEC-006 — Safety Policy Directive Language
- CRP-SPEC-008 — Dispatch Strategy Specification
- CRP-SPEC-015 — Security & Privacy
Copyright © 2025–2026 AutoCyber AI Pty Ltd. Licensed under CC BY 4.0. CRP™ is a trademark of AutoCyber AI Pty Ltd.