The CRP Safety Case¶
The black box is real. The governance of its observable surface is also real. CRP is that governance layer.
CRP™ makes no claims about what happens inside the model. It governs what goes in and what is done with what comes out. This page explains, in plain English, what that safety coverage actually is — and what it is not.
Two Problems, One Protocol¶
Modern AI systems have two operational problems that protocol engineering can solve today:
- Context management — unbounded windows, automatic continuation, coherent flow across many calls. Covered by SPEC-003, SPEC-004, SPEC-005, SPEC-009.
- Safety and governance — verifiable safety signals on every call, hallucination risk scoring, fabrication detection, Safety Policy enforcement, automated regulatory evidence. Covered by SPEC-005, SPEC-006, SPEC-010, SPEC-011.
These are different problems, and CRP addresses both — at the wire level, without application code changes.
What CRP Actually Implements¶
Governance (verifiable, fully implemented)¶
| Capability | Spec |
|---|---|
| HMAC-SHA256 audit chain (per-call, per-window, per-session) | SPEC-011 |
| EU AI Act risk classification (PROHIBITED → MINIMAL) on every call | SPEC-010 |
| GDPR PII detection and redaction | SPEC-010 |
| NIST AI RMF function/category attribution | SPEC-010 |
| ISO/IEC 42001 control evidence emission | SPEC-010 |
| Cryptographically verifiable provenance DAG | SPEC-004, SPEC-011 |
Every call produces compliance evidence that a regulator can verify. This is a measurable output, not a marketing claim.
Runtime safety (verifiable, fully implemented)¶
The Decision Provenance Engine (DPE) — SPEC-005 — runs on every response:
- Claim detection
- Attribution analysis
- Fabrication detection
- Distortion detection
- Entailment scoring
- Cross-window contradiction detection
- Repetition detection
- Completeness verification
- Flow analysis
- Hallucination risk scoring
- Quality tiering (S/A/B/C/D)
- Safety Policy evaluation
- Provenance binding
The Safety Policy directive language — SPEC-006 —
allows enforcement at the transport layer. halt-on CRITICAL genuinely halts
the call with HTTP 451. redact-on HIGH PII strips PII before delivery.
These are testable, measurable protocol behaviours.
What CRP Does NOT Address¶
State this clearly. Do not let anyone conflate it.
- Model alignment — whether the LLM has "good values." CRP cannot inspect model weights.
- Training data bias — whether the training corpus introduced systematic errors.
- Emergent capability risks — what the model might do in novel situations.
- AI consciousness or sentience questions.
These are research problems being worked on by alignment teams at Anthropic, Google DeepMind, and AI safety institutes. CRP is an infrastructure protocol. A firewall does not solve social engineering. TLS does not prevent insider threats. CRP does not solve alignment — and claiming it did would undermine the legitimate things it does solve.
The Positioning¶
CRP covers the safety layer that is achievable right now through protocol engineering: observable outputs, verifiable evidence, enforceable policies.
It is the necessary complement to alignment research, not a replacement for it.
Deeper Reading¶
- The DPE pipeline — what each of the 13 stages checks for, in plain English.
- Safety Policy language — how
CRP-Safety-Policyworks (with the Content-Security-Policy analogy). - Coverage and limits — exactly what is and is not in scope.
- The black-box question — why governing an opaque model is not only possible but correct.