
The CRP Safety Case

The black box is real. The governance of its observable surface is also real. CRP is that governance layer.

CRP™ makes no claims about what happens inside the model. It governs what goes in and what is done with what comes out. This page explains, in plain English, what that safety coverage actually is — and what it is not.


Two Problems, One Protocol

Modern AI systems have two operational problems that protocol engineering can solve today:

  1. Context management — unbounded windows, automatic continuation, coherent flow across many calls. Covered by SPEC-003, SPEC-004, SPEC-005, SPEC-009.
  2. Safety and governance — verifiable safety signals on every call, hallucination risk scoring, fabrication detection, Safety Policy enforcement, automated regulatory evidence. Covered by SPEC-005, SPEC-006, SPEC-010, SPEC-011.

These are different problems, and CRP addresses both — at the wire level, without application code changes.


What CRP Actually Implements

Governance (verifiable, fully implemented)

Capability                                                          Spec
HMAC-SHA256 audit chain (per-call, per-window, per-session)         SPEC-011
EU AI Act risk classification (PROHIBITED → MINIMAL) on every call  SPEC-010
GDPR PII detection and redaction                                    SPEC-010
NIST AI RMF function/category attribution                           SPEC-010
ISO/IEC 42001 control evidence emission                             SPEC-010
Cryptographically verifiable provenance DAG                         SPEC-004, SPEC-011

Every call produces compliance evidence that a regulator can verify. This is a measurable output, not a marketing claim.
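To make the audit-chain idea concrete, here is a minimal Python sketch of an HMAC-SHA256 chain in which each record's MAC covers the previous record's MAC, so editing any earlier record invalidates the chain from that point on. It is an illustration under stated assumptions, not the SPEC-011 wire format: the record fields, the all-zero genesis value, the canonical-JSON encoding, and the names append_record and verify_chain are all hypothetical.

```python
import hashlib
import hmac
import json

# Assumed placeholder standing in for "no previous record" (not from SPEC-011).
GENESIS = "0" * 64


def _mac(key: bytes, prev_mac: str, payload: dict) -> str:
    # Canonical JSON so the same payload always serialises to the same bytes.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, prev_mac.encode() + body, hashlib.sha256).hexdigest()


def append_record(chain: list, key: bytes, payload: dict) -> dict:
    """Append a record whose MAC covers both its payload and the previous MAC."""
    prev_mac = chain[-1]["mac"] if chain else GENESIS
    record = {"payload": payload, "prev": prev_mac, "mac": _mac(key, prev_mac, payload)}
    chain.append(record)
    return record


def verify_chain(chain: list, key: bytes) -> bool:
    """Recompute every MAC in order; any edited or reordered record fails."""
    prev_mac = GENESIS
    for record in chain:
        if record["prev"] != prev_mac:
            return False
        expected = _mac(key, prev_mac, record["payload"])
        # Constant-time comparison avoids leaking how many characters matched.
        if not hmac.compare_digest(expected, record["mac"]):
            return False
        prev_mac = record["mac"]
    return True
```

Because HMAC is a symmetric construction, whoever verifies the chain needs the same key that produced it. How that key would be shared with or escrowed for a verifier is outside this sketch.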

Runtime safety (verifiable, fully implemented)

The Decision Provenance Engine (DPE) — SPEC-005 — runs on every response:

  1. Claim detection
  2. Attribution analysis
  3. Fabrication detection
  4. Distortion detection
  5. Entailment scoring
  6. Cross-window contradiction detection
  7. Repetition detection
  8. Completeness verification
  9. Flow analysis
  10. Hallucination risk scoring
  11. Quality tiering (S/A/B/C/D)
  12. Safety Policy evaluation
  13. Provenance binding

The Safety Policy directive language (SPEC-006) allows enforcement at the transport layer. A halt-on CRITICAL directive halts the call with HTTP 451, and a redact-on HIGH PII directive strips PII before delivery. These are testable, measurable protocol behaviours.
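The two directives named above can be sketched as a small Python enforcement function. This is a hedged illustration only. The directive grammar (action, level, optional target), the severity ordering, the "at or above the level" trigger rule, the enforce function, and the naive email regex standing in for a PII detector are all assumptions, not the SPEC-006 definition.

```python
import re
from dataclasses import dataclass

# Assumed severity ordering; the real scale is defined by the spec, not here.
SEVERITY = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}

# Deliberately naive email pattern, standing in for a real PII detector.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


@dataclass
class Outcome:
    status: int  # HTTP status returned to the caller
    body: str    # response body after enforcement


def enforce(directives: list[str], risk: str, pii_severity: str, body: str) -> Outcome:
    """Apply directives in order; a triggered halt-on stops delivery entirely."""
    for directive in directives:
        action, level, *target = directive.split()
        threshold = SEVERITY[level]
        # Assumption: a directive fires when severity is at or above its level.
        if action == "halt-on" and SEVERITY[risk] >= threshold:
            return Outcome(451, "")
        if (
            action == "redact-on"
            and target == ["PII"]
            and SEVERITY[pii_severity] >= threshold
        ):
            body = EMAIL.sub("[REDACTED]", body)
    return Outcome(200, body)
```

For example, calling enforce with the policy ["halt-on CRITICAL", "redact-on HIGH PII"], a LOW risk score, and a HIGH PII finding returns status 200 with the email address replaced, while a CRITICAL risk score returns status 451 with an empty body.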


What CRP Does NOT Address

State these limits clearly, and do not let anyone conflate them with what CRP covers.

  • Model alignment — whether the LLM has "good values." CRP cannot inspect model weights.
  • Training data bias — whether the training corpus introduced systematic errors.
  • Emergent capability risks — what the model might do in novel situations.
  • AI consciousness or sentience questions.

These are research problems being worked on by alignment teams at Anthropic, Google DeepMind, and AI safety institutes. CRP is an infrastructure protocol. A firewall does not solve social engineering. TLS does not prevent insider threats. CRP does not solve alignment — and claiming it did would undermine the legitimate things it does solve.


The Positioning

CRP covers the safety layer that is achievable right now through protocol engineering: observable outputs, verifiable evidence, enforceable policies.

It is the necessary complement to alignment research, not a replacement for it.


Deeper Reading