CRP-SPEC-014: Conformance & Test Suite Specification
Document: CRP-SPEC-014
Title: Context Relay Protocol (CRP) — Conformance Levels, Test Vectors & Certification Criteria
Version: 3.0.0
Status: Draft
Author: Constantinos Vidiniotis, AutoCyber AI Pty Ltd
Contact: contact@crprotocol.io
Date: 2026-05-25
License: CC BY 4.0
Prerequisites: All CRP-SPEC-001 through CRP-SPEC-013, CRP-SPEC-015
Abstract
This document specifies the conformance requirements for CRP implementations, defines three conformance levels (Basic, Standard, Full), provides test vectors for every header and mechanism, and establishes the certification criteria for the "CRP-Compliant" designation. It is required for both IETF Proposed Standard status (which mandates two independent interoperable implementations verified against a common test suite) and for the CRP Certification Program (a commercial revenue stream where third-party AI products are certified as CRP-compliant).
1. Conformance Levels
1.1 CRP-Basic
Target: Minimum viable governance. Any implementation that can emit core safety and provenance headers.
Requirements:
| Requirement | Spec Reference | Mandatory Headers |
|---|---|---|
| Session management | CRP-SPEC-007 | CRP-Context-Session-Id, CRP-Set-Session, CRP-Session-Token |
| Hallucination risk scoring | CRP-SPEC-005 §7 | CRP-Safety-Hallucination-Risk, CRP-Safety-Hallucination-Score |
| HMAC chain | CRP-SPEC-011 §2 | CRP-Provenance-HMAC, CRP-Provenance-Chain-Integrity |
| HTTP 451 halt | CRP-SPEC-002 §13 | Must return 451 on CRITICAL risk when halt-on CRITICAL is set |
| Protocol version | CRP-SPEC-002 §4.15 | CRP-Context-Protocol-Version |
| Axiom 4 (transparency boundary) | CRP-SPEC-001 §3 | CRP headers stripped before LLM provider forwarding |
Header count: 7 mandatory response headers (CRP-Session-Token is a request header and is not counted)
DPE requirement: Stage 1 (claim segmentation) + Stage 5 (risk classification) minimum
Use case: Self-hosted SDK deployments, prototype integrations, developer experimentation
1.2 CRP-Standard
Target: Production-grade governance. The level required for CRP Comply integration and for any deployment claiming CRP compliance.
Requirements: Everything in CRP-Basic, plus:
| Requirement | Spec Reference | Additional Mandatory Headers |
|---|---|---|
| Full DPE pipeline (all 13 stages) | CRP-SPEC-005 | All CRP-Safety-* response headers |
| Quality Assurance (RQA) | CRP-SPEC-005 §18 | CRP-Quality-Score, CRP-Quality-Repetition, CRP-Quality-Completeness, CRP-Quality-Flow |
| Safety Policy enforcement | CRP-SPEC-006 | CRP-Safety-Policy parsing and enforcement |
| Context Envelope + Quality Tier | CRP-SPEC-003 | CRP-Context-Quality-Tier, CRP-Context-Saturation, CRP-Context-ETag |
| Compliance headers | CRP-SPEC-010 | CRP-Compliance-EU-AI-Act, CRP-Compliance-Audit-Trail-Id, CRP-Compliance-Audit-Trail-URI |
| Provenance headers | CRP-SPEC-011 | CRP-Provenance-Claim-Count, CRP-Provenance-Attribution-Score, CRP-Provenance-Fidelity-Score, CRP-Provenance-Report-URI |
| Continuation support | CRP-SPEC-004 | CRP-Context-Window, CRP-Context-Continuation-Id |
| Audit trail export | CRP-SPEC-011 §4 | NDJSON export of audit events |
| ETag conditional dispatch | CRP-SPEC-003 §11 | CRP-Context-If-Match → 304 response |
Header count: All 58 headers emitted when applicable
DPE requirement: Full 13-stage pipeline including RQA (Stages 6–9)
Use case: Production deployments, CRP Comply integration, regulatory compliance
1.3 CRP-Full
Target: Complete protocol implementation including advanced features. Required for CRP Certification.
Requirements: Everything in CRP-Standard, plus:
| Requirement | Spec Reference |
|---|---|
| All 9 dispatch strategies | CRP-SPEC-008 |
| Multi-agent safety budget propagation | CRP-SPEC-012 |
| Safety Policy inheritance and tightening in agent chains | CRP-SPEC-012 §4 |
| Circuit breaker state transitions | CRP-SPEC-012 §5 |
| Fan-out / fan-in DAG with HMAC merge | CRP-SPEC-004 §6, §7, §9.3 |
| Streaming safety mode (buffer and pass-through) | CRP-SPEC-008 §9 |
| CRP-Safety-Stop-Inject (mid-stream halt) | CRP-SPEC-005 §1.3 |
| OCSF audit trail export | CRP-SPEC-011 §4.2 |
| mTLS client authentication | CRP-SPEC-015 §4.1 |
| CRP Comply real-time streaming integration | CRP-SPEC-011 §5 |
| CRP Visualise session data export | — |
| Industry-specific Safety Policy profiles | CRP-SPEC-006 §6 |
| Multi-region data residency enforcement | CRP-SPEC-002 §7.7 |
Use case: CRP Gateway (managed service), CRP Certification program, enterprise deployments
2. Test Vectors
2.1 Test Vector Format
Each test vector is a JSON object defining:
{
"test_id": "TV-001",
"category": "headers | dpe | safety_policy | session | continuation | agent | hmac",
"conformance_level": "basic | standard | full",
"description": "Human-readable description of what is being tested",
"input": { ... },
"expected_output": { ... },
"assertions": [ ... ]
}
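As an illustration, a test runner might sanity-check the shape of each vector before executing it. The sketch below is written in Python (the language of the reference implementation mentioned in Section 4.1). The function name `validate_vector` is hypothetical, and `expected_output` is treated as optional because the vectors in this document omit it.

```python
REQUIRED_KEYS = {"test_id", "category", "conformance_level",
                 "description", "input", "assertions"}
CATEGORIES = {"headers", "dpe", "safety_policy", "session",
              "continuation", "agent", "hmac"}
LEVELS = {"basic", "standard", "full"}


def validate_vector(vector: dict) -> list[str]:
    """Return a list of problems; an empty list means the vector is well-formed."""
    problems = []
    missing = REQUIRED_KEYS - vector.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if vector.get("category") not in CATEGORIES:
        problems.append(f"unknown category: {vector.get('category')!r}")
    if vector.get("conformance_level") not in LEVELS:
        problems.append(f"unknown level: {vector.get('conformance_level')!r}")
    if "assertions" in vector and not isinstance(vector["assertions"], list):
        problems.append("assertions must be a list")
    return problems
```

Returning a list of problems rather than raising on the first one lets a runner report every defect in a vector file in a single pass.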
2.2 Header Test Vectors
TV-001: Minimum Basic Header Set
{
"test_id": "TV-001",
"category": "headers",
"conformance_level": "basic",
"description": "Verify that a CRP-Basic implementation emits all 7 mandatory headers on a simple push dispatch",
"input": {
"method": "POST",
"path": "/v1/chat",
"body": { "messages": [{ "role": "user", "content": "What is the EU AI Act?" }] }
},
"assertions": [
{ "header_present": "CRP-Context-Session-Id", "pattern": "^crp_sess_[a-zA-Z0-9]{16,32}$" },
{ "header_present": "CRP-Safety-Hallucination-Risk", "values": ["CRITICAL", "HIGH", "MEDIUM", "LOW"] },
{ "header_present": "CRP-Safety-Hallucination-Score", "range": [0.0, 1.0] },
{ "header_present": "CRP-Provenance-HMAC", "pattern": "^sha256:[a-f0-9]{64}$" },
{ "header_present": "CRP-Provenance-Chain-Integrity", "values": ["VALID", "UNVERIFIED"] },
{ "header_present": "CRP-Set-Session" },
{ "header_present": "CRP-Context-Protocol-Version", "pattern": "^3\\." }
]
}
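The header assertions in TV-001 can be evaluated mechanically. The following is a minimal sketch, assuming the assertion keys behave as their names suggest (`pattern` is a regular expression, `values` an allowed set, `value` an exact match, `range` an inclusive numeric interval). The function name `check_header_assertion` is hypothetical and not part of the published test runner.

```python
import re


def check_header_assertion(assertion: dict, headers: dict[str, str]) -> bool:
    """Evaluate one 'header_present' assertion against response headers."""
    # HTTP header field names are case-insensitive, so normalise before lookup.
    lookup = {name.lower(): value for name, value in headers.items()}
    value = lookup.get(assertion["header_present"].lower())
    if value is None:
        return False
    if "pattern" in assertion and not re.search(assertion["pattern"], value):
        return False
    if "values" in assertion and value not in assertion["values"]:
        return False
    if "value" in assertion and value != assertion["value"]:
        return False
    if "range" in assertion:
        low, high = assertion["range"]
        try:
            number = float(value)
        except ValueError:
            return False
        if not low <= number <= high:
            return False
    return True
```

A runner would call this once per assertion and report the `test_id` together with the first assertion that returns `False`.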
TV-002: Axiom 4 — LLM Provider Header Stripping
{
"test_id": "TV-002",
"category": "headers",
"conformance_level": "basic",
"description": "Verify that no CRP-* headers are forwarded to the LLM provider",
"input": {
"method": "POST",
"path": "/v1/chat",
"headers": {
"CRP-Safety-Policy": "halt-on CRITICAL",
"CRP-Accept-Quality": "S, A",
"CRP-Session-Token": "eyJ..."
}
},
"assertions": [
{ "provider_request_headers_absent": ["CRP-Safety-Policy", "CRP-Accept-Quality", "CRP-Session-Token"] },
{ "note": "Verify by inspecting the request sent to the LLM provider. No header starting with 'CRP-' may be present." }
]
}
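One way a gateway could satisfy this vector is to filter the outgoing header map just before the provider call. The sketch below assumes headers are held in a plain dictionary. The helper name `strip_crp_headers` is hypothetical, and the prefix match is case-insensitive because HTTP header field names are.

```python
def strip_crp_headers(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of the headers without any field whose name starts with 'CRP-'."""
    return {
        name: value
        for name, value in headers.items()
        if not name.lower().startswith("crp-")
    }
```

Filtering by prefix rather than by an explicit list means headers added in later protocol versions are stripped without a code change.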
TV-003: HTTP 451 on CRITICAL Risk
{
"test_id": "TV-003",
"category": "headers",
"conformance_level": "basic",
"description": "Verify HTTP 451 returned when DPE produces CRITICAL risk and halt-on CRITICAL is set",
"input": {
"headers": { "CRP-Safety-Policy": "halt-on CRITICAL" },
"body": { "messages": [{ "role": "user", "content": "Query designed to produce CRITICAL-risk response from test LLM" }] }
},
"assertions": [
{ "http_status": 451 },
{ "header_present": "CRP-Safety-Hallucination-Risk", "value": "CRITICAL" },
{ "header_present": "CRP-Safety-Retry-After" },
{ "header_present": "CRP-Compliance-Audit-Trail-URI" },
{ "body_json_field": "crp_halt_reason" }
]
}
TV-004: ETag Conditional Dispatch — 304
{
"test_id": "TV-004",
"category": "headers",
"conformance_level": "standard",
"description": "Verify HTTP 304 returned when CRP-Context-If-Match matches current CKF state and no facts have changed",
"input": {
"step_1": { "method": "POST", "path": "/v1/chat", "body": { "messages": [{ "role": "user", "content": "What is ISO 42001?" }] } },
"step_2": { "method": "POST", "path": "/v1/chat", "headers": { "CRP-Context-If-Match": "<etag_from_step_1>" }, "body": { "messages": [{ "role": "user", "content": "What is ISO 42001?" }] } }
},
"assertions": [
{ "step_1_header_present": "CRP-Context-ETag" },
{ "step_2_http_status": 304 },
{ "step_2_header_present": "CRP-Context-ETag", "equals": "<same_as_step_1>" },
{ "step_2_header_present": "CRP-Context-Cache-Status", "value": "HIT" }
]
}
2.3 DPE Test Vectors
TV-010: Fabrication Detection
{
"test_id": "TV-010",
"category": "dpe",
"conformance_level": "standard",
"description": "Verify DPE detects a fabricated entity in an LLM response",
"input": {
"envelope_facts": [
{ "fact_id": "f1", "content": "The EU AI Act was adopted in 2024 by the European Parliament." },
{ "fact_id": "f2", "content": "The Act classifies AI systems into four risk levels." }
],
"llm_response": "The EU AI Act was adopted in 2024. According to Commissioner Hans Müller, the Act classifies AI systems into four risk levels.",
"note": "Hans Müller is a fabricated entity — no such commissioner exists in the envelope or as a known public figure."
},
"assertions": [
{ "header": "CRP-Safety-Fabrications", "value_gte": 1 },
{ "dpe_report_field": "fabrication_count", "value_gte": 1 },
{ "dpe_report_contains_entity": "Hans Müller" }
]
}
TV-011: Distortion Detection — Number Changed
{
"test_id": "TV-011",
"category": "dpe",
"conformance_level": "standard",
"description": "Verify DPE detects when a number from the source is changed in the response",
"input": {
"envelope_facts": [
{ "fact_id": "f1", "content": "Revenue increased by 15% in Q3 2025." }
],
"llm_response": "Revenue increased by 25% in Q3 2025."
},
"assertions": [
{ "header": "CRP-Safety-Distortions", "contains": "NUMBER_CHANGED" },
{ "dpe_report_field": "distortion_count", "value_gte": 1 }
]
}
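To make the NUMBER_CHANGED case concrete, the sketch below flags a response that matches a source fact word for word except for its numeric tokens. This is a deliberately naive stand-in and not the DPE algorithm, which is defined in CRP-SPEC-005. The function name `number_changes` and the tokenisation rule are assumptions made for illustration.

```python
import re

# A token is either a number (optionally decimal, optionally followed by %) or a word.
TOKEN = re.compile(r"\d+(?:\.\d+)?%?|\w+")


def number_changes(fact: str, response: str) -> list[tuple[str, str]]:
    """Return (source, response) number pairs when only numbers differ."""
    fact_tokens = TOKEN.findall(fact)
    response_tokens = TOKEN.findall(response)
    if len(fact_tokens) != len(response_tokens):
        return []  # different shape: not a pure number substitution
    changes = []
    for source, candidate in zip(fact_tokens, response_tokens):
        if source == candidate:
            continue
        if source[0].isdigit() and candidate[0].isdigit():
            changes.append((source, candidate))
        else:
            return []  # wording differs too, so this heuristic does not apply
    return changes
```

A real detector would also need to handle paraphrase, unit conversion, and numbers written as words, which this sketch ignores.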
TV-012: Cross-Window Contradiction
{
"test_id": "TV-012",
"category": "dpe",
"conformance_level": "standard",
"description": "Verify DPE Stage 6 detects contradiction between current response and prior window",
"input": {
"prior_window_response": "The company's revenue declined by 3% in the fiscal year.",
"current_response": "The company experienced strong revenue growth of 12% in the fiscal year."
},
"assertions": [
{ "header": "CRP-Safety-Contradictions", "contains": "cross-window" },
{ "dpe_report_field": "cross_window_contradictions", "length_gte": 1 }
]
}
TV-013: Repetition Detection
{
"test_id": "TV-013",
"category": "dpe",
"conformance_level": "standard",
"description": "Verify DPE Stage 7 detects severe repetition between windows",
"input": {
"prior_window_response": "The EU AI Act classifies AI systems into four risk levels: unacceptable, high, limited, and minimal. Each level carries different regulatory obligations.",
"current_response": "The EU AI Act classifies AI systems into four risk levels: unacceptable, high, limited, and minimal. The obligations vary by level. Each risk category carries different regulatory requirements."
},
"assertions": [
{ "header": "CRP-Quality-Repetition", "contains": "SIGNIFICANT" },
{ "note": "Semantic overlap > 0.50 expected due to near-verbatim content" }
]
}
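The note in TV-013 refers to semantic overlap, which the DPE measures by a method defined in CRP-SPEC-005. As a rough lexical proxy only, the sketch below computes the Jaccard similarity of the two windows' word sets. The function name `token_overlap` is hypothetical, and a lexical score is not equivalent to the semantic score the vector asserts on.

```python
import re


def token_overlap(previous: str, current: str) -> float:
    """Jaccard similarity of the lowercase word sets of two texts (0.0 to 1.0)."""
    previous_words = set(re.findall(r"\w+", previous.lower()))
    current_words = set(re.findall(r"\w+", current.lower()))
    union = previous_words | current_words
    if not union:
        return 0.0
    return len(previous_words & current_words) / len(union)
```

Applied to the two windows in TV-013, this proxy also lands above the 0.50 threshold, which makes it usable as a cheap pre-check before a more expensive semantic comparison.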
2.4 Safety Policy Test Vectors
TV-020: Policy Parsing — Valid
{
"test_id": "TV-020",
"category": "safety_policy",
"conformance_level": "standard",
"description": "Verify gateway correctly parses a complex Safety Policy",
"input": {
"header": "CRP-Safety-Policy: default-src context; halt-on CRITICAL; warn-on HIGH; require-grounding 0.75; block-ungrounded; upgrade-on-risk reflexive; report-uri https://comply.crprotocol.io/reports"
},
"assertions": [
{ "parsed_directive": "default-src", "value": ["context"] },
{ "parsed_directive": "halt-on", "value": "CRITICAL" },
{ "parsed_directive": "warn-on", "value": "HIGH" },
{ "parsed_directive": "require-grounding", "value": 0.75 },
{ "parsed_directive": "block-ungrounded", "value": true },
{ "parsed_directive": "upgrade-on-risk", "value": "reflexive" },
{ "parsed_directive": "report-uri", "value": "https://comply.crprotocol.io/reports" }
]
}
TV-021: Policy Parsing — Malformed Rejection
{
"test_id": "TV-021",
"category": "safety_policy",
"conformance_level": "standard",
"description": "Verify gateway rejects malformed Safety Policy with unknown directive",
"input": {
"header": "CRP-Safety-Policy: default-src context; halt-on CRITICAL; allow-hallucination"
},
"assertions": [
{ "http_status": 400 },
{ "body_contains": "unknown directive" },
{ "note": "allow-hallucination is not a valid directive — gateway MUST reject, not silently ignore" }
]
}
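The parsing behaviour exercised by TV-020 and TV-021 can be sketched as a small semicolon-delimited parser. The directive table below contains only the directives that appear in this document; the full grammar lives in CRP-SPEC-006 and may differ. The names `parse_safety_policy` and `PolicyError` are hypothetical, and tolerating empty segments (for example a trailing semicolon) is an assumption.

```python
class PolicyError(Exception):
    """Malformed policy; a gateway would map this to HTTP 400."""


def _single(name, args):
    if len(args) != 1:
        raise PolicyError(f"{name} expects exactly one value")
    return args[0]


def _flag(name, args):
    if args:
        raise PolicyError(f"{name} takes no value")
    return True


def _number(name, args):
    raw = _single(name, args)
    try:
        return float(raw)
    except ValueError:
        raise PolicyError(f"{name} expects a number, got {raw!r}") from None


# Partial directive table: only directives seen in this document.
DIRECTIVES = {
    "default-src": lambda name, args: list(args),
    "halt-on": _single,
    "warn-on": _single,
    "require-grounding": _number,
    "block-ungrounded": _flag,
    "block-fabrication": _flag,
    "upgrade-on-risk": _single,
    "report-uri": _single,
}


def parse_safety_policy(value: str) -> dict:
    """Parse a CRP-Safety-Policy header value into a directive dictionary."""
    parsed = {}
    for part in value.split(";"):
        tokens = part.split()
        if not tokens:
            continue  # assumption: empty segments are ignored
        name, args = tokens[0], tokens[1:]
        if name not in DIRECTIVES:
            raise PolicyError(f"unknown directive: {name}")
        parsed[name] = DIRECTIVES[name](name, args)
    return parsed
```

Raising on an unknown directive, instead of skipping it, mirrors the TV-021 requirement that the gateway reject rather than silently ignore.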
TV-022: Policy Inheritance — Tightening Accepted
{
"test_id": "TV-022",
"category": "safety_policy",
"conformance_level": "full",
"description": "Verify child agent can tighten parent's Safety Policy",
"input": {
"parent_policy": "halt-on CRITICAL; require-grounding 0.75",
"child_policy": "halt-on HIGH; require-grounding 0.85; block-fabrication"
},
"assertions": [
{ "http_status": 200 },
{ "header": "CRP-Safety-Policy-Applied", "contains": "halt-on HIGH" }
]
}
TV-023: Policy Inheritance — Relaxation Rejected
{
"test_id": "TV-023",
"category": "safety_policy",
"conformance_level": "full",
"description": "Verify child agent cannot relax parent's Safety Policy",
"input": {
"parent_policy": "halt-on CRITICAL; require-grounding 0.75",
"child_policy": "warn-on CRITICAL; require-grounding 0.50"
},
"assertions": [
{ "http_status": 403 },
{ "body_json_field": "error", "value": "safety_policy_inheritance_violation" }
]
}
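The tightening rule tested by TV-022 and TV-023 can be illustrated with a comparison over already-parsed policies. This sketch covers only `halt-on` and `require-grounding`, and it assumes that a directive the child omits is inherited from the parent unchanged; the authoritative rules are in CRP-SPEC-012 §4. The function name `relaxed_directives` is hypothetical.

```python
RISK_ORDER = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}


def relaxed_directives(parent: dict, child: dict) -> list[str]:
    """Return the directives the child weakens; an empty list means tightening only."""
    relaxed = []
    # Halting at a lower risk level is stricter, so a higher child level is a relaxation.
    if "halt-on" in parent and "halt-on" in child:
        if RISK_ORDER[child["halt-on"]] > RISK_ORDER[parent["halt-on"]]:
            relaxed.append("halt-on")
    # A higher grounding requirement is stricter, so a lower child value is a relaxation.
    if "require-grounding" in parent and "require-grounding" in child:
        if child["require-grounding"] < parent["require-grounding"]:
            relaxed.append("require-grounding")
    return relaxed
```

A gateway built this way would respond with 403 and `safety_policy_inheritance_violation` whenever the returned list is non-empty.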
2.5 HMAC Chain Test Vectors
TV-030: Chain Verification — Valid
{
"test_id": "TV-030",
"category": "hmac",
"conformance_level": "basic",
"description": "Verify HMAC chain is valid across 3 windows",
"input": {
"session_hmac_key": "hex:0123456789abcdef...",
"windows": [
{ "window_number": 1, "content_hash": "sha256:aaa...", "dpe_hash": "sha256:bbb...", "timestamp": "2026-05-25T10:00:00Z" },
{ "window_number": 2, "content_hash": "sha256:ccc...", "dpe_hash": "sha256:ddd...", "timestamp": "2026-05-25T10:01:00Z" },
{ "window_number": 3, "content_hash": "sha256:eee...", "dpe_hash": "sha256:fff...", "timestamp": "2026-05-25T10:02:00Z" }
]
},
"assertions": [
{ "window_1_hmac": "sha256:<computed>", "note": "Previous HMAC is empty string for root" },
{ "window_2_hmac": "sha256:<computed>", "note": "Chains from window_1_hmac" },
{ "window_3_hmac": "sha256:<computed>", "note": "Chains from window_2_hmac" },
{ "header": "CRP-Provenance-Chain-Integrity", "value": "VALID" }
]
}
TV-031: Chain Verification — Broken (Tampered Event)
{
"test_id": "TV-031",
"category": "hmac",
"conformance_level": "basic",
"description": "Verify chain integrity detects tampering when a window's content hash is modified after signing",
"input": {
"same_as": "TV-030",
"modification": "window_2.content_hash changed from sha256:ccc... to sha256:zzz..."
},
"assertions": [
{ "header": "CRP-Provenance-Chain-Integrity", "value": "BROKEN" },
{ "note": "Window 2's recomputed HMAC will not match stored HMAC because content_hash was changed" }
]
}
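The chaining and tamper detection described in TV-030 and TV-031 can be sketched with Python's standard `hmac` module. The only details taken from this document are HMAC-SHA256 output in `sha256:<hex>` form and an empty string as the previous HMAC for the root window. The message layout (fields joined with `|`) is an assumption for illustration; the normative layout is defined in CRP-SPEC-011 §2. All function names are hypothetical.

```python
import hashlib
import hmac


def window_hmac(key: bytes, previous: str, window: dict) -> str:
    """HMAC for one window, chained to the previous window's HMAC."""
    # ASSUMED message layout; see CRP-SPEC-011 for the normative one.
    message = "|".join([
        previous,
        str(window["window_number"]),
        window["content_hash"],
        window["dpe_hash"],
        window["timestamp"],
    ])
    digest = hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"sha256:{digest}"


def build_chain(key: bytes, windows: list[dict]) -> list[str]:
    """Compute the HMAC for each window in order, starting from an empty root."""
    chain, previous = [], ""
    for window in windows:
        previous = window_hmac(key, previous, window)
        chain.append(previous)
    return chain


def chain_integrity(key: bytes, windows: list[dict], stored: list[str]) -> str:
    """Recompute each HMAC and compare it with the stored value."""
    if len(windows) != len(stored):
        return "BROKEN"
    previous = ""
    for window, expected in zip(windows, stored):
        actual = window_hmac(key, previous, window)
        # Constant-time comparison avoids leaking how many characters matched.
        if not hmac.compare_digest(actual.encode("utf-8"), expected.encode("utf-8")):
            return "BROKEN"
        previous = expected
    return "VALID"
```

Because each HMAC covers its predecessor, changing `content_hash` in window 2 after signing makes the recomputed value for window 2 differ from the stored one, which is the TV-031 outcome.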
2.6 Session Token Test Vectors
TV-040: Token Signature Validation — Valid
{
"test_id": "TV-040",
"category": "session",
"conformance_level": "basic",
"description": "Verify gateway accepts a correctly signed session token",
"input": {
"master_key": "hex:fedcba9876543210...",
"session_id": "crp_sess_7f3a9bc2d4e1f083",
"token_payload": { "v": "3.0.0", "sid": "crp_sess_7f3a9bc2d4e1f083", "win": 1, "exp": 9999999999 }
},
"assertions": [
{ "signature_valid": true },
{ "session_resumed": true }
]
}
TV-041: Token Expiry Rejection
{
"test_id": "TV-041",
"category": "session",
"conformance_level": "basic",
"description": "Verify gateway rejects an expired session token",
"input": {
"token_payload": { "exp": 1000000000 },
"note": "exp is in the past"
},
"assertions": [
{ "http_status": 401 }
]
}
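The signature and expiry checks in TV-040 and TV-041 can be illustrated as follows. The wire format used here (base64url JSON payload, a dot, then a base64url HMAC-SHA256 tag keyed directly with the master key) is an assumption for the sketch; the real token format and key derivation are defined in CRP-SPEC-007. The names `sign_token`, `verify_token`, and `TokenRejected` are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


class TokenRejected(Exception):
    """Invalid or expired token; a gateway would answer with HTTP 401."""
    http_status = 401


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _tag(key: bytes, body: str) -> str:
    return _b64(hmac.new(key, body.encode("utf-8"), hashlib.sha256).digest())


def sign_token(key: bytes, payload: dict) -> str:
    """Serialise the payload deterministically and append its HMAC tag."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    body = _b64(canonical.encode("utf-8"))
    return f"{body}.{_tag(key, body)}"


def verify_token(key: bytes, token: str, now: Optional[float] = None) -> dict:
    """Return the payload if the tag matches and 'exp' is in the future."""
    body, separator, tag = token.partition(".")
    expected = _tag(key, body)
    if not separator or not hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8")):
        raise TokenRejected("bad signature")
    payload = json.loads(_unb64(body))
    current_time = time.time() if now is None else now
    if payload.get("exp", 0) <= current_time:
        raise TokenRejected("expired")
    return payload
```

The signature is checked before the payload is decoded, so unauthenticated input never reaches the JSON parser.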
2.7 Safety Budget Test Vectors
TV-050: Budget Depletion Across Calls
{
"test_id": "TV-050",
"category": "agent",
"conformance_level": "full",
"description": "Verify safety budget decrements correctly across multiple calls with different risk levels",
"input": {
"calls": [
{ "risk_level": "LOW", "expected_budget_after": 1.00 },
{ "risk_level": "MEDIUM", "expected_budget_after": 0.95 },
{ "risk_level": "HIGH", "expected_budget_after": 0.80 },
{ "risk_level": "HIGH", "expected_budget_after": 0.65 },
{ "risk_level": "CRITICAL", "expected_budget_after": 0.30 },
{ "risk_level": "MEDIUM", "expected_budget_after": 0.25 }
]
},
"assertions": [
{ "call_6_header": "CRP-Agent-Safety-Budget", "value_lte": 0.25 },
{ "call_6_header": "CRP-Safety-Budget-Warning", "value": "caution" },
{ "call_6_header": "CRP-Safety-Oversight-Mode", "note": "Not yet forced to human-review (budget > 0.10)" }
]
}
TV-051: Budget Depletion → Forced Halt
{
"test_id": "TV-051",
"category": "agent",
"conformance_level": "full",
"description": "Verify session halts when safety budget depletes to ≤ 0.10",
"input": {
"budget_before_call": 0.12,
"call_risk_level": "HIGH"
},
"assertions": [
{ "http_status": 451 },
{ "body_json_field": "crp_halt_reason", "value": "SAFETY_BUDGET_DEPLETED" },
{ "note": "Budget would be 0.12 - 0.15 = -0.03, which is ≤ 0.10 → halt" }
]
}
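The arithmetic behind TV-050 and TV-051 can be reproduced in a few lines. The per-call costs below are inferred from the differences between consecutive `expected_budget_after` values in TV-050 (LOW 0.00, MEDIUM 0.05, HIGH 0.15, CRITICAL 0.35) and the halt threshold from TV-051; the normative values are in CRP-SPEC-012. The function name `apply_call` is hypothetical.

```python
# Costs inferred from TV-050; treat them as illustrative, not normative.
RISK_COST = {"LOW": 0.00, "MEDIUM": 0.05, "HIGH": 0.15, "CRITICAL": 0.35}
HALT_THRESHOLD = 0.10


def apply_call(budget: float, risk_level: str) -> tuple[float, bool]:
    """Return (remaining budget, halt flag) after one call at the given risk level."""
    # Round to two decimals so binary floating-point drift does not accumulate.
    remaining = round(budget - RISK_COST[risk_level], 2)
    return remaining, remaining <= HALT_THRESHOLD
```

Replaying the TV-050 call sequence from a starting budget of 1.00 yields 1.00, 0.95, 0.80, 0.65, 0.30, and 0.25 with no halt, while the TV-051 input (0.12 followed by a HIGH call) yields -0.03 and a halt.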
3. Certification Program
3.1 CRP-Compliant Certification
The "CRP-Compliant" certification is a commercial program operated by AutoCyber AI Pty Ltd. Third-party AI products and platforms can obtain certification to demonstrate they implement the CRP protocol correctly.
3.2 Certification Levels
| Level | Requires | Badge | Annual Fee |
|---|---|---|---|
| CRP-Basic Certified | Pass all Basic test vectors | "CRP-Basic Compliant" | $5,000 |
| CRP-Standard Certified | Pass all Basic + Standard test vectors | "CRP-Standard Compliant" | $10,000 |
| CRP-Full Certified | Pass all Basic + Standard + Full test vectors | "CRP-Full Certified Partner" | $25,000 |
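Because each level requires passing every vector at that level and at all lower levels, the achievable level can be derived from raw results. The sketch below assumes results arrive as `(conformance_level, passed)` pairs and that a level with no executed vectors does not count as passed. The function name `certifiable_level` is hypothetical.

```python
from typing import Optional

LEVEL_ORDER = ("basic", "standard", "full")


def certifiable_level(results: list[tuple[str, bool]]) -> Optional[str]:
    """Return the highest level whose vectors, and all lower levels' vectors, passed."""
    achieved = None
    for level in LEVEL_ORDER:
        outcomes = [passed for vector_level, passed in results if vector_level == level]
        if not outcomes or not all(outcomes):
            break  # a failure or gap at this level caps certification below it
        achieved = level
    return achieved
```

For example, an implementation that passes every Basic and Standard vector but fails one Full vector would resolve to `"standard"`.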
3.3 Certification Process
- Application: Vendor submits application describing their implementation
- Self-Assessment: Vendor runs the CRP Conformance Test Suite against their implementation and submits results
- Independent Verification: AutoCyber AI (or a designated audit partner) runs the test suite independently against the vendor's implementation
- Gap Remediation: Any failing test vectors are documented; vendor has 90 days to remediate
- Certification Issued: On passing all test vectors for the target level
- Annual Renewal: Re-run test suite annually; certification lapses if not renewed
3.4 Test Suite Distribution
The test suite:
- Is published as an open-source test runner at github.com/crprotocol/conformance-tests
- Can be run against any CRP-compatible HTTP endpoint
- Includes all test vectors defined in this document plus additional edge-case vectors
- Is updated with each CRP protocol version release
3.5 IETF Interoperability Requirement
For IETF Proposed Standard status, at least two independent implementations MUST pass the CRP-Standard conformance suite. The test results MUST be published in the IETF implementation report.
4. Interoperability Testing
4.1 Cross-Implementation Tests
When two CRP implementations exist (e.g., the reference Python implementation and a third-party Go implementation), the following cross-implementation checks apply:
- Client A → Gateway B: Client using implementation A sends requests to gateway running implementation B. All test vectors must pass.
- Session Token Relay: Token issued by Gateway A must be validatable by Gateway B (requires shared master key or compatible key derivation).
- HMAC Chain Verification: Chain generated by Gateway A must be verifiable by Gateway B.
- Safety Policy Portability: Policy parsed by Gateway A must produce identical enforcement behaviour in Gateway B.
4.2 Provider Compatibility Tests
CRP implementations MUST be tested against:
- OpenAI API (GPT-4o, GPT-4o-mini)
- Anthropic API (Claude Sonnet 4, Claude Haiku)
- Google Gemini API
- Ollama (local models)
- Azure OpenAI
Each provider test verifies:
- Axiom 4 compliance (no CRP headers leaked to provider)
- Correct tokenizer selection for the target model
- DPE operates correctly on provider-specific response formats
5. Conformance Statement Template
Implementations MUST publish a conformance statement:
# CRP Conformance Statement
**Product:** [Product Name]
**Version:** [Version]
**CRP Protocol Version:** 3.0.0
**Conformance Level:** [Basic / Standard / Full]
**Test Suite Version:** [Version]
**Test Date:** [Date]
**Test Results:**
- Basic vectors: [X/Y passed]
- Standard vectors: [X/Y passed] (if applicable)
- Full vectors: [X/Y passed] (if applicable)
**Known Deviations:** [List any test vectors that are not applicable or intentionally deviated from, with justification]
**Contact:** [Vendor contact for interoperability testing]
6. References
- All CRP-SPEC-001 through CRP-SPEC-013, CRP-SPEC-015
- IETF BCP 9 — The Internet Standards Process (interoperability requirement)
- OASIS SARIF v2.1.0 — Test output format compatibility
Copyright © 2025–2026 AutoCyber AI Pty Ltd. Licensed under CC BY 4.0. CRP™ is a trademark of AutoCyber AI Pty Ltd.