CRP-SPEC-014: Conformance & Test Suite Specification

Document: CRP-SPEC-014
Title: Context Relay Protocol (CRP) - Conformance Levels, Test Vectors & Certification Criteria
Version: 3.0.0
Status: Draft
Author: Constantinos Vidiniotis, AutoCyber AI Pty Ltd
Contact: contact@crprotocol.io
Date: 2026-05-25
License: CC BY 4.0
Prerequisites: All CRP-SPEC-001 through CRP-SPEC-013, CRP-SPEC-015

Abstract

This document specifies the conformance requirements for CRP implementations, defines three conformance levels (Basic, Standard, Full), provides test vectors for the core headers and mechanisms at each level, and establishes the certification criteria for the "CRP-Compliant" designation. It serves two purposes. First, it supports advancement on the IETF standards track: BCP 9 requires at least two independent interoperating implementations for Internet Standard status (Proposed Standard does not formally require them), and a common test suite is how that interoperability is demonstrated. Second, it underpins the CRP Certification Program, under which third-party AI products are certified as CRP-compliant.

1. Conformance Levels

1.1 CRP-Basic

Target: Minimum viable governance. Any implementation that can emit core safety and provenance headers.

Requirements:

Requirement	Spec Reference	Mandatory Headers
Session management	CRP-SPEC-007	`CRP-Context-Session-Id`, `CRP-Set-Session`, `CRP-Session-Token`
Hallucination risk scoring	CRP-SPEC-005 §7	`CRP-Safety-Hallucination-Risk`, `CRP-Safety-Hallucination-Score`
HMAC chain	CRP-SPEC-011 §2	`CRP-Provenance-HMAC`, `CRP-Provenance-Chain-Integrity`
HTTP 451 halt	CRP-SPEC-002 §13	Must return 451 on CRITICAL risk when `halt-on CRITICAL` is set
Protocol version	CRP-SPEC-002 §4.15	`CRP-Context-Protocol-Version`
Axiom 4 (transparency boundary)	CRP-SPEC-001 §3	CRP headers stripped before LLM provider forwarding

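Header count: 7 mandatory response headers (the set asserted in TV-001), plus `CRP-Session-Token`, which is accepted as a request header rather than emitted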
DPE requirement: Stage 1 (claim segmentation) + Stage 5 (risk classification) minimum
Use case: Self-hosted SDK deployments, prototype integrations, developer experimentation

1.2 CRP-Standard

Target: Production-grade governance. The level required for CRP Comply integration and for any deployment claiming CRP compliance.

Requirements: Everything in CRP-Basic, plus:

Requirement	Spec Reference	Additional Mandatory Headers
Full DPE pipeline (all 13 stages)	CRP-SPEC-005	All `CRP-Safety-*` response headers
Quality Assurance (RQA)	CRP-SPEC-005 §18	`CRP-Quality-Score`, `CRP-Quality-Repetition`, `CRP-Quality-Completeness`, `CRP-Quality-Flow`
Safety Policy enforcement	CRP-SPEC-006	`CRP-Safety-Policy` parsing and enforcement
Context Envelope + Quality Tier	CRP-SPEC-003	`CRP-Context-Quality-Tier`, `CRP-Context-Saturation`, `CRP-Context-ETag`
Compliance headers	CRP-SPEC-010	`CRP-Compliance-EU-AI-Act`, `CRP-Compliance-Audit-Trail-Id`, `CRP-Compliance-Audit-Trail-URI`
Provenance headers	CRP-SPEC-011	`CRP-Provenance-Claim-Count`, `CRP-Provenance-Attribution-Score`, `CRP-Provenance-Fidelity-Score`, `CRP-Provenance-Report-URI`
Continuation support	CRP-SPEC-004	`CRP-Context-Window`, `CRP-Context-Continuation-Id`
Audit trail export	CRP-SPEC-011 §4	NDJSON export of audit events
ETag conditional dispatch	CRP-SPEC-003 §11	`CRP-Context-If-Match` → 304 response

Header count: All 58 headers emitted when applicable
DPE requirement: Full 13-stage pipeline including RQA (Stages 6–9)
Use case: Production deployments, CRP Comply integration, regulatory compliance

1.3 CRP-Full

Target: Complete protocol implementation including advanced features. Required for the CRP-Full Certified level of the certification program (Section 3.2).

Requirements: Everything in CRP-Standard, plus:

Requirement	Spec Reference
All 9 dispatch strategies	CRP-SPEC-008
Multi-agent safety budget propagation	CRP-SPEC-012
Safety Policy inheritance and tightening in agent chains	CRP-SPEC-012 §4
Circuit breaker state transitions	CRP-SPEC-012 §5
Fan-out / fan-in DAG with HMAC merge	CRP-SPEC-004 §6, §7, §9.3
Streaming safety mode (buffer and pass-through)	CRP-SPEC-008 §9
CRP-Safety-Stop-Inject (mid-stream halt)	CRP-SPEC-005 §1.3
OCSF audit trail export	CRP-SPEC-011 §4.2
mTLS client authentication	CRP-SPEC-015 §4.1
CRP Comply real-time streaming integration	CRP-SPEC-011 §5
CRP Visualise session data export	-
Industry-specific Safety Policy profiles	CRP-SPEC-006 §6
Multi-region data residency enforcement	CRP-SPEC-002 §7.7

Use case: CRP Gateway (managed service), CRP Certification program, enterprise deployments

2. Test Vectors

2.1 Test Vector Format

Each test vector is a JSON object defining:

{
  "test_id": "TV-001",
  "category": "headers | dpe | safety_policy | session | continuation | agent | hmac",
  "conformance_level": "basic | standard | full",
  "description": "Human-readable description of what is being tested",
  "input": { ... },
  "expected_output": { ... },
  "assertions": [ ... ]
}
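A test runner would normally check that each vector is well formed before executing it. The following is a minimal sketch of such a check, not part of the published test suite. The function name `validate_vector` is hypothetical, and treating six fields as required (with `expected_output` optional) is an assumption based on the vectors shown in this document.

```python
from typing import Any

# Allowed values, copied from the format example above.
CATEGORIES = {"headers", "dpe", "safety_policy", "session", "continuation", "agent", "hmac"}
LEVELS = {"basic", "standard", "full"}

# Assumption: these six fields are required because every vector in this
# document carries them; "expected_output" is treated as optional.
REQUIRED = ("test_id", "category", "conformance_level", "description", "input", "assertions")


def validate_vector(vector: dict[str, Any]) -> list[str]:
    """Return a list of problems found in one test vector (empty if none)."""
    problems = [f"missing field: {name}" for name in REQUIRED if name not in vector]
    if "category" in vector and vector["category"] not in CATEGORIES:
        problems.append(f"unknown category: {vector['category']!r}")
    if "conformance_level" in vector and vector["conformance_level"] not in LEVELS:
        problems.append(f"unknown conformance_level: {vector['conformance_level']!r}")
    if "assertions" in vector and not isinstance(vector["assertions"], list):
        problems.append("assertions must be a list")
    return problems
```

Returning a list of problems rather than raising on the first one lets a runner report every defect in a vector file in a single pass.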

2.2 Header Test Vectors

TV-001: Minimum Basic Header Set

{
  "test_id": "TV-001",
  "category": "headers",
  "conformance_level": "basic",
  "description": "Verify that a CRP-Basic implementation emits all 7 mandatory headers on a simple push dispatch",
  "input": {
    "method": "POST",
    "path": "/v1/chat",
    "body": { "messages": [{ "role": "user", "content": "What is the EU AI Act?" }] }
  },
  "assertions": [
    { "header_present": "CRP-Context-Session-Id", "pattern": "^crp_sess_[a-zA-Z0-9]{16,32}$" },
    { "header_present": "CRP-Safety-Hallucination-Risk", "values": ["CRITICAL", "HIGH", "MEDIUM", "LOW"] },
    { "header_present": "CRP-Safety-Hallucination-Score", "range": [0.0, 1.0] },
    { "header_present": "CRP-Provenance-HMAC", "pattern": "^sha256:[a-f0-9]{64}$" },
    { "header_present": "CRP-Provenance-Chain-Integrity", "values": ["VALID", "UNVERIFIED"] },
    { "header_present": "CRP-Set-Session" },
    { "header_present": "CRP-Context-Protocol-Version", "pattern": "^3\\." }
  ]
}
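The `header_present` assertions above combine a presence check with an optional `pattern`, `values`, `value`, or `range` constraint. A minimal sketch of how a runner might evaluate one such assertion follows. The function name `check_header_assertion` is hypothetical, and the interpretation of each key (regular expression, allowed set, exact match, inclusive numeric range) is an assumption inferred from the vectors in this section.

```python
import re


def check_header_assertion(assertion: dict, headers: dict[str, str]) -> bool:
    """Evaluate one `header_present` assertion against response headers."""
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get(assertion["header_present"].lower())
    if value is None:
        return False
    if "pattern" in assertion and not re.search(assertion["pattern"], value):
        return False
    if "values" in assertion and value not in assertion["values"]:
        return False
    if "value" in assertion and value != assertion["value"]:
        return False
    if "range" in assertion:
        low, high = assertion["range"]  # assumed to be inclusive bounds
        try:
            number = float(value)
        except ValueError:
            return False
        if not low <= number <= high:
            return False
    return True
```

A vector passes only when every entry in its `assertions` list evaluates to true, for example `all(check_header_assertion(a, response_headers) for a in vector["assertions"])`.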

TV-002: Axiom 4 - LLM Provider Header Stripping

{
  "test_id": "TV-002",
  "category": "headers",
  "conformance_level": "basic",
  "description": "Verify that no CRP-* headers are forwarded to the LLM provider",
  "input": {
    "method": "POST",
    "path": "/v1/chat",
    "headers": {
      "CRP-Safety-Policy": "halt-on CRITICAL",
      "CRP-Accept-Quality": "S, A",
      "CRP-Session-Token": "eyJ..."
    }
  },
  "assertions": [
    { "provider_request_headers_absent": ["CRP-Safety-Policy", "CRP-Accept-Quality", "CRP-Session-Token"] },
    { "note": "Verify by inspecting the request sent to the LLM provider. No header starting with 'CRP-' may be present." }
  ]
}
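The stripping behaviour that TV-002 checks can be illustrated with a short filter applied to the outbound header map before the request is forwarded. This is a sketch only; the function name `strip_crp_headers` is hypothetical and a real gateway would apply the same rule inside its HTTP client layer.

```python
def strip_crp_headers(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of `headers` without any header whose name starts with 'CRP-'."""
    # The match is case-insensitive because HTTP header names are case-insensitive.
    return {
        name: value
        for name, value in headers.items()
        if not name.lower().startswith("crp-")
    }
```

Matching on the `CRP-` prefix rather than on an explicit list means headers added in later protocol versions are stripped without a code change.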

TV-003: HTTP 451 on CRITICAL Risk

{
  "test_id": "TV-003",
  "category": "headers",
  "conformance_level": "basic",
  "description": "Verify HTTP 451 returned when DPE produces CRITICAL risk and halt-on CRITICAL is set",
  "input": {
    "headers": { "CRP-Safety-Policy": "halt-on CRITICAL" },
    "body": { "messages": [{ "role": "user", "content": "Query designed to produce CRITICAL-risk response from test LLM" }] }
  },
  "assertions": [
    { "http_status": 451 },
    { "header_present": "CRP-Safety-Hallucination-Risk", "value": "CRITICAL" },
    { "header_present": "CRP-Safety-Retry-After" },
    { "header_present": "CRP-Compliance-Audit-Trail-URI" },
    { "body_json_field": "crp_halt_reason" }
  ]
}

TV-004: ETag Conditional Dispatch - 304

{
  "test_id": "TV-004",
  "category": "headers",
  "conformance_level": "standard",
  "description": "Verify HTTP 304 returned when CRP-Context-If-Match matches current CKF state and no facts have changed",
  "input": {
    "step_1": { "method": "POST", "path": "/v1/chat", "body": { "messages": [{ "role": "user", "content": "What is ISO 42001?" }] } },
    "step_2": { "method": "POST", "path": "/v1/chat", "headers": { "CRP-Context-If-Match": "<etag_from_step_1>" }, "body": { "messages": [{ "role": "user", "content": "What is ISO 42001?" }] } }
  },
  "assertions": [
    { "step_1_header_present": "CRP-Context-ETag" },
    { "step_2_http_status": 304 },
    { "step_2_header_present": "CRP-Context-ETag", "equals": "<same_as_step_1>" },
    { "step_2_header_present": "CRP-Context-Cache-Status", "value": "HIT" }
  ]
}

2.3 DPE Test Vectors

TV-010: Fabrication Detection

{
  "test_id": "TV-010",
  "category": "dpe",
  "conformance_level": "standard",
  "description": "Verify DPE detects a fabricated entity in an LLM response",
  "input": {
    "envelope_facts": [
      { "fact_id": "f1", "content": "The EU AI Act was adopted in 2024 by the European Parliament." },
      { "fact_id": "f2", "content": "The Act classifies AI systems into four risk levels." }
    ],
    "llm_response": "The EU AI Act was adopted in 2024. According to Commissioner Hans Müller, the Act classifies AI systems into four risk levels.",
    "note": "Hans Müller is a fabricated entity - no such commissioner exists in the envelope or as a known public figure."
  },
  "assertions": [
    { "header": "CRP-Safety-Fabrications", "value_gte": 1 },
    { "dpe_report_field": "fabrication_count", "value_gte": 1 },
    { "dpe_report_contains_entity": "Hans Müller" }
  ]
}

TV-011: Distortion Detection - Number Changed

{
  "test_id": "TV-011",
  "category": "dpe",
  "conformance_level": "standard",
  "description": "Verify DPE detects when a number from the source is changed in the response",
  "input": {
    "envelope_facts": [
      { "fact_id": "f1", "content": "Revenue increased by 15% in Q3 2025." }
    ],
    "llm_response": "Revenue increased by 25% in Q3 2025."
  },
  "assertions": [
    { "header": "CRP-Safety-Distortions", "contains": "NUMBER_CHANGED" },
    { "dpe_report_field": "distortion_count", "value_gte": 1 }
  ]
}
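To make the `NUMBER_CHANGED` case concrete, the toy comparison below extracts the numbers from a source fact and from the response and reports the ones that differ. It is not the DPE algorithm from CRP-SPEC-005; pairing numbers by position is a deliberate simplification, and the names `NUMBER` and `changed_numbers` are hypothetical.

```python
import re

# Integers or decimals, with an optional trailing percent sign.
NUMBER = re.compile(r"\d+(?:\.\d+)?%?")


def changed_numbers(source_fact: str, response: str) -> list[tuple[str, str]]:
    """Pair numbers by position and return the (source, response) pairs that differ."""
    source_numbers = NUMBER.findall(source_fact)
    response_numbers = NUMBER.findall(response)
    return [(s, r) for s, r in zip(source_numbers, response_numbers) if s != r]
```

Applied to the TV-011 input, the only differing pair is `("15%", "25%")`; the `3` in "Q3" and the year `2025` match in both texts.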

TV-012: Cross-Window Contradiction

{
  "test_id": "TV-012",
  "category": "dpe",
  "conformance_level": "standard",
  "description": "Verify DPE Stage 6 detects contradiction between current response and prior window",
  "input": {
    "prior_window_response": "The company's revenue declined by 3% in the fiscal year.",
    "current_response": "The company experienced strong revenue growth of 12% in the fiscal year."
  },
  "assertions": [
    { "header": "CRP-Safety-Contradictions", "contains": "cross-window" },
    { "dpe_report_field": "cross_window_contradictions", "length_gte": 1 }
  ]
}

TV-013: Repetition Detection

{
  "test_id": "TV-013",
  "category": "dpe",
  "conformance_level": "standard",
  "description": "Verify DPE Stage 7 detects severe repetition between windows",
  "input": {
    "prior_window_response": "The EU AI Act classifies AI systems into four risk levels: unacceptable, high, limited, and minimal. Each level carries different regulatory obligations.",
    "current_response": "The EU AI Act classifies AI systems into four risk levels: unacceptable, high, limited, and minimal. The obligations vary by level. Each risk category carries different regulatory requirements."
  },
  "assertions": [
    { "header": "CRP-Quality-Repetition", "contains": "SIGNIFICANT" },
    { "note": "Semantic overlap > 0.50 expected due to near-verbatim content" }
  ]
}

2.4 Safety Policy Test Vectors

TV-020: Policy Parsing - Valid

{
  "test_id": "TV-020",
  "category": "safety_policy",
  "conformance_level": "standard",
  "description": "Verify gateway correctly parses a complex Safety Policy",
  "input": {
    "header": "CRP-Safety-Policy: default-src context; halt-on CRITICAL; warn-on HIGH; require-grounding 0.75; block-ungrounded; upgrade-on-risk reflexive; report-uri https://comply.crprotocol.io/reports"
  },
  "assertions": [
    { "parsed_directive": "default-src", "value": ["context"] },
    { "parsed_directive": "halt-on", "value": "CRITICAL" },
    { "parsed_directive": "warn-on", "value": "HIGH" },
    { "parsed_directive": "require-grounding", "value": 0.75 },
    { "parsed_directive": "block-ungrounded", "value": true },
    { "parsed_directive": "upgrade-on-risk", "value": "reflexive" },
    { "parsed_directive": "report-uri", "value": "https://comply.crprotocol.io/reports" }
  ]
}

TV-021: Policy Parsing - Malformed Rejection

{
  "test_id": "TV-021",
  "category": "safety_policy",
  "conformance_level": "standard",
  "description": "Verify gateway rejects malformed Safety Policy with unknown directive",
  "input": {
    "header": "CRP-Safety-Policy: default-src context; halt-on CRITICAL; allow-hallucination"
  },
  "assertions": [
    { "http_status": 400 },
    { "body_contains": "unknown directive" },
    { "note": "allow-hallucination is not a valid directive - gateway MUST reject, not silently ignore" }
  ]
}
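The two parsing vectors above (TV-020 and TV-021) can be satisfied by a parser that splits the header value on `;` and rejects any directive it does not recognise. The sketch below is one possible shape, under stated assumptions: the directive sets contain only the names that appear in this document's vectors (CRP-SPEC-006 holds the authoritative list), and `PolicyError` and `parse_safety_policy` are hypothetical names.

```python
class PolicyError(ValueError):
    """Raised for a malformed CRP-Safety-Policy value."""


# Assumption: only directives seen in this document's test vectors are listed.
LIST_DIRECTIVES = {"default-src"}
STRING_DIRECTIVES = {"halt-on", "warn-on", "upgrade-on-risk", "report-uri"}
FLOAT_DIRECTIVES = {"require-grounding"}
FLAG_DIRECTIVES = {"block-ungrounded", "block-fabrication"}
KNOWN = LIST_DIRECTIVES | STRING_DIRECTIVES | FLOAT_DIRECTIVES | FLAG_DIRECTIVES


def parse_safety_policy(header_value: str) -> dict:
    """Parse the value of a CRP-Safety-Policy header into a directive map."""
    policy: dict = {}
    for part in header_value.split(";"):
        tokens = part.split()
        if not tokens:
            continue  # tolerate a trailing ';'
        name, args = tokens[0], tokens[1:]
        if name not in KNOWN:
            # Reject rather than silently ignore, as TV-021 requires.
            raise PolicyError(f"unknown directive: {name}")
        if name in LIST_DIRECTIVES:
            policy[name] = args
        elif name in FLAG_DIRECTIVES and not args:
            policy[name] = True
        elif name in STRING_DIRECTIVES and len(args) == 1:
            policy[name] = args[0]
        elif name in FLOAT_DIRECTIVES and len(args) == 1:
            try:
                policy[name] = float(args[0])
            except ValueError as exc:
                raise PolicyError(f"bad arguments for directive: {name}") from exc
        else:
            raise PolicyError(f"bad arguments for directive: {name}")
    return policy
```

The function takes the header value without the `CRP-Safety-Policy:` name prefix. A gateway built this way would map `PolicyError` to an HTTP 400 response whose body includes the error message.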

TV-022: Policy Inheritance - Tightening Accepted

{
  "test_id": "TV-022",
  "category": "safety_policy",
  "conformance_level": "full",
  "description": "Verify child agent can tighten parent's Safety Policy",
  "input": {
    "parent_policy": "halt-on CRITICAL; require-grounding 0.75",
    "child_policy": "halt-on HIGH; require-grounding 0.85; block-fabrication"
  },
  "assertions": [
    { "http_status": 200 },
    { "header": "CRP-Safety-Policy-Applied", "contains": "halt-on HIGH" }
  ]
}

TV-023: Policy Inheritance - Relaxation Rejected

{
  "test_id": "TV-023",
  "category": "safety_policy",
  "conformance_level": "full",
  "description": "Verify child agent cannot relax parent's Safety Policy",
  "input": {
    "parent_policy": "halt-on CRITICAL; require-grounding 0.75",
    "child_policy": "warn-on CRITICAL; require-grounding 0.50"
  },
  "assertions": [
    { "http_status": 403 },
    { "body_json_field": "error", "value": "safety_policy_inheritance_violation" }
  ]
}
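The difference between TV-022 and TV-023 comes down to whether the child policy is at least as strict as the parent on every directive the parent sets. The sketch below encodes one reading of that rule and is not the normative algorithm from CRP-SPEC-012 §4. It assumes the severity order LOW < MEDIUM < HIGH < CRITICAL, assumes the child policy is evaluated on its own rather than merged with the parent, and takes policies as already-parsed directive maps; `is_tightening` is a hypothetical name.

```python
# Assumed severity order; a lower halt-on level is the stricter one.
SEVERITY = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}


def is_tightening(parent: dict, child: dict) -> bool:
    """Return True if `child` is at least as strict as `parent`."""
    # halt-on: the child must halt at the same or a lower risk level.
    if "halt-on" in parent:
        if "halt-on" not in child:
            return False
        if SEVERITY[child["halt-on"]] > SEVERITY[parent["halt-on"]]:
            return False
    # require-grounding: the child's threshold must not be lower.
    if "require-grounding" in parent:
        if child.get("require-grounding", 0.0) < parent["require-grounding"]:
            return False
    # Flag directives (stored as True) set by the parent must be kept.
    for name, value in parent.items():
        if value is True and child.get(name) is not True:
            return False
    return True
```

Under these assumptions the TV-022 child passes (it lowers `halt-on` to HIGH and raises the grounding threshold), while the TV-023 child fails because it drops `halt-on` and lowers `require-grounding` to 0.50, which a gateway would report as HTTP 403.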

2.5 HMAC Chain Test Vectors

TV-030: Chain Verification - Valid

{
  "test_id": "TV-030",
  "category": "hmac",
  "conformance_level": "basic",
  "description": "Verify HMAC chain is valid across 3 windows",
  "input": {
    "session_hmac_key": "hex:0123456789abcdef...",
    "windows": [
      { "window_number": 1, "content_hash": "sha256:aaa...", "dpe_hash": "sha256:bbb...", "timestamp": "2026-05-25T10:00:00Z" },
      { "window_number": 2, "content_hash": "sha256:ccc...", "dpe_hash": "sha256:ddd...", "timestamp": "2026-05-25T10:01:00Z" },
      { "window_number": 3, "content_hash": "sha256:eee...", "dpe_hash": "sha256:fff...", "timestamp": "2026-05-25T10:02:00Z" }
    ]
  },
  "assertions": [
    { "window_1_hmac": "sha256:<computed>", "note": "Previous HMAC is empty string for root" },
    { "window_2_hmac": "sha256:<computed>", "note": "Chains from window_1_hmac" },
    { "window_3_hmac": "sha256:<computed>", "note": "Chains from window_2_hmac" },
    { "header": "CRP-Provenance-Chain-Integrity", "value": "VALID" }
  ]
}

TV-031: Chain Verification - Broken (Tampered Event)

{
  "test_id": "TV-031",
  "category": "hmac",
  "conformance_level": "basic",
  "description": "Verify chain integrity detects tampering when a window's content hash is modified after signing",
  "input": {
    "same_as": "TV-030",
    "modification": "window_2.content_hash changed from sha256:ccc... to sha256:zzz..."
  },
  "assertions": [
    { "header": "CRP-Provenance-Chain-Integrity", "value": "BROKEN" },
    { "note": "Window 2's recomputed HMAC will not match stored HMAC because content_hash was changed" }
  ]
}
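The chaining behaviour exercised by TV-030 and TV-031 can be sketched with Python's standard `hmac` module. The message layout below (previous HMAC, window number, content hash, DPE hash, and timestamp joined with `|`) is an assumption made for illustration; CRP-SPEC-011 §2 defines the real input format, so this code will not reproduce the `<computed>` values of a conforming implementation. All function names are hypothetical.

```python
import hashlib
import hmac


def window_hmac(key: bytes, previous_hmac: str, window: dict) -> str:
    """Compute one link of the chain from the previous link and the window fields."""
    # Assumed message layout; see CRP-SPEC-011 §2 for the authoritative one.
    message = "|".join([
        previous_hmac,
        str(window["window_number"]),
        window["content_hash"],
        window["dpe_hash"],
        window["timestamp"],
    ])
    digest = hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"sha256:{digest}"


def build_chain(key: bytes, windows: list[dict]) -> list[str]:
    """Return the HMAC for every window, each chained from the one before."""
    chain: list[str] = []
    previous = ""  # the root window chains from the empty string
    for window in windows:
        previous = window_hmac(key, previous, window)
        chain.append(previous)
    return chain


def chain_integrity(key: bytes, windows: list[dict], stored_hmacs: list[str]) -> str:
    """Recompute the chain and compare it with the stored HMACs."""
    recomputed = build_chain(key, windows)
    intact = len(recomputed) == len(stored_hmacs) and all(
        hmac.compare_digest(a, b) for a, b in zip(recomputed, stored_hmacs)
    )
    return "VALID" if intact else "BROKEN"
```

Because each link includes the previous HMAC, changing `content_hash` in window 2 after signing alters the recomputed values for windows 2 and 3, so the comparison against the stored chain yields `BROKEN`.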

2.6 Session Token Test Vectors

TV-040: Token Signature Validation - Valid

{
  "test_id": "TV-040",
  "category": "session",
  "conformance_level": "basic",
  "description": "Verify gateway accepts a correctly signed session token",
  "input": {
    "master_key": "hex:fedcba9876543210...",
    "session_id": "crp_sess_7f3a9bc2d4e1f083",
    "token_payload": { "v": "3.0.0", "sid": "crp_sess_7f3a9bc2d4e1f083", "win": 1, "exp": 9999999999 }
  },
  "assertions": [
    { "signature_valid": true },
    { "session_resumed": true }
  ]
}

TV-041: Token Expiry Rejection

{
  "test_id": "TV-041",
  "category": "session",
  "conformance_level": "basic",
  "description": "Verify gateway rejects an expired session token",
  "input": {
    "token_payload": { "exp": 1000000000 },
    "note": "exp is in the past"
  },
  "assertions": [
    { "http_status": 401 }
  ]
}
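This document does not define the session token wire format, so the following is only an illustration of the two checks that TV-040 and TV-041 exercise: signature validation and expiry. It assumes a token of the form `base64url(payload JSON) + "." + base64url(HMAC-SHA256 signature)` keyed directly with the master key; the real format and key derivation are specified elsewhere in the CRP specification set. `TokenError`, `sign_token`, and `verify_token` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


class TokenError(Exception):
    """Raised when a session token is malformed, forged, or expired."""


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(payload: dict, master_key: bytes) -> str:
    """Serialise the payload and append an HMAC-SHA256 signature (assumed format)."""
    body = _b64(json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8"))
    signature = hmac.new(master_key, body.encode("ascii"), hashlib.sha256).digest()
    return f"{body}.{_b64(signature)}"


def verify_token(token: str, master_key: bytes, now: Optional[float] = None) -> dict:
    """Return the payload if the signature is valid and `exp` is in the future."""
    body, _, signature = token.partition(".")
    try:
        expected = hmac.new(master_key, body.encode("ascii"), hashlib.sha256).digest()
        given = _unb64(signature)
    except ValueError as exc:
        raise TokenError("malformed token") from exc
    # Check the signature before trusting anything inside the payload.
    if not hmac.compare_digest(expected, given):
        raise TokenError("bad signature")
    payload = json.loads(_unb64(body))
    current = time.time() if now is None else now
    if payload.get("exp", 0) <= current:
        raise TokenError("expired")
    return payload
```

In a gateway, any `TokenError` would translate to the HTTP 401 response that TV-041 expects. The `now` parameter exists so that expiry can be tested deterministically.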

2.7 Safety Budget Test Vectors

TV-050: Budget Depletion Across Calls

{
  "test_id": "TV-050",
  "category": "agent",
  "conformance_level": "full",
  "description": "Verify safety budget decrements correctly across multiple calls with different risk levels",
  "input": {
    "calls": [
      { "risk_level": "LOW", "expected_budget_after": 1.00 },
      { "risk_level": "MEDIUM", "expected_budget_after": 0.95 },
      { "risk_level": "HIGH", "expected_budget_after": 0.80 },
      { "risk_level": "HIGH", "expected_budget_after": 0.65 },
      { "risk_level": "CRITICAL", "expected_budget_after": 0.30 },
      { "risk_level": "MEDIUM", "expected_budget_after": 0.25 }
    ]
  },
  "assertions": [
    { "call_6_header": "CRP-Agent-Safety-Budget", "value_lte": 0.25 },
    { "call_6_header": "CRP-Safety-Budget-Warning", "value": "caution" },
    { "call_6_header": "CRP-Safety-Oversight-Mode", "note": "Not yet forced to human-review (budget > 0.10)" }
  ]
}

TV-051: Budget Depletion → Forced Halt

{
  "test_id": "TV-051",
  "category": "agent",
  "conformance_level": "full",
  "description": "Verify session halts when safety budget depletes to ≤ 0.10",
  "input": {
    "budget_before_call": 0.12,
    "call_risk_level": "HIGH"
  },
  "assertions": [
    { "http_status": 451 },
    { "body_json_field": "crp_halt_reason", "value": "SAFETY_BUDGET_DEPLETED" },
    { "note": "Budget would be 0.12 - 0.15 = -0.03, which is ≤ 0.10 → halt" }
  ]
}
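The arithmetic behind TV-050 and TV-051 can be reproduced in a few lines. The per-call costs below (LOW 0.00, MEDIUM 0.05, HIGH 0.15, CRITICAL 0.35) are inferred from the `expected_budget_after` values in TV-050 rather than quoted from CRP-SPEC-012, and the halt threshold comes from TV-051. `Decimal` is used so that repeated subtraction does not accumulate binary floating-point error; `apply_call` is a hypothetical name.

```python
from decimal import Decimal

# Costs inferred from TV-050; CRP-SPEC-012 is the authoritative source.
RISK_COST = {
    "LOW": Decimal("0.00"),
    "MEDIUM": Decimal("0.05"),
    "HIGH": Decimal("0.15"),
    "CRITICAL": Decimal("0.35"),
}
HALT_THRESHOLD = Decimal("0.10")  # from TV-051: halt when budget <= 0.10


def apply_call(budget: Decimal, risk_level: str) -> tuple[Decimal, bool]:
    """Return (new_budget, halted) after one call at the given risk level."""
    new_budget = budget - RISK_COST[risk_level]
    return new_budget, new_budget <= HALT_THRESHOLD
```

Starting from `Decimal("1.00")` and applying the six TV-050 calls in order gives 1.00, 0.95, 0.80, 0.65, 0.30, and 0.25 with no halt. For TV-051, `apply_call(Decimal("0.12"), "HIGH")` returns a budget of -0.03 with `halted` set to true.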

3. Certification Program

3.1 CRP-Compliant Certification

The "CRP-Compliant" certification is a commercial program operated by AutoCyber AI Pty Ltd. Third-party AI products and platforms can obtain certification to demonstrate they implement the CRP protocol correctly.

3.2 Certification Levels

Level	Requires	Badge	Annual Fee
CRP-Basic Certified	Pass all Basic test vectors	"CRP-Basic Compliant"	$5,000
CRP-Standard Certified	Pass all Basic + Standard test vectors	"CRP-Standard Compliant"	$10,000
CRP-Full Certified	Pass all Basic + Standard + Full test vectors	"CRP-Full Certified Partner"	$25,000
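Because each certification level requires passing the vectors of all lower levels, a runner can select the applicable vectors with a simple cumulative filter. This is a sketch; `LEVEL_ORDER` and `vectors_for` are hypothetical names, and the vectors are assumed to be dictionaries in the format of Section 2.1.

```python
# Conformance levels from lowest to highest.
LEVEL_ORDER = ["basic", "standard", "full"]


def vectors_for(target_level: str, vectors: list[dict]) -> list[dict]:
    """Select every vector at or below the target conformance level."""
    allowed = LEVEL_ORDER[: LEVEL_ORDER.index(target_level) + 1]
    return [v for v in vectors if v["conformance_level"] in allowed]
```

For example, a CRP-Standard run over the vectors in this document would include TV-001 (basic) and TV-010 (standard) but skip TV-050 (full).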

3.3 Certification Process

1. Application: The vendor submits an application describing their implementation.
2. Self-Assessment: The vendor runs the CRP Conformance Test Suite against their implementation and submits the results.
3. Independent Verification: AutoCyber AI (or a designated audit partner) runs the test suite independently against the vendor's implementation.
4. Gap Remediation: Any failing test vectors are documented; the vendor has 90 days to remediate.
5. Certification Issued: Certification is issued once all test vectors for the target level pass.
6. Annual Renewal: The test suite is re-run annually; certification lapses if not renewed.

3.4 Test Suite Distribution

The test suite:
- Is published as an open-source test runner at github.com/crprotocol/conformance-tests
- Is runnable against any CRP-compatible HTTP endpoint
- Includes all test vectors defined in this document plus additional edge-case vectors
- Is updated with each CRP protocol version release

3.5 IETF Interoperability Requirement

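For IETF standards-track advancement, at least two independent implementations MUST pass the CRP-Standard conformance suite, and the test results MUST be published in the IETF implementation report. Note that BCP 9 formally requires two independent interoperating implementations at the Internet Standard level; it does not require them for Proposed Standard.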

4. Interoperability Testing

4.1 Cross-Implementation Tests

When two CRP implementations exist (e.g., the reference Python implementation and a third-party Go implementation):

Client A → Gateway B: Client using implementation A sends requests to gateway running implementation B. All test vectors must pass.
Session Token Relay: Token issued by Gateway A must be validatable by Gateway B (requires shared master key or compatible key derivation).
HMAC Chain Verification: Chain generated by Gateway A must be verifiable by Gateway B.
Safety Policy Portability: Policy parsed by Gateway A must produce identical enforcement behaviour in Gateway B.

4.2 Provider Compatibility Tests

CRP implementations MUST be tested against:
- OpenAI API (GPT-4o, GPT-4o-mini)
- Anthropic API (Claude Sonnet 4, Claude Haiku)
- Google Gemini API
- Ollama (local models)
- Azure OpenAI

Each provider test verifies:
- Axiom 4 compliance (no CRP headers leaked to the provider)
- Correct tokenizer selection for the target model
- Correct DPE operation on provider-specific response formats

5. Conformance Statement Template

Implementations MUST publish a conformance statement:

# CRP Conformance Statement

**Product:** [Product Name]
**Version:** [Version]
**CRP Protocol Version:** 3.0.0
**Conformance Level:** [Basic / Standard / Full]
**Test Suite Version:** [Version]
**Test Date:** [Date]
**Test Results:**
  - Basic vectors: [X/Y passed]
  - Standard vectors: [X/Y passed] (if applicable)
  - Full vectors: [X/Y passed] (if applicable)
**Known Deviations:** [List any test vectors that are not applicable or intentionally deviated from, with justification]
**Contact:** [Vendor contact for interoperability testing]

6. References

All CRP-SPEC-001 through CRP-SPEC-013, CRP-SPEC-015
IETF BCP 9 - The Internet Standards Process (interoperability requirement)
OASIS SARIF v2.1.0 - Test output format compatibility