CRP-SPEC-005: Decision Provenance Engine (DPE) Specification¶

Document: CRP-SPEC-005
Title: Context Relay Protocol (CRP) - Decision Provenance Engine
Version: 3.0.0
Status: Draft
Author: Constantinos Vidiniotis, AutoCyber AI Pty Ltd
Contact: contact@crprotocol.io
Repository: https://github.com/crprotocol/spec
Date: 2026-05-25
License: CC BY 4.0 (specification text)
Prerequisites: CRP-SPEC-001 (Core), CRP-SPEC-002 (Headers), CRP-SPEC-003 (Envelope), CRP-SPEC-004 (Continuation)

Abstract¶

This document specifies the Decision Provenance Engine (DPE) - CRP's post-generation analysis pipeline that runs after every LLM dispatch. The DPE performs claim segmentation, attribution analysis, fidelity verification, entailment scoring, fabrication detection, contradiction analysis, omission detection, and cross-window coherence verification. It produces the hallucination risk score, the provenance record, and the quality signals emitted as CRP response headers.

Critically, this document also specifies the Response Quality Assurance (RQA) subsystem - the mechanisms ensuring that multi-window responses are complete, non-redundant, coherently flowing, and stylistically consistent. The RQA is the layer that transforms a chain of independent LLM calls into what reads as a single, unified, high-quality response.

Table of Contents¶

Introduction
Pipeline Architecture
Stage 1: Claim Segmentation
Stage 2: Attribution Analysis
Stage 3: Fidelity Verification
Stage 4: Entailment Scoring
Stage 5: Risk Classification
Stage 6: Cross-Window Coherence
Stage 7: Repetition Detection
Stage 8: Completeness Verification
Stage 9: Flow & Continuity Analysis
Stage 10: Omission Detection
Stage 11: PII Detection
Stage 12: Regulatory Classification
Stage 13: Report Generation
Composite Risk Scoring Algorithm
Regulatory Amplifiers
Response Quality Assurance (RQA)
Re-Dispatch Protocol
Header Outputs
Security Considerations
References

1. Introduction¶

1.1 The Problem: LLMs Produce Unverified Output¶

An LLM generates text that appears authoritative but carries no intrinsic guarantee of accuracy, completeness, or consistency. Without post-generation verification:

Fabricated entities (names, dates, statistics) pass undetected
Source facts are distorted (numbers changed, negations flipped)
Responses contradict earlier windows in the same session
Multi-window responses repeat content verbatim
Continuation responses lack coherent flow - each window reads as an isolated answer
Material omissions go undetected - the model silently ignores critical context

1.2 The DPE Solution¶

The DPE is a 13-stage pipeline that runs after every LLM dispatch and before the response is released to the client. It produces:

A per-claim provenance record (which claims are grounded, which are fabricated)
A composite hallucination risk score (0.0 – 1.0, classified as CRITICAL/HIGH/MEDIUM/LOW)
A quality assessment covering coherence, repetition, completeness, and flow
A regulatory classification mapping the output to EU AI Act, GDPR, and NIST controls
The HMAC audit record for this window

1.3 What's New in v3.0¶

CRP v3.0 introduces the Response Quality Assurance (RQA) subsystem - Stages 6 through 9. These stages specifically address the quality problems that emerge in multi-window continuation:

Problem	Stage	What It Does
Cross-window contradictions	Stage 6	Detects when Window N contradicts Window N-1
Content repetition	Stage 7	Detects when Window N repeats content from earlier windows
Incomplete answers	Stage 8	Detects when the query was not fully answered across all windows
Incoherent flow	Stage 9	Detects when Window N reads as a disjointed fragment rather than a continuation

These stages transform CRP from a safety-verification protocol into a quality-assurance protocol - ensuring that what the user receives is not just safe but genuinely good.

2. Pipeline Architecture¶

LLM Response
     │
     ▼
┌──────────────────────────────────────────────────────┐
│  PROVENANCE STAGES (verify truthfulness)              │
│                                                        │
│  Stage 1: Claim Segmentation                          │
│  Stage 2: Attribution Analysis                        │
│  Stage 3: Fidelity Verification                       │
│       3a: Fabrication Detection                       │
│       3b: Distortion Detection                        │
│       3c: Contradiction Detection                     │
│  Stage 4: Entailment Scoring                          │
│  Stage 5: Risk Classification                         │
├──────────────────────────────────────────────────────┤
│  QUALITY STAGES (verify quality & coherence)          │
│  ★ NEW IN v3.0                                        │
│                                                        │
│  Stage 6: Cross-Window Coherence                      │
│  Stage 7: Repetition Detection                        │
│  Stage 8: Completeness Verification                   │
│  Stage 9: Flow & Continuity Analysis                  │
├──────────────────────────────────────────────────────┤
│  COMPLIANCE STAGES (classify & record)                │
│                                                        │
│  Stage 10: Omission Detection                         │
│  Stage 11: PII Detection                              │
│  Stage 12: Regulatory Classification                  │
│  Stage 13: Report Generation + HMAC                   │
└──────────────────────────────────────────────────────┘
     │
     ▼
CRP Response Headers + Audit Record

The pipeline is sequential. Each stage can trigger a re-dispatch if the computed risk or quality score exceeds the policy threshold and an upgrade-on-risk directive is set (see §19).

3. Stage 1: Claim Segmentation¶

3.1 Purpose¶

Decompose the LLM response into discrete factual claims for individual verification. A "claim" is an atomic assertion that can be independently evaluated as true or false.

3.2 Algorithm¶

Input: LLM response text
Output: List of Claim objects

Claim {
  claim_id:       string      // Unique identifier
  text:           string      // The claim text
  sentence_index: integer     // Position in response (sentence number)
  claim_type:     enum        // FACTUAL | OPINION | PROCEDURAL | META
  specificity:    float       // 0.0 – 1.0 (how specific/verifiable)
  entities:       Entity[]    // Named entities within the claim
}

3.3 Claim Types¶

Type	Definition	Verifiable?
`FACTUAL`	Assertion about the world that can be checked against evidence	Yes
`OPINION`	Subjective evaluation or judgment	No - flagged, not scored
`PROCEDURAL`	Instruction or process description	Partially - against source procedures
`META`	Statement about the response itself ("As mentioned above...")	Used for coherence analysis

3.4 Implementation¶

Claim segmentation uses an NLI-trained classifier to split sentences into atomic claims. Compound sentences ("X is true and Y happened in 2024") are decomposed into separate claims. The classifier is calibrated to over-segment rather than under-segment - it is better to check two simple claims than miss a fabrication in a compound sentence.

3.5 Output¶

claims_detected: 47
  FACTUAL: 34
  OPINION: 6
  PROCEDURAL: 5
  META: 2

Header output: CRP-Provenance-Claim-Count: 47

4. Stage 2: Attribution Analysis¶

4.1 Purpose¶

For each FACTUAL claim, determine whether it is grounded in the Context Envelope or drawn from the LLM's parametric memory.

4.2 Attribution Types¶

Type	Definition	Risk Contribution
`CONTEXT_GROUNDED`	Claim is semantically supported by ≥1 fact in the envelope	Low - this is the ideal
`PARAMETRIC`	Claim has no matching fact in envelope; drawn from model weights	Medium - may be accurate but unverifiable
`MIXED`	Claim combines envelope content with parametric additions	Medium - partial grounding
`UNVERIFIABLE`	Claim is specific (names a number, date, entity) but matches nothing	High - possible fabrication

4.3 Matching Algorithm¶

For each FACTUAL claim:

Compute claim embedding using the same model as CKF
Search the envelope facts for cosine similarity ≥ 0.75
If match found: classify as CONTEXT_GROUNDED
If no match but claim is generic (specificity < 0.30): classify as PARAMETRIC
If no match and claim is specific (specificity ≥ 0.30): classify as UNVERIFIABLE
If partial match (similarity 0.60 – 0.74): classify as MIXED

4.4 Grounding Percentage¶

grounding_pct = count(CONTEXT_GROUNDED) / count(FACTUAL claims)

4.5 Output¶

attribution_dominant: CONTEXT_GROUNDED
grounding_pct: 0.912
breakdown:
  CONTEXT_GROUNDED: 31 / 34
  PARAMETRIC: 1 / 34
  MIXED: 2 / 34
  UNVERIFIABLE: 0 / 34

Header outputs:

CRP-Safety-Attribution: CONTEXT_GROUNDED
CRP-Safety-Grounding-Pct: 0.912
CRP-Provenance-Attribution-Score: 0.912

5. Stage 3: Fidelity Verification¶

5.1 Purpose¶

For claims classified as CONTEXT_GROUNDED, verify that the claim accurately represents its source fact - not just that a source exists, but that the claim faithfully reproduces the source's content.

5.2 Sub-Stage 3a: Fabrication Detection¶

Definition: A fabrication is a named entity (person, organisation, date, statistic, citation, URL) in the response that has no basis in the envelope or any verifiable source.

Algorithm: 1. Extract all named entities from the response 2. For each entity, search the envelope for a matching entity 3. Entities not found in the envelope AND not identifiable as common-knowledge entities → flagged as fabricated 4. Numbers, dates, and statistics receive extra scrutiny - each must trace to a source fact

Output: CRP-Safety-Fabrications: 0

5.3 Sub-Stage 3b: Distortion Detection¶

Definition: A distortion is a claim that references a real source fact but misrepresents it.

Distortion types:

Type	Example
`NUMBER_CHANGED`	Source: "revenue was $4.2M" → Response: "revenue was $4.8M"
`NEGATION_FLIP`	Source: "the policy does not apply" → Response: "the policy applies"
`DATE_SHIFTED`	Source: "enacted in 2024" → Response: "enacted in 2023"
`ENTITY_SUBSTITUTED`	Source: "according to NIST" → Response: "according to ISO"
`MAGNITUDE_ALTERED`	Source: "increased by 15%" → Response: "increased by 50%"
`CONTEXT_STRIPPED`	Source: "conditionally approved pending review" → Response: "approved"

Algorithm: For each CONTEXT_GROUNDED claim, align it with its matched source fact. Apply NLI contradiction detection. For numeric claims, extract and compare values directly.

Output: CRP-Safety-Distortions: 1; types=NUMBER_CHANGED

5.4 Sub-Stage 3c: Contradiction Detection (Intra-Window)¶

Definition: An intra-window contradiction is two claims within the same response that contradict each other.

Algorithm: For every pair of FACTUAL claims, run NLI entailment/contradiction classification. Pairs classified as CONTRADICTION → flagged.

Output: Contributes to CRP-Safety-Contradictions

5.5 Fidelity Score¶

fidelity_score = 1.0 - (
  (fabrication_count × 0.30) +
  (distortion_count × 0.20) +
  (contradiction_count × 0.15)
) / max(1, factual_claim_count)

Clamped to [0.0, 1.0].

Header output: CRP-Provenance-Fidelity-Score: 0.978

6. Stage 4: Entailment Scoring¶

6.1 Purpose¶

Measure the degree to which the entire response is logically entailed by the Context Envelope, using an NLI cross-encoder model.

6.2 Algorithm¶

Concatenate all envelope facts into a single premise document
Treat the LLM response as the hypothesis
Run an NLI cross-encoder (e.g., DeBERTa-v3-large-MNLI) to produce:
P(entailment)
P(neutral)
P(contradiction)
Entailment score = P(entailment)

For responses longer than the cross-encoder's max sequence length, chunk the response and compute per-chunk entailment scores, then take the minimum (weakest-link principle).

6.3 Output¶

entailment_score: 0.912
contradiction_probability: 0.023
neutral_probability: 0.065

Header output: CRP-Safety-Entailment-Score: 0.912

7. Stage 5: Risk Classification¶

7.1 Purpose¶

Compute the composite hallucination risk score from the four DPE signals and classify into CRITICAL/HIGH/MEDIUM/LOW.

7.2 Composite Score Formula¶

The composite hallucination risk is a weighted linear combination of four normalised DPE signals:

attribution_signal  = 1.0 - grounding_pct          (signal: claims not grounded in envelope)
fidelity_signal     = 1.0 - fidelity_score          (signal: low semantic alignment to sources)
entailment_signal   = 1.0 - entailment_score        (signal: NLI contradiction with sources)
specificity_signal  = unverifiable_pct              (signal: claims that cannot be verified)

composite = w₁ × attribution_signal
          + w₂ × fidelity_signal
          + w₃ × entailment_signal
          + w₄ × specificity_signal

Where unverifiable_pct = count(UNVERIFIABLE) / max(1, count(FACTUAL)).

Implementation note. The reference weights (w₁, w₂, w₃, w₄) are calibrated against the CRP benchmark corpus to maximise alignment with human-graded hallucination labels. Exact reference weights are not normative and are part of the reference implementation calibration set; conformant implementations MAY tune these weights against their own evaluation data provided they (a) sum to 1.0, (b) are documented in the conformance report, and (c) satisfy CRP-SPEC-014 §3 calibration criteria.

7.3 Regulatory Amplifiers¶

After computing the base composite score, regulatory amplifiers are applied to reflect heightened risk in regulated domains (see §17):

Amplifier trigger	Effect	Rationale
GDPR PII detected	Score uplift (substantial)	Article 22 - automated decision risk
EU AI Act high-risk domain (Annex III)	Score uplift (significant)	Articles 9–15 high-risk requirements
Financial or medical domain	Score uplift (moderate)	Sectoral regulator risk thresholds
Agentic loop depth exceeds threshold	Score uplift (moderate)	CRP-SPEC-012 multi-agent safety
Cross-window contradiction detected	Score uplift (moderate)	Session-level coherence violation

The final composite is capped at 1.0. The specific multiplier values for each amplifier are part of the reference implementation calibration set and are not normative; conformant implementations MAY tune these values provided they preserve the relative ordering of severities and document the configuration in the conformance report.

7.4 Classification Thresholds¶

Risk Level	Score Range	Protocol Action
`CRITICAL`	≥ 0.70	HTTP 451 if `halt-on CRITICAL`; report-uri webhook
`HIGH`	≥ 0.45	Strategy upgrade if `upgrade-on-risk`; safety budget -0.15
`MEDIUM`	≥ 0.20	Pass with warning headers; budget -0.05
`LOW`	< 0.20	Pass clean; budget unchanged

7.5 Output¶

CRP-Safety-Hallucination-Risk: LOW
CRP-Safety-Hallucination-Score: 0.14

8. Stage 6: Cross-Window Coherence¶

8.1 Purpose¶

★ NEW IN v3.0. Detect contradictions between the current response and prior windows in the session. A response that says "revenue increased 20%" when a previous window said "revenue decreased 5%" is a cross-window contradiction that single-window DPE cannot catch.

8.2 Algorithm¶

Input: Current response + summary/content of all prior windows in the session (stored in the WindowDAG).

Step 6.1 - Extract key assertions: From the current response, extract all FACTUAL claims that reference entities, numbers, dates, or states also mentioned in prior windows.

Step 6.2 - Cross-reference: For each such claim, find the corresponding claim in the prior window(s). If the prior window's claim was summarised (per CRP-SPEC-004 §12), use the summary - the summary was verified for fidelity against the original.

Step 6.3 - NLI contradiction check: Run NLI on each pair (current_claim, prior_claim). If P(contradiction) > 0.75 → flag as cross-window contradiction.

Step 6.4 - Severity classification:

Contradiction Type	Severity	Example
Numeric contradiction	CRITICAL	W1: "$4.2M revenue" vs W2: "$5.1M revenue"
Logical contradiction	HIGH	W1: "policy approved" vs W2: "policy rejected"
Temporal contradiction	HIGH	W1: "launched in 2024" vs W2: "launched in 2025"
Stance contradiction	MEDIUM	W1: "generally positive" vs W2: "largely negative"
Minor inconsistency	LOW	W1: "about 15%" vs W2: "approximately 16%"

8.3 Impact on Risk Score¶

Cross-window contradictions apply a regulatory amplifier to the composite score (see §17). If a CRITICAL-severity contradiction is detected, the response is flagged for re-dispatch regardless of the overall composite score.

8.4 Output¶

CRP-Safety-Contradictions: 1; scope=cross-window

If contradictions found, the DPE report includes: - The contradicting claims (current and prior) - The window IDs involved - The contradiction severity

9. Stage 7: Repetition Detection¶

9.1 Purpose¶

★ NEW IN v3.0. Detect when the current response repeats content from prior windows - verbatim or near-verbatim. Repetition wastes the user's context and indicates the LLM is not building on prior output but re-generating it.

9.2 Algorithm¶

Step 7.1 - N-gram overlap: Compute 4-gram overlap between the current response and all prior window responses in the session.

overlap_ratio = count(shared_4grams) / count(current_4grams)

Step 7.2 - Semantic overlap: Compute sentence-level cosine similarity between each sentence in the current response and all sentences in prior responses.

for each sentence_current in current_response:
    max_sim = max(cosine_sim(sentence_current, sentence_prior) for sentence_prior in prior_responses)
    if max_sim > 0.92:
        repeated_sentences.append(sentence_current)

semantic_repetition_ratio = len(repeated_sentences) / len(current_sentences)

Step 7.3 - Classification:

Repetition Level	N-gram Overlap	Semantic Overlap	Action
`NONE`	< 0.05	< 0.10	Pass
`MINOR`	0.05 – 0.15	0.10 – 0.25	Log warning
`SIGNIFICANT`	0.15 – 0.30	0.25 – 0.50	Flag in quality report; recommend re-dispatch
`SEVERE`	> 0.30	> 0.50	Trigger re-dispatch with anti-repetition instruction

9.3 Anti-Repetition Re-Dispatch¶

When SEVERE repetition is detected and upgrade-on-risk is set in the Safety Policy, the gateway triggers re-dispatch with an augmented system prompt:

System prompt addition:
"IMPORTANT: The following content has already been provided to the user in prior responses.
DO NOT repeat this content. Build upon it, extend it, or address different aspects.

[Prior response summary - flagged repeated sections marked]

Continue from where the prior response ended. Provide NEW information only."

The re-dispatched response goes through the full DPE pipeline again, including another repetition check.

9.4 Output¶

A new response header for quality:

CRP-Quality-Repetition: NONE
CRP-Quality-Repetition: SIGNIFICANT; overlap=0.28

10. Stage 8: Completeness Verification¶

10.1 Purpose¶

★ NEW IN v3.0. Verify that the response (or the cumulative response across all windows in the session) actually answers the user's query - completely. An LLM may produce a confident, well-grounded response that addresses only half the question.

10.2 Algorithm¶

Step 8.1 - Query decomposition: Decompose the user's query into constituent information needs (same decomposition used in CRP-SPEC-003 §5.2).

Query: "Compare EU AI Act and GDPR approaches to automated decision-making 
        and list the compliance requirements for each"

Sub-queries:
  1. EU AI Act approach to automated decision-making
  2. GDPR approach to automated decision-making
  3. Comparison between the two approaches
  4. EU AI Act compliance requirements
  5. GDPR compliance requirements

Step 8.2 - Coverage check: For each sub-query, check whether the response (cumulative across all session windows) contains content that addresses it.

for each sub_query:
    max_relevance = max(
        cosine_sim(sub_query_embedding, sentence_embedding)
        for sentence in all_session_responses
    )
    if max_relevance > 0.70:
        covered.append(sub_query)
    else:
        uncovered.append(sub_query)

completeness_score = len(covered) / len(sub_queries)

Step 8.3 - Classification:

Completeness	Score	Action
`COMPLETE`	≥ 0.90	Pass
`MOSTLY_COMPLETE`	0.70 – 0.89	Log; if continuation windows remain, recommend continuation
`PARTIAL`	0.40 – 0.69	Flag in quality report; recommend continuation or broader query
`INCOMPLETE`	< 0.40	If continuation windows remain: auto-continue. If exhausted: warn client

10.3 Auto-Continuation on Incompleteness¶

When completeness is PARTIAL or INCOMPLETE AND the session has remaining continuation windows AND CRP-Context-Cache does not contain no-cache:

Gateway automatically issues a continuation window focused on the uncovered sub-queries
The continuation's envelope prioritises facts related to uncovered sub-queries
The system prompt includes: "The prior response addressed [covered topics]. Now address: [uncovered topics]."
This auto-continuation is transparent to the client - they receive the combined response

The auto-continuation is logged as a COMPLETENESS_CONTINUATION event in the audit trail.

10.4 Output¶

CRP-Quality-Completeness: COMPLETE; sub-queries=5/5
CRP-Quality-Completeness: PARTIAL; sub-queries=3/5; uncovered=eu-ai-act-requirements,gdpr-requirements

11. Stage 9: Flow & Continuity Analysis¶

11.1 Purpose¶

★ NEW IN v3.0. Ensure that multi-window responses read as a single coherent document, not as N independent answers stapled together. This is the hardest quality problem in multi-window AI and the area where CRP provides the most novel contribution.

11.2 The Flow Problem¶

Without flow analysis, a 3-window response might read:

[Window 1]: "The EU AI Act classifies AI systems into four risk levels..."
[Window 2]: "The EU AI Act introduces a risk-based approach to AI regulation. 
             Systems are classified into four categories..."  ← REPEATS W1
[Window 3]: "In conclusion, the EU AI Act..."  ← ABRUPT, no bridge from W2

With flow analysis, it should read:

[Window 1]: "The EU AI Act classifies AI systems into four risk levels..."
[Window 2]: "Building on this classification framework, the Act defines specific 
             obligations for each risk level..."  ← EXTENDS W1
[Window 3]: "These obligations translate into concrete compliance requirements. 
             Specifically..."  ← BRIDGES naturally from W2

11.3 Flow Signals¶

The DPE evaluates four flow signals for continuation windows (window_number > 1):

Signal 9a - Opening coherence: Does the response's opening sentence connect to the prior window's closing content?

opening_coherence = cosine_sim(
    current_response.first_2_sentences.embedding,
    prior_response.last_2_sentences.embedding
)

A low opening coherence (< 0.40) indicates the response starts as if the prior window didn't exist.

Signal 9b - Topic continuity: Does the response maintain or logically extend the topic trajectory?

prior_topics = extract_topics(prior_response)  // Top-5 topic clusters
current_topics = extract_topics(current_response)
topic_continuity = jaccard_similarity(prior_topics, current_topics)

A topic_continuity of 0.0 indicates a complete topic shift (may be valid for a new sub-query; context-dependent).

Signal 9c - Register consistency: Does the response maintain the same formality, style, and person as the prior window?

register_signals:
  - sentence_length_mean (current vs prior - should be within 20%)
  - vocabulary_complexity (current vs prior - should be within 15%)
  - passive_voice_ratio (current vs prior - should be within 10%)
  - personal_pronoun_usage (current vs prior - should match)

Large register shifts indicate stylistic incoherence.

Signal 9d - Transitional markers: Does the response contain explicit continuation markers ("Furthermore", "Building on this", "As noted previously", "The next aspect")?

has_transition = any(
    sentence starts with transitional_phrase
    for sentence in current_response.first_3_sentences
)

11.4 Flow Score¶

flow_score = (
    opening_coherence × 0.35 +
    topic_continuity × 0.25 +
    register_consistency × 0.20 +
    transition_presence × 0.20
)

11.5 Flow Remediation¶

When flow_score < 0.50 and the session has used continuation:

Option A - Prompt augmentation (default): Re-dispatch with a system prompt that explicitly instructs continuation flow:

System prompt addition:
"This is a continuation of a prior response. The prior response ended with:

'[last 3 sentences of prior response]'

Continue naturally from that point. Do NOT re-introduce the topic.
Do NOT repeat content from the prior response.
Use transitional language to bridge from the prior content.
Maintain the same style, formality, and level of detail."

Option B - Post-processing stitch (if re-dispatch would exceed budget): The gateway generates a 1-2 sentence bridge between the prior window's end and the current window's beginning, using a dedicated "stitch call" with temperature 0.2:

Stitch prompt:
"Write a 1-2 sentence bridge connecting these two passages:

PASSAGE A ends with: '[last 2 sentences of Window N-1]'
PASSAGE B begins with: '[first 2 sentences of Window N]'

The bridge should create a natural transition. Be concise."

The stitch is inserted between the windows. The stitch call is logged as a FLOW_STITCH event.

11.6 Output¶

CRP-Quality-Flow: 0.82
CRP-Quality-Flow: 0.34; remediation=prompt-augmented

12. Stage 10: Omission Detection¶

12.1 Purpose¶

Detect when the Context Envelope contained critical information that the LLM ignored. An omission is different from incompleteness (Stage 8) - omission means the relevant facts were in the envelope but the model didn't use them. Incompleteness means the query wasn't fully answered.

12.2 Algorithm¶

For each fact in the envelope with importance_weight ≥ 0.70:

Search the response for semantic coverage of this fact
If no sentence in the response has cosine similarity ≥ 0.60 with the fact → flag as omission
Classify omission severity based on importance_weight:
≥ 0.90 → CRITICAL omission
≥ 0.80 → HIGH omission
≥ 0.70 → MEDIUM omission

12.3 Output¶

CRP-Safety-Omissions: CRITICAL:0, HIGH:1, MEDIUM:2

13. Stage 11: PII Detection¶

13.1 Purpose¶

Detect personal data (as defined under GDPR Art. 4(1)) in both the request prompt and the LLM response.

13.2 Detection Categories¶

Category	Pattern	Examples
Direct identifiers	Named entity recognition	Full names, email addresses, phone numbers
Quasi-identifiers	Pattern matching + context	Dates of birth, addresses, employee IDs
Sensitive data	Classifier	Health data, political opinions, ethnic origin
Financial data	Pattern matching	Credit card numbers, bank accounts, salaries

13.3 Output¶

CRP-Compliance-GDPR-PII: true

When true, the gateway also checks whether CRP-Context-Cache: no-store is set. If not, a compliance warning is emitted.

14. Stage 12: Regulatory Classification¶

14.1 Purpose¶

Classify the AI call against regulatory frameworks based on the registered AI system type, deployment domain, and the DPE analysis results.

14.2 EU AI Act Classification¶

Based on the system's registered risk category (set during gateway configuration):

Registered Domain	Default Class	Override Conditions
Biometric identification	UNACCEPTABLE	-
Employment/recruitment	HIGH	-
Education access	HIGH	-
Critical infrastructure	HIGH	-
General information	LIMITED	Upgrades to HIGH if PII detected in decisions
Creative/entertainment	MINIMAL	-

14.3 Output¶

CRP-Compliance-EU-AI-Act: LIMITED
CRP-Compliance-NIST-Tier: TIER-2
CRP-Compliance-ISO-42001: A.6.1.2, A.9.4
CRP-Compliance-Controls-Met: 33/35

15. Stage 13: Report Generation¶

15.1 Purpose¶

Generate the complete DPE provenance report and the HMAC audit record for this window.

15.2 Report Contents¶

DPE_Report {
  session_id:               string
  window_id:                string
  window_number:            integer
  timestamp:                ISO 8601

  // Provenance
  claims_total:             integer
  claims_by_type:           map<string, integer>
  attribution_dominant:     string
  grounding_pct:            float
  fabrication_count:        integer
  distortion_count:         integer
  distortion_types:         string[]
  contradiction_count:      integer
  contradiction_scope:      string
  omission_summary:         map<string, integer>
  fidelity_score:           float
  entailment_score:         float

  // Quality (NEW v3.0)
  repetition_level:         string
  repetition_overlap:       float
  completeness_score:       float
  uncovered_sub_queries:    string[]
  flow_score:               float
  flow_remediation:         string | null
  cross_window_contradictions: Contradiction[]

  // Risk
  composite_score:          float
  amplifiers_applied:       string[]
  risk_level:               string

  // Compliance
  eu_ai_act_class:          string
  nist_tier:                string
  gdpr_pii_detected:        boolean
  iso_42001_controls:       string[]

  // Chain
  hmac:                     string
  chain_integrity:          string

  // Hashes for verification
  response_content_hash:    string
  report_hash:              string
}

15.3 HMAC Generation¶

See CRP-SPEC-004 §9 and CRP-SPEC-011 for the complete HMAC chain algorithm.

15.4 Report Storage¶

The report is: 1. Stored in the gateway's audit trail database 2. Streamed to CRP Comply (if configured) 3. Referenced via CRP-Provenance-Report-URI header 4. Included in the HMAC chain via its hash

16. Composite Risk Scoring Algorithm¶

16.1 Reference Algorithm (Non-Normative)¶

The following pseudocode illustrates the structure of the composite scoring algorithm. The numeric weight values (w₁..w₄) and amplifier factors are calibration parameters of the CRP reference implementation and are not normative - conformant implementations MAY tune them per CRP-SPEC-014 §3.

# Stage outputs (0.0 = best, 1.0 = worst for signal inputs)
attribution_signal  = 1.0 - grounding_pct
fidelity_signal     = 1.0 - fidelity_score
entailment_signal   = 1.0 - entailment_score
specificity_signal  = unverifiable_count / max(1, factual_count)

# Weighted linear combination (weights configured per deployment, must sum to 1.0)
base_composite = (
    attribution_signal  * w_attribution
  + fidelity_signal     * w_fidelity
  + entailment_signal   * w_entailment
  + specificity_signal  * w_specificity
)

# Apply regulatory amplifiers (§17) - factors are deployment-configurable
amplified = base_composite
for amplifier in applicable_amplifiers:
    amplified *= amplifier.factor

# Quality penalties (CRP v3.0+) - factors are deployment-configurable
if cross_window_contradiction_count > 0:
    amplified *= q_cross_window
if repetition_level == "SEVERE":
    amplified *= q_repetition

composite = min(amplified, 1.0)

# Classify (thresholds are normative - see §7.4)
if   composite >= 0.70: risk = "CRITICAL"
elif composite >= 0.45: risk = "HIGH"
elif composite >= 0.20: risk = "MEDIUM"
else:                    risk = "LOW"

16.2 Signal Ordering Rationale¶

The relative ordering of signal importance reflects empirical analysis of LLM failure modes against the CRP benchmark corpus:

Attribution is the strongest predictor of hallucination - an ungrounded response is the primary failure mode and receives the highest weight.
Fidelity ranks second - distortion (misrepresenting real sources) is more dangerous than mere unverifiability.
Entailment ranks equal-second - direct logical contradiction between context and response is a high-confidence failure signal.
Specificity receives the lowest weight but catches a distinctive failure class (fabricated citations and invented specifics) that the other signals can miss.

16.3 Weight Configuration (Normative)¶

Conformant CRP gateways:

MUST document the weight vector and amplifier factors in use.
MUST ensure w_attribution + w_fidelity + w_entailment + w_specificity == 1.0.
MUST publish the weight configuration in the conformance report (CRP-SPEC-014 §3).
MUST preserve the relative ordering of signal importance as described in §16.2.

The exact reference values used by the CRP reference implementation are part of the calibrated configuration distributed with the reference implementation; they are not republished in this specification to discourage premature copy-paste deployment without site-specific calibration.

17. Regulatory Amplifiers¶

Amplifiers increase the composite score when the regulatory context raises the stakes of a given risk level. The direction and rationale below are normative; the exact multiplier values are calibration parameters of the reference implementation and are not normative.

Condition	Direction	Justification
`CRP-Compliance-GDPR-PII: true`	Substantial uplift	PII in AI outputs raises GDPR Art. 5(1)(d) accuracy obligation
EU AI Act HIGH-risk domain	Significant uplift	HIGH-risk systems have stricter Art. 9 risk management requirements
Financial or medical domain	Moderate uplift	Sector-specific accuracy regulations (MiFID II, MDR)
`CRP-Agent-Loop-Depth` above threshold	Moderate uplift	Deeper agent chains compound error risk
Cross-window contradiction	Moderate uplift	Contradictions across windows indicate systemic generation failure
`CRP-Quality-Repetition: SEVERE`	Light uplift	Severe repetition indicates model degradation

Conformant implementations MUST document the specific amplifier factors in use in the conformance report and MUST preserve the relative ordering shown above.

18. Response Quality Assurance (RQA)¶

18.1 Overview¶

The RQA subsystem (Stages 6–9) ensures that multi-window CRP responses meet production quality standards. It addresses the four failure modes unique to continuation:

Failure Mode	Stage	Signal	Remediation
Cross-window contradiction	6	NLI contradiction score	Re-dispatch with context correction
Content repetition	7	N-gram + semantic overlap	Re-dispatch with anti-repetition prompt
Incomplete answer	8	Sub-query coverage	Auto-continuation to uncovered topics
Incoherent flow	9	Opening coherence + transitions	Prompt augmentation or stitch insertion

18.2 RQA as a Composite Quality Score¶

A new composite quality score (separate from the safety risk score):

quality_score = (
    (1.0 - repetition_ratio)     × 0.25 +
    completeness_score            × 0.35 +
    flow_score                    × 0.25 +
    coherence_score               × 0.15    // 1.0 - contradiction_ratio
)

18.3 Quality Score Header¶

CRP-Quality-Score: 0.91

This is a new response header - distinct from CRP-Safety-Hallucination-Risk. Safety measures truthfulness. Quality measures usefulness. Both matter.

18.4 Quality Tier Interaction¶

The quality score can downgrade the CRP-Context-Quality-Tier:

Quality Score	Maximum Quality Tier
≥ 0.85	S (no restriction)
0.70 – 0.84	A
0.50 – 0.69	B
0.30 – 0.49	C
< 0.30	D

If the envelope quality tier is A but the RQA quality score is 0.62, the emitted tier is downgraded to B.

18.5 The Stitching Pipeline (Complete Flow)¶

For a multi-window response, the complete quality pipeline is:

Window N response received from LLM
     │
     ▼
[Stage 6] Cross-window coherence check
     │── Contradiction found? → Flag, apply amplifier, consider re-dispatch
     ▼
[Stage 7] Repetition detection
     │── SEVERE repetition? → Re-dispatch with anti-repetition prompt
     ▼
[Stage 8] Completeness verification
     │── PARTIAL/INCOMPLETE? → Auto-continue to uncovered sub-queries
     ▼
[Stage 9] Flow analysis
     │── Low flow score? → Insert stitch OR re-dispatch with flow prompt
     ▼
Quality Score computed → Quality Tier adjusted
     │
     ▼
Response released to client (with quality + safety headers)

19. Re-Dispatch Protocol¶

19.1 When Re-Dispatch Is Triggered¶

Re-dispatch occurs when: 1. CRP-Safety-Policy contains upgrade-on-risk <strategy> AND risk level exceeds threshold 2. SEVERE repetition detected (Stage 7) 3. CRITICAL cross-window contradiction detected (Stage 6) 4. Flow score < 0.30 and prompt augmentation is configured (Stage 9)

19.2 Re-Dispatch Parameters¶

Parameter	Adjustment
Temperature	Reduced to 0.2 (from default 0.7)
Strategy	Upgraded to `reflexive` (or strategy named in `upgrade-on-risk`)
System prompt	Augmented with specific remediation instructions
Seed	Changed (different from original dispatch) to avoid regenerating same output
Max re-dispatches	2 per window (prevent infinite loops)

19.3 Re-Dispatch Accounting¶

Re-dispatches: - Count as additional LLM calls (token cost) - Are billed at CRP Gateway usage rates - Are logged as RE_DISPATCH events in the audit trail - Do NOT decrement the safety budget (they are remediation, not new risk) - The final response's DPE analysis is the one that determines headers

20. Header Outputs¶

Complete list of headers populated by the DPE pipeline:

Stage	Header
1	`CRP-Provenance-Claim-Count`
2	`CRP-Safety-Attribution`, `CRP-Safety-Grounding-Pct`, `CRP-Provenance-Attribution-Score`
3a	`CRP-Safety-Fabrications`
3b	`CRP-Safety-Distortions`
3c/6	`CRP-Safety-Contradictions`
4	`CRP-Safety-Entailment-Score`, `CRP-Provenance-Fidelity-Score`
5	`CRP-Safety-Hallucination-Risk`, `CRP-Safety-Hallucination-Score`
7	`CRP-Quality-Repetition` ★ NEW
8	`CRP-Quality-Completeness` ★ NEW
9	`CRP-Quality-Flow` ★ NEW
RQA	`CRP-Quality-Score` ★ NEW
10	`CRP-Safety-Omissions`
11	`CRP-Compliance-GDPR-PII`
12	`CRP-Compliance-EU-AI-Act`, `CRP-Compliance-NIST-Tier`, `CRP-Compliance-ISO-42001`, `CRP-Compliance-Controls-Met`
13	`CRP-Provenance-HMAC`, `CRP-Provenance-Chain-Integrity`, `CRP-Provenance-Report-URI`

21. Security Considerations¶

21.1 DPE as a Trust Boundary¶

The DPE runs within the CRP gateway and its outputs (headers) are authoritative. If the DPE is compromised, all safety signals become unreliable. Mitigations: - DPE code MUST be integrity-checked on gateway startup - DPE outputs MUST be included in the HMAC chain (report hash is part of the HMAC input) - DPE models (NLI cross-encoder, NER model) MUST be pinned to specific versions and hash-verified

21.2 Adversarial LLM Responses¶

An LLM that has been prompt-injected may attempt to produce output that evades DPE detection. Mitigations: - NLI entailment scoring is resistant to stylistic manipulation - it measures logical relationship, not surface features - Fabrication detection uses entity extraction + envelope search, not pattern matching - harder to evade - The DPE pipeline runs on the complete response, not on a model-selected excerpt

21.3 DPE Scoring Weights as Proprietary Information¶

The default scoring weights are published in this specification. However, production CRP deployments MAY use custom weights calibrated to their specific domain. Custom weights SHOULD be treated as proprietary and SHOULD NOT be exposed in headers or API responses.

22. References¶

Normative References¶

CRP-SPEC-001 - Core Protocol Specification
CRP-SPEC-002 - Header Field Specification
CRP-SPEC-003 - Context Envelope & Packing
CRP-SPEC-004 - Window Continuation & DAG
CRP-SPEC-006 - Safety Policy Directive Language
CRP-SPEC-011 - Audit Trail & HMAC Chain

Informative References¶

Williams, A., Nangia, N., & Bowman, S.R. (2018). "A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference" (MNLI)
He, P., Gao, J., & Chen, W. (2021). "DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing"
EU Regulation 2024/1689 (EU AI Act)
GDPR Art. 4(1) - Definition of personal data
NIST AI RMF 1.0 - MEASURE function