Context Envelope¶
The context envelope is CRP's core innovation — a dynamically packed payload that fills all available space in the LLM's context window with the most relevant facts.
Maximum Context Saturation¶
$$E_{\max} = C - S - T - G$$
| Symbol | Meaning | Example (128K) |
|---|---|---|
| $C$ | Context window size | 131,072 |
| $S$ | System prompt tokens | ~500 |
| $T$ | Task input tokens | ~8,756 |
| $G$ | Generation reserve | 16,384 |
| $E_{\max}$ | Envelope capacity | 105,432 |
$G$ is determined automatically, in order of precedence: the user's max_output_tokens, then the provider's reported maximum, then min(C // 4, 16384) as a fallback.
In practice
CRP achieves 0.939–1.021 saturation (mean 0.994) — virtually every available token is used for relevant context.
Envelope Sections¶
The envelope is structured with 11 priority-ordered sections. Higher-priority sections survive when space is limited:
| Priority | Section | Tokens | Purpose |
|---|---|---|---|
| 1 | Critical State | 100–500 | GOAL, PHASE, BLOCKER, CONSTRAINT, WINDOW |
| 2 | LLM Synthesis | Adaptive | LLM's own curated understanding |
| 3 | Task Brief | Varies | What to do + output format |
| 4 | Discoveries | Bulk | Atomic facts with graph edges |
| 5 | Source Passages | Variable | Verbatim text for high-relevance facts |
| 6 | Decisions & Plan | Variable | Reasoning trail with justifications |
| 7 | Error Log | Small | What failed and why |
| 8 | Tool History | Small | Compact execution summaries |
| 9 | Expanded Context | Overflow | Full-fidelity data from warm state |
| 10 | CKF Retrievals | Variable | Cross-session knowledge |
| 11 | Reasoning Scaffold | Small | Step-by-step templates (weak models) |
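The survival rule for priority-ordered sections can be sketched as a greedy walk: keep each section in priority order while it fits the remaining budget, so lower-priority sections are the first to be dropped. Section names and token counts below are illustrative, not CRP's internal representation:

```python
def pack_sections(sections, budget):
    """Walk sections in priority order (1 = highest) and keep those
    that fit the remaining token budget; later (lower-priority)
    sections are dropped first when space runs out."""
    kept, used = [], 0
    for priority, name, tokens in sorted(sections):
        if used + tokens <= budget:
            kept.append(name)
            used += tokens
    return kept, used

sections = [
    (1, "critical_state", 300),
    (2, "llm_synthesis", 2_000),
    (4, "discoveries", 60_000),
    (9, "expanded_context", 50_000),
]
print(pack_sections(sections, budget=70_000))
# expanded_context (priority 9) no longer fits and is dropped
```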
Fact Selection Algorithm¶
CRP uses a 3-phase pipeline to select which facts go into the envelope:
Phase 1: Multi-Aspect Task Decomposition¶
The task is broken into noun phrases / aspects. A fact matching any aspect scores high:
$$\text{score}(f) = \max_{a \in \text{aspects}} \cos\bigl(\text{embed}(f),\; \text{embed}(a)\bigr)$$
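The max-over-aspects rule can be shown with toy 2-D vectors standing in for real embeddings; a fact only needs to match its single best aspect to score high:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def aspect_score(fact_vec, aspect_vecs):
    """score(f) = max over aspects a of cos(embed(f), embed(a))."""
    return max(cosine(fact_vec, a) for a in aspect_vecs)

fact = [1.0, 0.0]                     # toy "embedding" of a fact
aspects = [[0.0, 1.0], [0.8, 0.6]]    # toy aspect embeddings
print(round(aspect_score(fact, aspects), 2))  # 0.8
```

The fact is orthogonal to the first aspect (cosine 0) but close to the second (cosine 0.8), and the max keeps only the best match.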
Phase 2: Bi-Encoder Fast Scoring¶
All facts are scored using all-MiniLM-L6-v2 embeddings. For more than 1,000 facts, an HNSW ANN index provides $O(\log N)$ retrieval.
Composite score:
$$\text{final}(f) = \text{sim}(f) \times \text{recency}(f) \times \text{novelty}(f) + \text{dep\_bonus}(f)$$
| Factor | Formula | Range |
|---|---|---|
| Recency | $e^{-0.1 \times \text{age\_in\_windows}}$ | 0 → 1 |
| Novelty | Unseen: 1.5×, <3 uses: 1.0×, 3+: 0.5× | 0.5 → 1.5 |
| Dependency | Graph-connected facts inherit relevance | 0 → 0.5 |
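The composite score follows directly from the table. A minimal sketch, with function names of my own choosing; the tier thresholds mirror the Novelty row above:

```python
import math

def recency(age_in_windows: float) -> float:
    """Exponential decay: 1.0 for the current window, falling toward 0."""
    return math.exp(-0.1 * age_in_windows)

def novelty(times_used: int) -> float:
    """Unseen facts get a 1.5x boost; heavily reused facts are damped."""
    if times_used == 0:
        return 1.5
    return 1.0 if times_used < 3 else 0.5

def final_score(sim: float, age_in_windows: float,
                times_used: int, dep_bonus: float = 0.0) -> float:
    """final(f) = sim(f) * recency(f) * novelty(f) + dep_bonus(f)."""
    return sim * recency(age_in_windows) * novelty(times_used) + dep_bonus

# A fresh, never-used fact is boosted above its raw similarity:
print(round(final_score(0.9, age_in_windows=0, times_used=0), 3))  # 1.35
```

Note the dependency bonus is additive, so a weakly similar fact can still be pulled in by a strongly relevant graph neighbor.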
Phase 3: Cross-Encoder Reranking¶
Top 200 candidates re-scored with ms-marco-MiniLM-L6-v2:
$$\text{blended} = 0.6 \times \text{CE\_score} + 0.4 \times \text{BE\_score}$$
Cache hit rate: 50–80% in continuation chains (saves 200–320 ms/window).
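The blend itself is a fixed-weight linear combination; the cross-encoder dominates while the bi-encoder score is retained as a 40% prior. A one-liner sketch:

```python
def blended(ce_score: float, be_score: float) -> float:
    """0.6 * cross-encoder score + 0.4 * bi-encoder score."""
    return 0.6 * ce_score + 0.4 * be_score

# A fact the cross-encoder likes more than the bi-encoder did:
print(blended(ce_score=0.9, be_score=0.5))  # 0.74
```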
Packing Strategy¶
After scoring, facts are packed using greedy bin-packing with:
- Dependency-aware graph pulling — up to 2 hops of connected facts
- Bookend strategy — top 3 facts duplicated at envelope end (counters "lost in the middle" attention bias)
- Progressive compression — truncation → summarization → tabular → reference replacement
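The bookend strategy in particular is simple to illustrate: the top-scoring facts appear once in their packed position and again at the very end of the envelope, so they sit near both attention "edges". A minimal sketch (the real packer operates on token budgets, not plain strings):

```python
def bookend(packed_facts: list[str], top_k: int = 3) -> list[str]:
    """Duplicate the top-k facts at the envelope's end to counter
    the 'lost in the middle' attention bias."""
    return packed_facts + packed_facts[:top_k]

facts = ["f1", "f2", "f3", "f4", "f5"]  # already in score order
print(bookend(facts))
# ['f1', 'f2', 'f3', 'f4', 'f5', 'f1', 'f2', 'f3']
```

The duplication costs a few hundred tokens but ensures the highest-value facts are seen even if the model attends weakly to the envelope's middle.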
Continuation Envelopes¶
When output is truncated, CRP builds a continuation envelope containing:
| Component | Purpose |
|---|---|
| Extracted facts | From the truncated output |
| Structural state | Open blocks, list position, section headers |
| Task gap | Missing items from original task |
| Style anchor | Last natural paragraph for voice consistency |
| Voice profile | Sentence length, vocabulary, tone markers |
| Document map | Running TOC with section completion status |
Note
Continuation envelopes use extraction results, not raw text overlap. This is key to CRP's quality preservation across windows.
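The component table above maps naturally onto a simple container. The dataclass below is a hypothetical mirror of that table; its field names are illustrative, not CRP's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class ContinuationEnvelope:
    """Illustrative container for the continuation components above."""
    extracted_facts: list = field(default_factory=list)   # facts from truncated output
    structural_state: dict = field(default_factory=dict)  # open blocks, list position
    task_gap: list = field(default_factory=list)          # items missing from original task
    style_anchor: str = ""                                # last natural paragraph
    voice_profile: dict = field(default_factory=dict)     # sentence length, tone markers
    document_map: dict = field(default_factory=dict)      # running TOC + completion status

env = ContinuationEnvelope(style_anchor="...the last complete paragraph.")
print(bool(env.style_anchor), len(env.extracted_facts))  # True 0
```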
Preview¶
Before dispatching, you can preview what the envelope will look like:
```python
preview = client.preview_envelope(
    system_prompt="You are a technical writer.",
    task_input="Explain Kubernetes networking.",
)
print(f"Total tokens: {preview.total_tokens}")
print(f"Envelope tokens: {preview.envelope_tokens}")
print(f"Generation reserve: {preview.generation_reserve}")
print(f"Facts included: {preview.facts_included}")
print(f"Facts available: {preview.facts_available}")
print(f"Saturation: {preview.saturation:.1%}")
```