Multi-Turn Conversations
CRP accumulates knowledge across multiple dispatch calls within a session. Facts extracted in each turn are automatically made available to the envelopes of later turns.
How It Works
```mermaid
sequenceDiagram
    participant You
    participant CRP
    participant LLM
    You->>CRP: Turn 1: "Explain Python GIL"
    CRP->>LLM: Envelope (task only)
    LLM-->>CRP: Output about GIL
    CRP->>CRP: Extract 12 facts about GIL
    You->>CRP: Turn 2: "Now explain asyncio"
    CRP->>LLM: Envelope (task + 12 GIL facts)
    LLM-->>CRP: Output about asyncio (informed by GIL knowledge)
    CRP->>CRP: Extract 18 facts total
    You->>CRP: Turn 3: "Compare threading vs asyncio"
    CRP->>LLM: Envelope (task + 18 facts from turns 1-2)
    LLM-->>CRP: Comparison (grounded in prior knowledge)
```
Each turn builds on the last. The model doesn't need to re-explain the GIL when comparing threading approaches — those facts are already in the envelope.
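To make the flow above concrete, here is a minimal toy model of accumulation. It is only a sketch: `ToySession`, `build_envelope`, and the `new_facts` argument are hypothetical stand-ins invented for illustration, not part of the CRP API, and the extraction step is replaced by facts passed in by hand.

```python
from dataclasses import dataclass, field


@dataclass
class ToySession:
    """Hypothetical stand-in for a session; not the real CRP API."""
    facts: list[str] = field(default_factory=list)

    def build_envelope(self, task: str) -> dict:
        # The envelope carries the task plus the facts gathered so far.
        return {"task": task, "facts": list(self.facts)}

    def dispatch(self, task: str, new_facts: list[str]) -> dict:
        # new_facts stands in for whatever extraction would pull
        # out of the model's output for this turn.
        envelope = self.build_envelope(task)
        self.facts.extend(new_facts)
        return envelope


session = ToySession()
env1 = session.dispatch("Explain Python GIL", ["GIL fact"])
env2 = session.dispatch("Now explain asyncio", ["asyncio fact"])
env3 = session.dispatch("Compare threading vs asyncio", [])

# Each envelope sees only the facts extracted in earlier turns.
print(len(env1["facts"]), len(env2["facts"]), len(env3["facts"]))  # 0 1 2
```

The first envelope contains the task only, and each later envelope grows by whatever earlier turns contributed, which mirrors the sequence in the diagram.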
Example
```python
import crp

session = crp.init(provider="ollama", model="qwen3-4b")

# Turn 1: Foundation
result1 = session.dispatch(task="Explain the Python GIL in detail")
print(f"Turn 1: {result1.facts_extracted} facts extracted")
# Turn 1: 12 facts extracted

# Turn 2: Build on Turn 1
result2 = session.dispatch(task="Now explain Python's asyncio library")
print(f"Turn 2: {result2.facts_extracted} facts extracted")
print(f"Envelope saturation: {result2.envelope_saturation:.1%}")
# Turn 2: 18 facts extracted
# Envelope saturation: 34.2%

# Turn 3: Leverage all prior knowledge
result3 = session.dispatch(
    task="Compare threading vs asyncio for I/O-bound tasks"
)
print(f"Turn 3: {result3.facts_extracted} facts extracted")
print(f"Quality: {result3.quality_tier}")
# Turn 3: 24 facts extracted
# Quality: A

# Check session state
status = session.session_status()
print(f"Total windows: {status.windows_completed}")
print(f"Total tokens: {status.total_input_tokens + status.total_output_tokens}")
print(f"Facts in warm state: {status.facts_in_warm_state}")
```
Fact Accumulation
Facts accumulate in the warm state (Tier 2 memory). An illustrative session might progress as follows:
| Turn | New Facts | Total Facts | Envelope Saturation |
|---|---|---|---|
| 1 | 12 | 12 | 15% |
| 2 | 8 | 20 | 34% |
| 3 | 6 | 26 | 48% |
| 4 | 4 | 30 | 55% |
As saturation increases, the envelope packing algorithm becomes more selective — only the most relevant facts make it into the envelope.
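The "Total Facts" column in the table above is simply the running sum of the "New Facts" column. The short sketch below reproduces that arithmetic with the standard library; the numbers are copied from the table and nothing here calls CRP.

```python
from itertools import accumulate

new_facts_per_turn = [12, 8, 6, 4]  # "New Facts" column from the table
running_totals = list(accumulate(new_facts_per_turn))

for turn, (new, total) in enumerate(zip(new_facts_per_turn, running_totals), start=1):
    print(f"Turn {turn}: +{new} new, {total} total")
# Turn 1: +12 new, 12 total
# Turn 2: +8 new, 20 total
# Turn 3: +6 new, 26 total
# Turn 4: +4 new, 30 total
```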
Envelope Packing
Not all accumulated facts fit in every envelope. CRP's packing algorithm:
- Score each fact by relevance to the current task
- Sort by score (highest first)
- Pack until the envelope token budget is reached
- Reserve space for system prompt, task description, and structural markers
This means Turn 5 might include facts from Turn 1 if they're relevant, and skip facts from Turn 3 if they're not.
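The packing steps listed above can be sketched as a greedy loop. This is a rough illustration under stated assumptions, not CRP's actual implementation: the word-overlap `relevance` score, the one-token-per-word cost estimate, and the names `pack_envelope`, `token_budget`, and `reserved_tokens` are all hypothetical.

```python
def relevance(fact: str, task: str) -> float:
    """Toy relevance score: fraction of task words that appear in the fact."""
    task_words = set(task.lower().split())
    fact_words = set(fact.lower().split())
    return len(task_words & fact_words) / len(task_words) if task_words else 0.0


def pack_envelope(facts, task, token_budget, reserved_tokens):
    """Greedy packing: highest-scoring facts first, within the remaining budget."""
    # Space left after the system prompt, task, and structural markers.
    remaining = token_budget - reserved_tokens
    packed = []
    for fact in sorted(facts, key=lambda f: relevance(f, task), reverse=True):
        cost = len(fact.split())  # crude estimate: one token per word
        if cost > remaining:
            break  # budget reached; stop packing
        packed.append(fact)
        remaining -= cost
    return packed


facts = [
    "the GIL limits threading to one running thread",  # from an early turn
    "microservices communicate over the network",      # unrelated to the task
    "asyncio schedules coroutines on an event loop",   # from a later turn
]
packed = pack_envelope(
    facts, "compare threading vs asyncio", token_budget=30, reserved_tokens=15
)
print(packed)
```

With these inputs the two facts that share a word with the task are packed and the unrelated one is left out, regardless of which turn produced them.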
Preview Before Dispatch
Use `preview_envelope()` to see what will be included:
```python
preview = session.preview_envelope(
    task="Compare threading vs asyncio"
)
print(f"Total tokens: {preview.total_tokens}")
print(f"Facts included: {preview.facts_included}")
print(f"Saturation: {preview.saturation:.1%}")
```
Multi-Turn with Continuation
Continuation works within each turn. A multi-turn session with continuation might look like:
| Turn | Task | Windows | Facts Added |
|---|---|---|---|
| 1 | "Explain microservices" | 4 | 45 |
| 2 | "Now cover service mesh" | 3 | 32 |
| 3 | "Compare Istio vs Linkerd" | 2 | 18 |
By the end of Turn 3, the session holds 95 accumulated facts (45 + 32 + 18) from 9 windows (4 + 3 + 2). Each envelope draws the most relevant facts from everything accumulated up to that point.
Best Practices
Topic progression
Structure turns so that each builds on the last. A progression such as "Explain X" → "Now explain Y" → "Compare X and Y" makes the most of fact accumulation.
Check saturation
If `envelope_saturation` exceeds 80%, the envelope is nearly full. Consider starting a new session or ingesting a summary instead.
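A check like this is easy to wrap in a small helper. The sketch below assumes saturation is reported as a fraction between 0 and 1 (consistent with the `:.1%` formatting used in the example earlier); `SATURATION_LIMIT` and `saturation_advice` are hypothetical names, not part of CRP.

```python
SATURATION_LIMIT = 0.80  # the 80% threshold from the guidance above


def saturation_advice(saturation: float, limit: float = SATURATION_LIMIT) -> str:
    """Return a short recommendation for a saturation value in [0, 1]."""
    if saturation > limit:
        return "start a new session or ingest a summary"
    return "continue in this session"


for value in (0.342, 0.85):
    print(f"{value:.1%}: {saturation_advice(value)}")
# 34.2%: continue in this session
# 85.0%: start a new session or ingest a summary
```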
Session limits
Sessions have a configurable maximum lifetime and fact count. Check `session_status()` periodically to monitor resource usage.