
Multi-Turn Conversations

CRP accumulates knowledge across multiple dispatch calls within a session. Each turn's extracted facts automatically appear in subsequent envelopes.

How It Works

sequenceDiagram
    participant You
    participant CRP
    participant LLM

    You->>CRP: Turn 1: "Explain Python GIL"
    CRP->>LLM: Envelope (task only)
    LLM-->>CRP: Output about GIL
    CRP->>CRP: Extract 12 facts about GIL

    You->>CRP: Turn 2: "Now explain asyncio"
    CRP->>LLM: Envelope (task + 12 GIL facts)
    LLM-->>CRP: Output about asyncio (informed by GIL knowledge)
    CRP->>CRP: Extract 18 facts total

    You->>CRP: Turn 3: "Compare threading vs asyncio"
    CRP->>LLM: Envelope (task + 18 facts from turns 1-2)
    LLM-->>CRP: Comparison (grounded in prior knowledge)

Each turn builds on the last. The model doesn't need to re-explain the GIL when comparing threading approaches — those facts are already in the envelope.

Example

import crp

session = crp.init(provider="ollama", model="qwen3-4b")

# Turn 1: Foundation
result1 = session.dispatch(task="Explain the Python GIL in detail")
print(f"Turn 1: {result1.facts_extracted} facts extracted")
# Turn 1: 12 facts extracted

# Turn 2: Build on Turn 1
result2 = session.dispatch(task="Now explain Python's asyncio library")
print(f"Turn 2: {result2.facts_extracted} facts extracted")
print(f"Envelope saturation: {result2.envelope_saturation:.1%}")
# Turn 2: 18 facts extracted
# Envelope saturation: 34.2%

# Turn 3: Leverage all prior knowledge
result3 = session.dispatch(
    task="Compare threading vs asyncio for I/O-bound tasks"
)
print(f"Turn 3: {result3.facts_extracted} facts extracted")
print(f"Quality: {result3.quality_tier}")
# Turn 3: 24 facts extracted
# Quality: A

# Check session state
status = session.session_status()
print(f"Total windows: {status.windows_completed}")
print(f"Total tokens: {status.total_input_tokens + status.total_output_tokens}")
print(f"Facts in warm state: {status.facts_in_warm_state}")

Fact Accumulation

Facts accumulate in the warm state (Tier 2 memory):

Turn  New Facts  Total Facts  Envelope Saturation
1     12         12           15%
2     6          18           34%
3     6          24           48%
4     4          28           55%

As saturation increases, the envelope packing algorithm becomes more selective — only the most relevant facts make it into the envelope.
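One way to read saturation is as the fraction of the envelope's fact budget consumed by accumulated facts. The budget and per-fact token cost below are assumptions for illustration, not CRP internals, so the percentages are illustrative and won't match the table exactly:

```python
# Illustrative only: FACT_BUDGET and AVG_FACT_TOKENS are assumed values,
# not CRP internals.
FACT_BUDGET = 4000      # assumed tokens reserved for facts in the envelope
AVG_FACT_TOKENS = 60    # assumed average tokens per extracted fact

def saturation(total_facts: int) -> float:
    """Fraction of the fact budget consumed by the accumulated facts."""
    return (total_facts * AVG_FACT_TOKENS) / FACT_BUDGET

for turn, total in enumerate([12, 18, 24, 28], start=1):
    print(f"Turn {turn}: {saturation(total):.0%}")
# Turn 1: 18%
# Turn 2: 27%
# Turn 3: 36%
# Turn 4: 42%
```

In practice saturation also depends on the task description and per-fact token counts, which is why growth is not perfectly linear.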

Envelope Packing

Not all accumulated facts fit in every envelope. CRP's packing algorithm:

  1. Reserves space for the system prompt, task description, and structural markers
  2. Scores each fact by relevance to the current task
  3. Sorts facts by score (highest first)
  4. Packs facts until the remaining envelope token budget is exhausted

This means Turn 5 might include facts from Turn 1 if they're relevant, and skip facts from Turn 3 if they're not.
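The steps above can be sketched as a greedy pack. The term-overlap scorer and hand-set token counts here are simplified stand-ins, not CRP's actual implementation:

```python
def pack_envelope(facts, task_terms, budget_tokens, reserved_tokens):
    """Greedy fact packing: score by relevance, sort, fill the remaining budget."""
    available = budget_tokens - reserved_tokens  # space left after prompt/task/markers

    def relevance(fact):
        # Stand-in scorer: word overlap between the fact and the current task.
        return len(set(fact["text"].lower().split()) & set(task_terms))

    packed, used = [], 0
    for fact in sorted(facts, key=relevance, reverse=True):
        if used + fact["tokens"] <= available:
            packed.append(fact)
            used += fact["tokens"]
    return packed

facts = [
    {"turn": 1, "text": "threading is limited by the GIL", "tokens": 9},
    {"turn": 2, "text": "asyncio uses a single-threaded event loop", "tokens": 10},
    {"turn": 3, "text": "service mesh adds sidecar proxies", "tokens": 8},
]
task = "compare threading vs asyncio".split()
packed = pack_envelope(facts, task, budget_tokens=40, reserved_tokens=20)
print([f["turn"] for f in packed])  # [1, 2]
```

Note that the Turn 1 fact is packed and the Turn 3 fact is dropped: relevance to the current task, not recency, decides what goes in.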

Preview Before Dispatch

Use preview_envelope() to see what will be included:

preview = session.preview_envelope(
    task="Compare threading vs asyncio"
)
print(f"Total tokens: {preview.total_tokens}")
print(f"Facts included: {preview.facts_included}")
print(f"Saturation: {preview.saturation:.1%}")

Multi-Turn with Continuation

Continuation works within each turn. A multi-turn session with continuation might look like:

Turn  Task                         Windows  Facts Added
1     "Explain microservices"      4        45
2     "Now cover service mesh"     3        32
3     "Compare Istio vs Linkerd"   2        18

Turn 3's envelope includes the most relevant facts from all 95 accumulated facts across 9 total windows.

Best Practices

Topic progression

Structure turns to build on each other. "Explain X" → "Now explain Y" → "Compare X and Y" gets the most out of fact accumulation, because the comparison turn inherits the facts extracted in the first two.

Check saturation

If envelope_saturation exceeds 80%, the envelope is very full. Consider starting a new session or ingesting a summary instead.
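The saturation check can be wired up as a simple guard. The 0.8 threshold comes from the guideline above; the intermediate "warn" tier and its 0.75 multiplier are my own illustrative choices, and the rollover action itself (new session or summary ingestion) is left as a comment since this page does not name a specific API for it:

```python
SATURATION_LIMIT = 0.8  # threshold from the guideline above

def check_saturation(saturation: float, limit: float = SATURATION_LIMIT) -> str:
    """Map an envelope saturation reading to an action."""
    if saturation >= limit:
        return "rotate"    # start a new session or ingest a summary instead
    if saturation >= limit * 0.75:
        return "warn"      # envelope is getting full; plan a rollover soon
    return "continue"

print(check_saturation(0.34))  # continue
print(check_saturation(0.65))  # warn
print(check_saturation(0.86))  # rotate
```

In a session loop, feed `result.envelope_saturation` from each dispatch into this guard and rotate before quality degrades.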

Session limits

Sessions have a configurable maximum lifetime and fact count. Check session_status() periodically to monitor resource usage.