Running Tests

Prerequisites

# Install CRP with dev dependencies
pip install -e ".[dev]"

# Verify pytest is available
python -m pytest --version

Dev dependencies include: pytest>=7.4, pytest-asyncio>=0.21, pytest-cov>=4.1.

Test Configuration

From pyproject.toml:

[tool.pytest.ini_options]
testpaths = ["tests"]
asyncio_mode = "auto"
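With asyncio_mode = "auto", pytest-asyncio treats plain async def test functions as asyncio tests, so individual tests do not need a @pytest.mark.asyncio marker. A minimal sketch of such a test follows; fetch_status is a hypothetical stand-in coroutine, not part of CRP.

```python
import asyncio


async def fetch_status() -> str:
    """Hypothetical stand-in for an async SDK call (not a CRP API)."""
    await asyncio.sleep(0)  # yield control to the event loop once
    return "ok"


# In auto mode, pytest-asyncio collects and runs this coroutine test
# directly; no @pytest.mark.asyncio decorator is required.
async def test_fetch_status() -> None:
    assert await fetch_status() == "ok"
```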

Running Tests

# Run a specific test file
python -m pytest tests/test_smoke.py -v --tb=short

# Run a specific test class
python -m pytest tests/test_phase1.py::TestErrorCodes -v

# Run a specific test function
python -m pytest tests/test_phase2.py::TestExtractionPipeline::test_stage1_regex -v

# Run with coverage
python -m pytest tests/test_phase3.py --cov=crp --cov-report=term

Never run all tests in parallel

Running the full suite simultaneously will max out CPU and memory. Always run one file at a time.
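One way to follow this rule without typing each command by hand is a small driver that launches one pytest process per file and waits for it to finish before starting the next. The sketch below is an illustration, not a script shipped with CRP; the function names and the test_*.py glob are assumptions.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(test_dir: str = "tests") -> list[list[str]]:
    """Return one pytest command per test file, in sorted order."""
    files = sorted(Path(test_dir).glob("test_*.py"))
    return [
        [sys.executable, "-m", "pytest", str(path), "-v", "--tb=short"]
        for path in files
    ]


def run_sequentially(test_dir: str = "tests") -> int:
    """Run each test file in its own pytest process; return the failure count."""
    failed = 0
    for cmd in build_commands(test_dir):
        # subprocess.run blocks until the child exits, so only one
        # pytest process is alive at any time.
        if subprocess.run(cmd).returncode != 0:
            failed += 1
    return failed
```

Calling run_sequentially() from the repository root runs every matching file in turn. Note that the glob also matches the live test files, which need a running LLM, so you may want to filter the list first.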

Shared Fixtures

Defined in tests/conftest.py:

| Fixture | Type | Purpose |
| --- | --- | --- |
| sample_task_intent | TaskIntent | Minimal task with system prompt and input |
| sample_system_prompt | str | "You are a helpful assistant." |
| sample_task_input | str | "What is CRP?" |

Individual test files define additional fixtures inline for their specific needs.


Test Categories

Smoke Tests

File: test_smoke.py (6 tests)

Fast sanity checks that CRP imports correctly:

  • Package version is a valid string
  • CRPError exists and is an Exception
  • TaskIntent has correct default values
  • Core modules are importable

python -m pytest tests/test_smoke.py -v
# ~2 seconds

Unit Tests: Core Phases

9 files covering all SDK phases, 636 tests total:

| File | Tests | Phase | What It Tests |
| --- | --- | --- | --- |
| test_phase1.py | 102 | Errors & Config | Error codes, session config, window sizing, orchestrator lifecycle, provider registration |
| test_phase2.py | 96 | Extraction | All 6 extraction stages, quality gate, contradiction detection, complexity analysis |
| test_phase3.py | 59 | Envelope | All 6 envelope building phases, token budgeting, fact packing, saturation |
| test_phase4.py | 65 | State & CKF | StateFact lifecycle, WarmStateStore, snapshots, compaction, cold storage, graph serialization |
| test_phase5.py | 58 | Continuation | Wall detection, gap analysis, stitch algorithm, echo detection, voice profiles, termination |
| test_phase6.py | 67 | Security | Session binding, fact HMAC integrity, encryption, input validation, injection detection, RBAC, rate limiting |
| test_phase7.py | 91 | Advanced | Auto-ingest, scale mode, hierarchical/parallel strategies, curator, meta-learning, batch, idempotency, cost model |
| test_phase8.py | 38 | CLI & Deploy | CLI commands, startup sequence, event emitter, deployment configuration |
| test_phase9.py | 60 | Observability | Metrics collection, audit trail, quality scores, telemetry, overhead measurement |

# Run Phase 2 (extraction)
python -m pytest tests/test_phase2.py -v --tb=short

# Run just the envelope tests
python -m pytest tests/test_phase3.py -v --tb=short

Unit Tests: Specialized

14 files with deep module-level coverage, 734 tests total:

| File | Tests | Module |
| --- | --- | --- |
| test_adaptive_allocator.py | 57 | Resource allocator, hardware detection, EWMA, model unloading |
| test_adversarial_provenance.py | 41 | Edge cases: empty strings, unicode attacks, HTML injection, null bytes |
| test_agentic.py | 84 | Agentic architecture (§22): facilitator, task analysis, strategy routing |
| test_ckf_gate.py | 11 | CKF gate threshold, budget, retriever |
| test_compliance_security.py | 75 | Privacy, consent, audit trail, GDPR (§7.12–§7.15) |
| test_compliance_wiring.py | 44 | Audit entries for session/dispatch/ingest, PII scanning, lineage |
| test_decision_provenance.py | 40 | Envelope audit, LLM call audit, fact extraction audit |
| test_decision_provenance_engine.py | 79 | DPE: claim detection, attribution, provenance chains, reports |
| test_entailment_risk.py | 62 | NLI verification, hallucination risk scoring, heuristic fallback |
| test_fidelity_verification.py | 63 | Distortion, omission, fabrication, contradiction detection |
| test_relay_strategies.py | 61 | Reflexive, progressive, stream-augmented strategies (§21) |
| test_resource_manager.py | 38 | Resource lifecycle, meta-learning calibration, WindowMetrics |
| test_security_modules.py | 45 | Audit trail, privacy, injection, RBAC, encryption, integrity modules |
| test_tool_relay.py | 34 | Tool-mediated relay (§20), pull architecture, tool loop, fallback |

# Run provenance engine tests
python -m pytest tests/test_decision_provenance_engine.py -v

# Run security module tests
python -m pytest tests/test_security_modules.py -v

Integration Tests

File: test_integration.py (57 tests)

Cross-module end-to-end tests using CustomProvider with controlled generate_fn. No external APIs needed — everything runs locally with mock responses.

python -m pytest tests/test_integration.py -v --tb=short
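A controlled generate_fn can be as simple as a closure that replays canned responses and records the prompts it receives. The sketch below shows that idea under a stated assumption: that generate_fn takes a prompt string and returns a response string. The real CustomProvider callback signature may differ, and make_scripted_generate_fn is a hypothetical helper, not a CRP API.

```python
from typing import Callable


def make_scripted_generate_fn(responses: list[str]) -> Callable[[str], str]:
    """Return a generate_fn that replays canned responses in order.

    Assumption: generate_fn takes a prompt string and returns text.
    The real CustomProvider callback may use a different signature.
    """
    remaining = list(responses)
    seen_prompts: list[str] = []

    def generate_fn(prompt: str) -> str:
        seen_prompts.append(prompt)  # record each call for later assertions
        if not remaining:
            raise AssertionError("generate_fn called more times than scripted")
        return remaining.pop(0)

    generate_fn.seen_prompts = seen_prompts  # expose the call log to tests
    return generate_fn
```

Because the responses are fixed in advance, a test built on such a stub is deterministic and can assert both on the final output and on the prompts that were sent.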

Production Hardening

File: test_production_hardening.py (40 tests)

Tests for production reliability:

  • Circuit breaker behavior
  • Configuration validation
  • Retry logic with backoff
  • Session cleanup on crash
  • Structured logging format
  • Key rotation

python -m pytest tests/test_production_hardening.py -v

Performance Benchmarks

File: test_benchmarks.py (12 tests)

Performance regression tests with specific targets:

| Benchmark | Target |
| --- | --- |
| Cold session init | < 200ms |
| Warm session init | < 50ms |
| Dispatch overhead | < 100ms |
| Envelope assembly | < 50ms |
| Ingest throughput | > 100 facts/sec |
| Cache hit | < 1ms |
| Event emission | < 5ms |
| Metrics export | < 10ms |

python -m pytest tests/test_benchmarks.py -v
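A latency target of this kind is typically enforced by timing an operation several times and asserting on the result. The sketch below illustrates the pattern for the "Cache hit < 1ms" target; it is not taken from test_benchmarks.py, and a plain dict stands in for the real cache.

```python
import time
from typing import Callable


def best_time_ms(fn: Callable[[], object], repeats: int = 50) -> float:
    """Return the fastest of `repeats` timed calls, in milliseconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, (time.perf_counter() - start) * 1000.0)
    return best


def test_cache_hit_under_1ms() -> None:
    cache = {"key": "value"}  # stand-in for the real cache (assumption)
    assert best_time_ms(lambda: cache["key"]) < 1.0
```

Taking the minimum over repeated runs filters out scheduler noise. A stricter regression test might assert on a median or a high percentile instead.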

Live Tests

4 files, 52 tests — require a running LLM (LM Studio or Ollama):

| File | Tests | What It Verifies |
| --- | --- | --- |
| test_gap_fixes_live.py | 30 | Gap fixes A–E against real LLM |
| test_live_comprehensive.py | 11 | Full protocol verification |
| test_live_full_capture.py | 11 | Output capture and analysis |
| test_live_long_generation.py | | Long generation (standalone script) |

# Start LM Studio first, then:
python -m pytest tests/test_gap_fixes_live.py -v --tb=short

LM Studio connection

Live tests connect to LM Studio at http://192.168.0.6:1234 by default. Update the connection URL in the test file if your setup differs.
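Before starting a live run, it can help to confirm that something is listening at the configured address. The sketch below uses only the standard library; is_reachable is a hypothetical helper and is not part of the CRP test suite.

```python
import socket
from urllib.parse import urlparse

DEFAULT_URL = "http://192.168.0.6:1234"  # default address given on this page


def is_reachable(url: str = DEFAULT_URL, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False
```

This check only confirms that the port accepts connections. It does not verify that a model is loaded or that the server will answer requests.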

Killer Test Suite

Standalone adversarial/stress test scripts in tests/killer_test/:

  • crp_killer_test.py — Comprehensive stress test
  • debug_gap.py, debug_gap2.py — Gap debugging utilities

Run directly with Python:

python tests/killer_test/crp_killer_test.py

Tips

Use --tb=short for cleaner output

python -m pytest tests/test_phase2.py -v --tb=short

Run a specific test by keyword

python -m pytest tests/ -k "extraction" -v

Check coverage for a specific module

python -m pytest tests/test_phase6.py --cov=crp.security --cov-report=term-missing