HTTP Sidecar

CRP can run as a lightweight REST API server via crp serve, enabling cross-process context sharing without embedding the Python library.

Quick Start

# Start the sidecar (loopback only by default)
crp serve

# Or specify host/port
crp serve --host 127.0.0.1 --port 8900

# With authentication (required for non-loopback)
crp serve --bind-all --auth-token "your-secret-token"

Architecture

The sidecar uses Python's stdlib http.server — no external dependencies. It exposes the full CRP API over REST:

Your App (any language)
    ↓ HTTP
CRP Sidecar (127.0.0.1:8900)
    ↓ Python
CRP Library → LLM Provider

This lets you use CRP from Node.js, Go, Rust, or any language with HTTP support.

Endpoints

Session Lifecycle

Method  Path                  Description
POST    /sessions             Create a new session
GET     /sessions             List active sessions
GET     /sessions/:id/status  Get session status
POST    /sessions/:id/close   Close and flush to CKF

Dispatch

Method  Path                                     Description
POST    /sessions/:id/dispatch                   Standard dispatch
POST    /sessions/:id/dispatch/tools             Tool-augmented dispatch
POST    /sessions/:id/dispatch/reflexive         Reflexive dispatch
POST    /sessions/:id/dispatch/progressive       Progressive dispatch
POST    /sessions/:id/dispatch/stream-augmented  Stream-augmented dispatch
POST    /sessions/:id/dispatch/agentic           Agentic dispatch

Knowledge

Method  Path                          Description
POST    /sessions/:id/ingest          Ingest external data
GET     /sessions/:id/facts           List extracted facts
POST    /sessions/:id/facts/share     Share facts between sessions
POST    /sessions/:id/facts/feedback  Provide fact feedback
GET     /sessions/:id/envelope        Preview current envelope

Provider Management

Method  Path                     Description
POST    /sessions/:id/providers  Configure provider

Operations

Method  Path                    Description
GET     /health                 Health check
GET     /ready                  Kubernetes readiness probe
GET     /metrics                Prometheus-format metrics
POST    /sessions/:id/estimate  Estimate session cost
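
A client that starts the sidecar itself usually needs to wait until it accepts traffic. The following is a minimal sketch of polling GET /ready using only the standard library. It assumes that a 2xx status means "ready", which is the usual readiness-probe convention but is not stated on this page; the function names probe_ready and wait_until_ready are hypothetical.

```python
import time
import urllib.error
import urllib.request


def probe_ready(base_url="http://127.0.0.1:8900", timeout=2.0):
    """Return True if GET /ready answers with a 2xx status (assumed to mean ready)."""
    try:
        with urllib.request.urlopen(base_url + "/ready", timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet.
        return False


def wait_until_ready(probe=probe_ready, attempts=20, delay=0.5, sleep=time.sleep):
    """Poll the probe until it succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Calling wait_until_ready() with the defaults polls the default address for roughly ten seconds before giving up.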

Example: Create Session and Dispatch

# Create a session
curl -X POST http://localhost:8900/sessions \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "ollama",
    "model": "qwen3-4b"
  }'
# Response: {"session_id": "abc-123", "protocol_version": "2.0.0", ...}

# Dispatch a task
curl -X POST http://localhost:8900/sessions/abc-123/dispatch \
  -H "Content-Type: application/json" \
  -d '{
    "task": "Explain the CAP theorem",
    "expected_output_type": "markdown"
  }'
# Response: {"output": "...", "quality_tier": "A", "facts_extracted": 12, ...}
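
The same flow can be driven from code. Below is a rough Python equivalent of the two curl calls, followed by a call to the close endpoint from the Session Lifecycle table. The helper names post_json and run_task are hypothetical, and sending an empty JSON object to /close is an assumption, since the expected body for that endpoint is not documented here.

```python
import json
import urllib.request


def post_json(url, payload, timeout=60):
    """POST a JSON payload and return the decoded JSON response."""
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def run_task(task, base_url="http://127.0.0.1:8900", post=post_json):
    """Create a session, dispatch one task, then close the session."""
    session = post(f"{base_url}/sessions", {"provider": "ollama", "model": "qwen3-4b"})
    session_id = session["session_id"]
    try:
        return post(
            f"{base_url}/sessions/{session_id}/dispatch",
            {"task": task, "expected_output_type": "markdown"},
        )
    finally:
        # Assumed body: an empty JSON object.
        post(f"{base_url}/sessions/{session_id}/close", {})
```

The post parameter exists only so the transport can be swapped out in tests; run_task("Explain the CAP theorem") uses the real HTTP helper.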

Security

The sidecar includes multiple security layers:

Network Binding

  • Default: loopback only (127.0.0.1), so only processes on the same machine can connect
  • --bind-all: binds to 0.0.0.0 (all interfaces) and requires --auth-token
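
The binding rule can be illustrated with a small startup check. This is a sketch of the idea only, not the sidecar's actual code; the function names is_loopback and check_binding are hypothetical, and treating unknown hostnames as non-loopback is an assumption.

```python
import ipaddress


def is_loopback(host):
    """True for loopback addresses such as 127.0.0.1 or ::1, and for 'localhost'."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; assume the hostname may be reachable remotely.
        return False


def check_binding(host, auth_token):
    """Raise if a non-loopback bind address is requested without a token."""
    if not is_loopback(host) and not auth_token:
        raise ValueError(f"binding to {host} requires an auth token")
    return host
```

With this check, check_binding("0.0.0.0", None) raises, while check_binding("127.0.0.1", None) passes.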

Authentication

When --auth-token is set, all requests must include:

Authorization: Bearer your-secret-token

Token comparison uses secrets.compare_digest() (constant-time) to prevent timing attacks.
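
To make both sides of the exchange concrete, here is a minimal sketch of building the header on the client and checking it on the server with secrets.compare_digest(). The function names bearer_header and is_authorized are hypothetical, and the parsing details (such as accepting the scheme case-insensitively) are assumptions rather than the sidecar's documented behavior.

```python
import secrets


def bearer_header(token):
    """Client side: the header a request must carry."""
    return {"Authorization": f"Bearer {token}"}


def is_authorized(header_value, expected_token):
    """Server side: check an Authorization header value against the configured token."""
    if not header_value:
        return False
    scheme, _, presented = header_value.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest takes the same time regardless of where the values differ.
    # Encoding to bytes avoids a TypeError on non-ASCII input.
    return secrets.compare_digest(
        presented.encode("utf-8"), expected_token.encode("utf-8")
    )
```

An ordinary == comparison can return as soon as it finds the first mismatching character, which is the timing signal the constant-time comparison avoids.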

Rate Limiting

Limit              Default
General requests   120 per 60 seconds per IP
Dispatch requests  30 per 60 seconds per IP
Max tracked IPs    10,000 (evicts the oldest 2,000 when full)
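
One way to picture these limits is a per-IP sliding window with batch eviction. The sketch below is illustrative and is not taken from the CRP source: the class name RateLimiter is hypothetical, the sliding-window algorithm is an assumption (the page does not say whether the window is fixed or sliding), and "oldest" is interpreted here as least recently seen.

```python
import time
from collections import OrderedDict, deque


class RateLimiter:
    """Sliding-window limiter keyed by client IP (illustrative only)."""

    def __init__(self, limit=120, window=60.0, max_ips=10_000, evict=2_000,
                 clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.max_ips = max_ips
        self.evict = evict
        self.clock = clock
        self.hits = OrderedDict()  # ip -> deque of request timestamps

    def allow(self, ip):
        """Return True if this request from ip is within the limit."""
        now = self.clock()
        if ip not in self.hits and len(self.hits) >= self.max_ips:
            # Table is full: drop the least recently seen IPs in one batch.
            for _ in range(min(self.evict, len(self.hits))):
                self.hits.popitem(last=False)
        stamps = self.hits.setdefault(ip, deque())
        self.hits.move_to_end(ip)
        # Forget requests that have left the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) >= self.limit:
            return False
        stamps.append(now)
        return True
```

Under this model the general and dispatch limits would be two instances, RateLimiter(limit=120) and RateLimiter(limit=30), each consulted for the matching request type.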

Request Limits

Limit          Default
Max body size  10 MB
Max sessions   64
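
A body-size limit is typically enforced from the Content-Length header before any of the body is read. The sketch below shows that idea under several assumptions: the status codes (411, 400, 413) are conventional HTTP choices and are not documented for the sidecar, 1 MB is taken to be 1024 * 1024 bytes, and the name body_status is hypothetical.

```python
MAX_BODY_BYTES = 10 * 1024 * 1024  # 10 MB default; assumes 1 MB = 1024 * 1024 bytes


def body_status(content_length, max_bytes=MAX_BODY_BYTES):
    """Map a Content-Length header value to an HTTP status before reading the body."""
    if content_length is None:
        return 411  # Length Required
    try:
        length = int(content_length)
    except ValueError:
        return 400  # Bad Request: header is not a number
    if length < 0:
        return 400
    if length > max_bytes:
        return 413  # Content Too Large
    return 200
```

Clients that send large ingest payloads can apply the same check locally and split the data before hitting the limit.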

Error Sanitization

Error responses are sanitized — stack traces and file paths are stripped to prevent information leakage.
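
As an illustration of what sanitization can look like, the sketch below keeps only the first line of an exception message and masks anything that looks like an absolute file path. The regular expression is a heuristic, and the response shape ({"error", "message"}) and the name sanitize_error are assumptions; the sidecar's real error format is not documented on this page.

```python
import re

# Heuristic: an absolute Unix or Windows path with at least two segments.
_PATH = re.compile(r"(?:[A-Za-z]:\\|/)[^\s'\",:]+(?:[/\\][^\s'\",:]+)+")


def sanitize_error(exc):
    """Return a client-safe error payload: no traceback text, no file paths."""
    text = str(exc)
    # Keep only the first line so a pasted traceback never reaches the client.
    message = text.splitlines()[0] if text else ""
    message = _PATH.sub("[path]", message)
    return {"error": type(exc).__name__, "message": message}
```

For example, a ValueError whose message is "cannot read /srv/crp/store.db" would be reported with the message "cannot read [path]".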

Response Headers

X-Content-Type-Options: nosniff
Cache-Control: no-store

Configuration

All sidecar options:

crp serve [OPTIONS]

Options:
  --host TEXT         Bind address (default: 127.0.0.1)
  --port INT          Port (default: 8900)
  --auth-token TEXT   Bearer token for authentication
  --bind-all          Bind to 0.0.0.0 (requires --auth-token)
  --max-sessions INT  Maximum concurrent sessions (default: 64)
  --max-body-mb INT   Maximum request body in MB (default: 10)
  --rate-limit INT    Requests per minute per IP (default: 120)

OpenAPI Schema

The full OpenAPI 3.1.0 specification is available at schemas/openapi.json in the repository. Import it into Postman, Swagger UI, or any OpenAPI-compatible tool.
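
The schema can also be inspected programmatically. The following sketch lists every operation in an OpenAPI document using the standard "paths" layout defined by the specification. The function names list_operations and load_spec are hypothetical, and the exact path templates inside schemas/openapi.json (for example how the session ID parameter is written) are not confirmed here.

```python
import json

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec):
    """Return (METHOD, path) pairs from an OpenAPI document, sorted by path."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method in item:
            # Path items may also hold non-operation keys such as "parameters".
            if method in HTTP_METHODS:
                operations.append((method.upper(), path))
    return sorted(operations, key=lambda op: (op[1], op[0]))


def load_spec(path="schemas/openapi.json"):
    """Load the schema from the repository checkout."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Running list_operations(load_spec()) from the repository root gives a quick way to compare the schema against the endpoint tables on this page.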