HTTP Sidecar

CRP can run as a lightweight REST API server via crp serve, enabling cross-process context sharing without embedding the Python library.

Quick Start

# Start the sidecar (loopback only by default)
crp serve

# Or specify host/port
crp serve --host 127.0.0.1 --port 8900

# With authentication (required for non-loopback)
crp serve --bind-all --auth-token "your-secret-token"

Architecture

The sidecar is built on http.server from Python's standard library, so it has no external dependencies. It exposes the full CRP API over REST:

Your App (any language)
    ↓ HTTP
CRP Sidecar (127.0.0.1:8900)
    ↓ Python
CRP Library → LLM Provider

This lets you use CRP from Node.js, Go, Rust, or any language with HTTP support.
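For callers that are themselves written in Python but run in a separate process, a thin HTTP helper is enough. The sketch below uses only the standard library; the helper names `build_request` and `call` are hypothetical, and it assumes the sidecar is listening on the default 127.0.0.1:8900 and answers with JSON bodies.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8900"  # default host/port from the Quick Start


def build_request(method, path, payload=None, token=None, base_url=BASE_URL):
    """Build (but do not send) an HTTP request for the sidecar."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(base_url + path, data=data, method=method)
    if data is not None:
        req.add_header("Content-Type", "application/json")
    if token:
        # Only needed when the sidecar was started with --auth-token.
        req.add_header("Authorization", f"Bearer {token}")
    return req


def call(req, timeout=30):
    """Send the request and decode the body (assumes a JSON response)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a sidecar running, `call(build_request("GET", "/sessions"))` would list the active sessions. Note that `/metrics` returns Prometheus text rather than JSON, so `call` is not suitable for that endpoint.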

Endpoints

Session Lifecycle

Method	Path	Description
`POST`	`/sessions`	Create a new session
`GET`	`/sessions`	List active sessions
`GET`	`/sessions/:id/status`	Get session status
`POST`	`/sessions/:id/close`	Close and flush to CKF

Dispatch

Method	Path	Description
`POST`	`/sessions/:id/dispatch`	Standard dispatch
`POST`	`/sessions/:id/dispatch/tools`	Tool-augmented dispatch
`POST`	`/sessions/:id/dispatch/reflexive`	Reflexive dispatch
`POST`	`/sessions/:id/dispatch/progressive`	Progressive dispatch
`POST`	`/sessions/:id/dispatch/stream-augmented`	Stream-augmented dispatch
`POST`	`/sessions/:id/dispatch/agentic`	Agentic dispatch
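Because the six dispatch variants differ only in their path suffix, a client can select one by name. The sketch below builds the path from the table above; the mode labels (`"standard"`, `"tools"`, and so on) and the function name `dispatch_path` are hypothetical conveniences, not part of the CRP API.

```python
from urllib.parse import quote

# Path suffixes taken from the dispatch table; the dictionary keys are
# illustrative labels chosen for this sketch.
DISPATCH_SUFFIXES = {
    "standard": "",
    "tools": "/tools",
    "reflexive": "/reflexive",
    "progressive": "/progressive",
    "stream-augmented": "/stream-augmented",
    "agentic": "/agentic",
}


def dispatch_path(session_id, mode="standard"):
    """Return the request path for the given session and dispatch mode."""
    if mode not in DISPATCH_SUFFIXES:
        raise ValueError(f"unknown dispatch mode: {mode!r}")
    # Percent-encode the id so it cannot inject extra path segments.
    return f"/sessions/{quote(session_id, safe='')}/dispatch{DISPATCH_SUFFIXES[mode]}"
```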

Knowledge

Method	Path	Description
`POST`	`/sessions/:id/ingest`	Ingest external data
`GET`	`/sessions/:id/facts`	List extracted facts
`POST`	`/sessions/:id/facts/share`	Share facts between sessions
`POST`	`/sessions/:id/facts/feedback`	Provide fact feedback
`GET`	`/sessions/:id/envelope`	Preview current envelope

Provider Management

Method	Path	Description
`POST`	`/sessions/:id/providers`	Configure provider

Operations

Method	Path	Description
`GET`	`/health`	Health check
`GET`	`/ready`	Kubernetes readiness probe
`GET`	`/metrics`	Prometheus-format metrics
`POST`	`/sessions/:id/estimate`	Estimate session cost
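A client that launches the sidecar itself may want to wait for `/ready` before sending work. The following is a minimal polling sketch; it assumes that a ready sidecar answers `/ready` with HTTP 200, which this page does not state explicitly, and the names `http_probe` and `wait_until_ready` are hypothetical.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=2.0):
    """Return True if a GET on `url` answers 200 (assumed to mean ready)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status.
        return False


def wait_until_ready(probe, attempts=10, delay=0.5, sleep=time.sleep):
    """Call `probe` up to `attempts` times, sleeping `delay` seconds between tries."""
    for i in range(attempts):
        if probe():
            return True
        if i < attempts - 1:
            sleep(delay)
    return False
```

A typical call would be `wait_until_ready(lambda: http_probe("http://127.0.0.1:8900/ready"))`.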

Example: Create Session and Dispatch

# Create a session
curl -X POST http://localhost:8900/sessions \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "ollama",
    "model": "qwen3-4b"
  }'
# Response: {"session_id": "abc-123", "protocol_version": "2.0.0", ...}

# Dispatch a task
curl -X POST http://localhost:8900/sessions/abc-123/dispatch \
  -H "Content-Type: application/json" \
  -d '{
    "task": "Explain the CAP theorem",
    "expected_output_type": "markdown"
  }'
# Response: {"output": "...", "quality_tier": "A", "facts_extracted": 12, ...}
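The same two calls can be made from Python with the standard library. This sketch mirrors the curl commands above and relies only on the `session_id` field shown in the sample response; the function names and the injectable `opener` parameter are hypothetical, added so the flow can be exercised without a live server.

```python
import json
import urllib.request


def post_json(url, payload, opener=urllib.request.urlopen):
    """POST a JSON payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with opener(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def create_and_dispatch(base_url, task, opener=urllib.request.urlopen):
    """Create a session, then dispatch one task to it."""
    session = post_json(
        f"{base_url}/sessions",
        {"provider": "ollama", "model": "qwen3-4b"},
        opener,
    )
    session_id = session["session_id"]
    result = post_json(
        f"{base_url}/sessions/{session_id}/dispatch",
        {"task": task, "expected_output_type": "markdown"},
        opener,
    )
    return session_id, result

# Usage against a running sidecar:
#   sid, result = create_and_dispatch("http://127.0.0.1:8900", "Explain the CAP theorem")
```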

Security

The sidecar includes multiple security layers:

Network Binding

Default: loopback only (127.0.0.1), so only local processes can connect
--bind-all: binds to 0.0.0.0 (all interfaces) and requires --auth-token

Authentication

When --auth-token is set, every request must include this header:

Authorization: Bearer your-secret-token

Token comparison uses secrets.compare_digest() (constant-time) to prevent timing attacks.
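To illustrate the constant-time check, here is a minimal sketch of validating a Bearer header with `secrets.compare_digest()`. It is not the sidecar's actual code; the function name `is_authorized` and the exact header parsing are assumptions.

```python
import secrets


def is_authorized(authorization_header, expected_token):
    """Return True if the header carries the expected Bearer token."""
    prefix = "Bearer "
    if not authorization_header or not authorization_header.startswith(prefix):
        return False
    presented = authorization_header[len(prefix):]
    # Compare as bytes: compare_digest rejects non-ASCII str arguments,
    # and its running time does not depend on where the inputs differ.
    return secrets.compare_digest(
        presented.encode("utf-8"), expected_token.encode("utf-8")
    )
```

A plain `==` comparison can return as soon as the first differing character is found, which is the timing signal that the constant-time comparison avoids.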

Rate Limiting

Limit	Default
General requests	120 per 60 seconds per IP
Dispatch requests	30 per 60 seconds per IP
Max tracked IPs	10,000 (when the cap is reached, the oldest 2,000 are evicted)
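The limits in this table can be pictured with a small per-IP sliding-window limiter. This is a sketch under stated assumptions, not the sidecar's implementation: the class name is hypothetical, "oldest" is taken to mean the IPs that were first seen earliest, and the window algorithm itself is a guess.

```python
import time
from collections import OrderedDict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each IP."""

    def __init__(self, limit=120, window=60.0, max_ips=10_000, evict=2_000,
                 clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.max_ips = max_ips
        self.evict = evict
        self.clock = clock
        self._hits = OrderedDict()  # ip -> deque of timestamps, in first-seen order

    def tracked(self):
        """Return the currently tracked IPs, oldest first."""
        return list(self._hits)

    def allow(self, ip):
        now = self.clock()
        hits = self._hits.get(ip)
        if hits is None:
            if len(self._hits) >= self.max_ips:
                # Table is full: drop the oldest entries in one batch.
                for _ in range(min(self.evict, len(self._hits))):
                    self._hits.popitem(last=False)
            hits = self._hits[ip] = deque()
        # Forget timestamps that have left the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Under these assumptions, the two rows of the table would correspond to two instances: `SlidingWindowLimiter(limit=120)` for general requests and `SlidingWindowLimiter(limit=30)` for dispatch requests.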

Request Limits

Limit	Default
Max body size	10 MB
Max sessions	64

Error Sanitization

Error responses are sanitized: stack traces and file paths are stripped to prevent information leakage.
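One way such sanitization could work is sketched below: keep only the final summary line of a traceback and mask anything that looks like a file path. The function name, the regular expression, and the `<path>` placeholder are all hypothetical; the sidecar's real rules may differ.

```python
import re

# Matches Windows paths such as C:\dir\file and POSIX paths with at least
# two components such as /srv/app/file.py (an illustrative heuristic only).
_PATH_RE = re.compile(r"(?:[A-Za-z]:\\[\w.\-\\]+|(?:/[\w.\-]+){2,})")


def sanitize_error(text):
    """Reduce an error or traceback to a one-line message with paths masked."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    # For a Python traceback the last line is the "ExcType: message" summary.
    summary = lines[-1].strip() if lines else "internal error"
    return _PATH_RE.sub("<path>", summary)
```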

Response Headers

X-Content-Type-Options: nosniff
Cache-Control: no-store

Configuration

All sidecar options:

crp serve [OPTIONS]

Options:
  --host TEXT        Bind address (default: 127.0.0.1)
  --port INT         Port (default: 8900)
  --auth-token TEXT  Bearer token for authentication
  --bind-all         Bind to 0.0.0.0 (requires --auth-token)
  --max-sessions INT Maximum concurrent sessions (default: 64)
  --max-body-mb INT  Maximum request body in MB (default: 10)
  --rate-limit INT   Requests per minute per IP (default: 120)
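When the sidecar is launched from another Python process, the argument list can be assembled programmatically. The sketch below uses only the flags listed above and enforces the documented rule that `--bind-all` requires `--auth-token`; the function name `build_serve_command` is hypothetical, and it makes no assumption about how `--host` and `--bind-all` interact.

```python
def build_serve_command(host=None, port=None, auth_token=None, bind_all=False,
                        max_sessions=None, max_body_mb=None, rate_limit=None):
    """Return an argv list for `crp serve`; unset options keep their defaults."""
    if bind_all and not auth_token:
        raise ValueError("--bind-all requires --auth-token")
    cmd = ["crp", "serve"]
    if bind_all:
        cmd.append("--bind-all")
    options = (
        ("--host", host),
        ("--port", port),
        ("--auth-token", auth_token),
        ("--max-sessions", max_sessions),
        ("--max-body-mb", max_body_mb),
        ("--rate-limit", rate_limit),
    )
    for flag, value in options:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd
```

The resulting list can be passed directly to `subprocess.Popen`, which avoids shell quoting issues with the token.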

OpenAPI Schema

The full OpenAPI 3.1.0 specification is available at schemas/openapi.json in the repository. Import it into Postman, Swagger UI, or any OpenAPI-compatible tool.