▸ v0.1 · self-hosted · agplv3

Cut your LLM bill
40–70%.

Drop-in proxy for Claude Code, Cursor, Aider, Codex CLI, Continue, OpenAI + Anthropic SDKs. Semantic cache, persistent memory, context compression. No silent model swaps.

▸ free forever · bring your own provider keys

[drop-in][no code changes][byok][self-host or hosted][open source]
~/your-app/.envlive
ANTHROPIC_BASE_URL=https://synapse-proxy-pbmw.onrender.com
ANTHROPIC_AUTH_TOKEN=syn_xxxxxxxxxxxxx
# everything else: unchanged.
6.28s → 0.05s
latency on cache hit
$3.55
saved per 3 cache hits
115×
speed-up
200k
claude opus context
▸ how it works · 01

Four mechanisms.
One install.

/01

semantic cache

Embed every prompt. Near-identical requests return cached responses without ever touching the model. Repeat queries from a coding harness cost $0.

cosine ≥ 0.97
/02

persistent memory

Per-user fact store. Inject only the relevant snippets instead of resending the entire chat history every turn. GDPR-wipe with one call.

≈90% token reduction
/03

context compression

Long prompts get LLM-summarised. The compressed prefix is itself cached so you stop paying to re-encode the same context across calls.

4k → 800 tokens
/04

no silent swaps

Every request runs against the model the caller named. We never reroute Opus → Haiku to make a margin. Transparency is the point.

transparency by default
▸ install · 02

Two steps.
No code changes.

01 / point your harness
export ANTHROPIC_BASE_URL="https://synapse-proxy-pbmw.onrender.com"
export ANTHROPIC_AUTH_TOKEN="syn_xxxxxxxxxxxxx"
unset ANTHROPIC_API_KEY
claude

Anthropic env vars. Add to ~/.zshrc or ~/.bashrc.

02 / watch the bill drop
tsharnessmodellatencyhitcost
11:42:03claude-codeopus-4.71.84s$0.038
11:42:11claude-codeopus-4.70.05s$0.000
11:42:25cursorgpt-4o2.12s$0.027
11:42:38claude-codeopus-4.70.05s$0.000
11:42:51aiderhaiku-4.50.82s$0.004
11:43:02cursorgpt-4o0.04s$0.000
▸ compatibility matrix

Works with every harness
you already use.

~ $ installone command · auto-detects
# install once, wires every harness on your machine
npx synapse-llm@latest setup

# or pin a single harness
npx synapse-llm@latest setup claude-code

▸ or wire it manually — click a harness for the snippet

claude codeAnthropic env vars. Add to ~/.zshrc or ~/.bashrc.
export ANTHROPIC_BASE_URL="https://synapse-proxy-pbmw.onrender.com"
export ANTHROPIC_AUTH_TOKEN="syn_xxxxxxxxxxxxx"
unset ANTHROPIC_API_KEY
claude
▸ architecture · 03

One process.
Two ports. Zero magic.

  ┌──────────────┐      ┌──────────────────────────────────────────┐      ┌────────────────┐
  │  YOUR APP    │  ─→  │  SYNAPSE  ·  cache · memory · compress   │  ─→  │  PROVIDER       │
  │  any harness │      │  openai-compat  +  anthropic-compat       │      │  anthropic · …  │
  └──────────────┘      └──────────────────────────────────────────┘      └────────────────┘
                ▲                          ▲                                            ▲
        │                          │                                            │
   1 line install            <5 ms overhead                            anthropic · openai
   one env var pair          per request                               google · groq · openrouter