▸ v0.1 · self-hosted · agplv3

Cut your LLM bill
40–70%.

Drop-in proxy for Claude Code, Cursor, Aider, Codex CLI, Continue, OpenAI + Anthropic SDKs. Semantic cache, persistent memory, context compression. No silent model swaps.

open dashboard→github ↗AGPLv3

▸ free forever · bring your own provider keys

[drop-in][no code changes][byok][self-host or hosted][open source]

~/your-app/.envlive

ANTHROPIC_BASE_URL=https://synapse-proxy-pbmw.onrender.com
ANTHROPIC_AUTH_TOKEN=syn_xxxxxxxxxxxxx
# everything else: unchanged.

6.28s → 0.05s

latency on cache hit

$3.55

saved per 3 cache hits

115×

speed-up

200k

claude opus context

▸ how it works · 01

Four mechanisms.
One install.

/01

semantic cache

Embed every prompt. Near-identical requests return cached responses without ever touching the model. Repeat queries from a coding harness cost $0.

cosine ≥ 0.97

/02

persistent memory

Per-user fact store. Inject only the relevant snippets instead of resending the entire chat history every turn. GDPR-wipe with one call.

≈90% token reduction

/03

context compression

Long prompts get LLM-summarised. The compressed prefix is itself cached so you stop paying to re-encode the same context across calls.

4k → 800 tokens

/04

no silent swaps

Every request runs against the model the caller named. We never reroute Opus → Haiku to make a margin. Transparency is the point.

transparency by default

▸ install · 02

Two steps.
No code changes.

01 / point your harness

export ANTHROPIC_BASE_URL="https://synapse-proxy-pbmw.onrender.com"
export ANTHROPIC_AUTH_TOKEN="syn_xxxxxxxxxxxxx"
unset ANTHROPIC_API_KEY
claude

▸ Anthropic env vars. Add to ~/.zshrc or ~/.bashrc.

02 / watch the bill drop

tsharnessmodellatencyhitcost

11:42:03claude-codeopus-4.71.84s○$0.038

11:42:11claude-codeopus-4.70.05s◆$0.000

11:42:25cursorgpt-4o2.12s○$0.027

11:42:38claude-codeopus-4.70.05s◆$0.000

11:42:51aiderhaiku-4.50.82s○$0.004

11:43:02cursorgpt-4o0.04s◆$0.000

▸ compatibility matrix

Works with every harness
you already use.

~ $ installone command · auto-detects

# install once, wires every harness on your machine
npx synapse-llm@latest setup

# or pin a single harness
npx synapse-llm@latest setup claude-code

▸ or wire it manually — click a harness for the snippet

▸ claude codeAnthropic env vars. Add to ~/.zshrc or ~/.bashrc.

export ANTHROPIC_BASE_URL="https://synapse-proxy-pbmw.onrender.com"
export ANTHROPIC_AUTH_TOKEN="syn_xxxxxxxxxxxxx"
unset ANTHROPIC_API_KEY
claude

▸ architecture · 03

One process.
Two ports. Zero magic.

  ┌──────────────┐      ┌──────────────────────────────────────────┐      ┌────────────────┐
  │  YOUR APP    │  ─→  │  SYNAPSE  ·  cache · memory · compress   │  ─→  │  PROVIDER       │
  │  any harness │      │  openai-compat  +  anthropic-compat       │      │  anthropic · …  │
  └──────────────┘      └──────────────────────────────────────────┘      └────────────────┘
                ▲                          ▲                                            ▲
        │                          │                                            │
   1 line install            <5 ms overhead                            anthropic · openai
   one env var pair          per request                               google · groq · openrouter

Cut your LLM bill40–70%.

Four mechanisms.One install.