semantic cache
Embed every prompt. Near-identical requests return cached responses without ever touching the model. Repeat queries from a coding harness cost $0.
Drop-in proxy for Claude Code, Cursor, Aider, Codex CLI, Continue, OpenAI + Anthropic SDKs. Semantic cache, persistent memory, context compression. No silent model swaps.
▸ free forever · bring your own provider keys
ANTHROPIC_BASE_URL=https://synapse-proxy-pbmw.onrender.com ANTHROPIC_AUTH_TOKEN=syn_xxxxxxxxxxxxx # everything else: unchanged.
Embed every prompt. Near-identical requests return cached responses without ever touching the model. Repeat queries from a coding harness cost $0.
Per-user fact store. Inject only the relevant snippets instead of resending the entire chat history every turn. GDPR-wipe with one call.
Long prompts get LLM-summarised. The compressed prefix is itself cached so you stop paying to re-encode the same context across calls.
Every request runs against the model the caller named. We never reroute Opus → Haiku to make a margin. Transparency is the point.
export ANTHROPIC_BASE_URL="https://synapse-proxy-pbmw.onrender.com" export ANTHROPIC_AUTH_TOKEN="syn_xxxxxxxxxxxxx" unset ANTHROPIC_API_KEY claude
▸ Anthropic env vars. Add to ~/.zshrc or ~/.bashrc.
# install once, wires every harness on your machine npx synapse-llm@latest setup # or pin a single harness npx synapse-llm@latest setup claude-code
▸ or wire it manually — click a harness for the snippet
export ANTHROPIC_BASE_URL="https://synapse-proxy-pbmw.onrender.com" export ANTHROPIC_AUTH_TOKEN="syn_xxxxxxxxxxxxx" unset ANTHROPIC_API_KEY claude
┌──────────────┐ ┌──────────────────────────────────────────┐ ┌────────────────┐
│ YOUR APP │ ─→ │ SYNAPSE · cache · memory · compress │ ─→ │ PROVIDER │
│ any harness │ │ openai-compat + anthropic-compat │ │ anthropic · … │
└──────────────┘ └──────────────────────────────────────────┘ └────────────────┘
▲ ▲ ▲
│ │ │
1 line install <5 ms overhead anthropic · openai
one env var pair per request google · groq · openrouter