UncommonRoute

Cut your API bill in half without giving up performance.

UncommonRoute plugs into Claude Code, Cursor, Codex, and the OpenAI SDK. It runs locally, analyzes task complexity, conversation structure, tool use, available models, and budget constraints, then routes each request to the right model for the job.

On a held-out 100-case SWE-bench Verified split, UncommonRoute solved 75/100 tasks vs 74/100 for Opus-only. With task quality matched, API cost dropped by 53%.




Quick Start · Dashboard · Clients · Benchmark · Privacy



pipx install uncommon-route
uncommon-route init

|              | Opus-only | UncommonRoute | Saved   |
|--------------|-----------|---------------|---------|
| Tasks solved | 74 / 100  | 75 / 100      | Matched |
| API cost     | $54.73    | $25.66        | -53%    |

Numbers from a held-out 100-case SWE-bench Verified split in TwinRouterBench. Reproduction commands below.


UncommonRoute Dashboard


Quick Start

pipx install uncommon-route
uncommon-route init

init walks you through connection setup, saves credentials, and configures Claude Code, Codex, Cursor, or the OpenAI SDK. After setup, run a health check anytime:

uncommon-route doctor
No pipx? Inside a venv?
  • macOS: brew install pipx && pipx ensurepath
  • Ubuntu: sudo apt install pipx && pipx ensurepath
  • Fedora: sudo dnf install pipx && pipx ensurepath
  • Already inside a virtualenv: python3 -m pip install uncommon-route
  • Seeing an "externally managed environment" error: use pipx or a venv instead of forcing a system install.
  • Need a specific Python version: pipx install --python python3.12 uncommon-route

Visual Routing

UncommonRoute isn't just a pass-through proxy. The Dashboard records and explains every routing decision: whether the request was classified as simple, medium, or complex, which model was selected, its actual or estimated cost, and how to tune the policy.

uncommon-route serve
# -> http://localhost:8403/dashboard/
| Page        | What it does |
|-------------|--------------|
| Home        | Live requests, complexity distribution, model choices, and cost changes |
| Playground  | Type a prompt and preview complexity, confidence, estimated cost, and signal readout |
| Explain     | Inspect each routing decision per session, including model, latency, and cost |
| Activity    | See request complexity, served quality, transport paths, capability lanes, model usage, and cost distribution |
| Routing     | Configure auto / fast / best, or set primary and fallback models per complexity tier |
| Models      | Browse the active model pool, providers, capability tags, and input / output prices |
| Connections | Manage the primary upstream and BYOK provider keys, and verify connection status |
| Budget      | Set per-request, hourly, or daily spend limits |
| Feedback    | Mark routes as too strong, just right, or too weak to improve the local classifier |

To kick the tires: type a prompt in Playground, inspect the predicted complexity, confidence, and cost estimate, then open Explain / Activity to trace real routing decisions.


Supported Clients

| Client      | Minimal setup | Notes |
|-------------|---------------|-------|
| Claude Code | `export ANTHROPIC_BASE_URL="http://localhost:8403"` | Uses the Anthropic-compatible proxy |
| OpenAI SDK  | `export OPENAI_BASE_URL="http://localhost:8403/v1"` | Use `uncommon-route/auto` as the model ID |
| Codex       | `export OPENAI_BASE_URL="http://localhost:8403/v1"` | Uses the OpenAI-compatible API |
| Cursor      | `export OPENAI_BASE_URL="http://localhost:8403/v1"` | No application code changes |
| OpenClaw    | Install the plugin | See openclaw.ai |

Claude Code also needs a placeholder token:

export ANTHROPIC_AUTH_TOKEN="not-needed"

OpenAI SDK example:

from openai import OpenAI

client = OpenAI(base_url="http://localhost:8403/v1")
resp = client.chat.completions.create(
    model="uncommon-route/auto",
    messages=[{"role": "user", "content": "fix a typo in the README"}],
)

How UncommonRoute Saves Money

The savings don't come from using less AI. They come from not sending easy requests to frontier models.

"hello"                         -> simple
"fix a typo in the README"       -> simple
"find and fix this failing test" -> medium
"refactor this 500-line module"  -> medium / complex
"design a distributed scheduler" -> complex

Simple requests go to lightweight models. Medium requests go to capable mid-tier models. Complex requests escalate to the strongest model you've configured. Each decision is made per request, so a single conversation isn't tied to one model.
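The tier-to-model mapping can be sketched in a few lines. This is an illustration only, not UncommonRoute's internals; the model names and the `pick_model` helper are placeholders:

```python
# Hypothetical sketch of per-request tier selection. Model names are
# placeholders, not UncommonRoute's actual defaults.
TIER_MODELS = {
    "simple": "provider/lightweight",
    "medium": "provider/mid-tier",
    "complex": "provider/frontier",
}

def pick_model(complexity: str) -> str:
    # Unrecognized classes escalate to the strongest tier rather than
    # risk under-serving a complex request.
    return TIER_MODELS.get(complexity, TIER_MODELS["complex"])
```

Because the lookup runs per request, consecutive turns in one conversation can land on different models.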


Why UncommonRoute

If you use AI agents for coding every day, a lot of that spend goes toward work that doesn't need the most expensive model: typo fixes, small edits, simple test runs, short explanations.

UncommonRoute does one thing. It doesn't replace Claude Code, Cursor, or Codex, and doesn't try to make cheaper models smarter. It focuses on one decision:

Which model is the right fit for this request?

Routing happens locally and independently for each agent step. You can inspect every decision in the Dashboard instead of trusting a black-box proxy.


Highlights

| Capability | Result |
|------------|--------|
| Local routing | The router runs locally; no extra hop through a cloud routing service |
| Per-request routing | Each agent step is routed independently instead of pinning the whole session to one model tier |
| Automatic model selection | Routes based on task difficulty, conversation structure, tool use, and provider availability |
| Explainable decisions | See complexity, confidence, signal readout, selected model, and cost for each route |
| Adjustable policy | Use auto / fast / best, or override simple / medium / complex with primary and fallback models |
| Spend caps | Set per-request, hourly, or daily API spend limits |
| Local feedback | Mark routes as too strong, just right, or too weak to improve the classifier locally |
| Drop-in integration | Claude Code, Cursor, Codex, OpenAI SDK, and OpenClaw work without application code changes |

Benchmark

UncommonRoute is evaluated on TwinRouterBench: 970 router-visible prefixes from 520 instances across SWE-Bench, BFCL, mtRAG, QMSum, and PinchBench, with execution-verified target tier labels. The end-to-end validation below uses a 100-case held-out SWE-bench Verified split.

Matched task quality, 53% lower API cost

| Policy        | Tasks solved | API cost | vs Opus-only |
|---------------|--------------|----------|--------------|
| Opus 4.6 only | 74 / 100     | $54.73   | baseline     |
| UncommonRoute | 75 / 100     | $25.66   | -53%         |

Put another way: this isn't a "spend less, solve fewer tasks" trade-off. On this split, UncommonRoute matched Opus-only on tasks solved while cutting realized API spend by 53%.

"Tasks solved" means the number of successfully resolved tasks out of 100 held-out SWE-bench Verified cases. "API cost" is realized model-call spend and doesn't include the penalty cost reported in Table 4.

Reproduce

python -m pip install -e ".[dev]"
python -m pip install "git+https://github.com/CommonstackAI/TwinRouterBench.git"
python scripts/eval_v2.py --split holdout
python scripts/bench_overhead.py --iterations 50 --json

Routing Overhead

Local CPU, warm process:

| Metric | Latency |
|--------|---------|
| p50    | 25.6ms  |
| p90    | 32.1ms  |

Cold start loads the embedding model and can take a few seconds. After warm-up, a single route() call typically takes tens of milliseconds.


Privacy

Routing runs on your machine. Your prompts don't go through a separate routing service; they're sent only to the upstream provider you configure.

uncommon-route telemetry status

Diagnostic exports are local by default:

uncommon-route support bundle

The redacted support bundle is written to ~/.uncommon-route/support/. It leaves your machine only if you choose to share it.


Spend Caps

Set a hard ceiling on API spend:

uncommon-route spend set daily 20.00
uncommon-route spend status

You can also configure per-request, hourly, or daily limits in the Dashboard. Once a limit is reached, requests fall back to the lowest-cost available tier instead of failing outright.
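The fallback-instead-of-failing behavior can be sketched as follows. This is a hypothetical helper, not UncommonRoute's real API; `prices` maps model IDs to per-request cost:

```python
def select_with_budget(prices: dict[str, float], preferred: str,
                       spent: float, cap: float) -> str:
    # Sketch of spend-cap fallback: once the cap is reached, route to
    # the lowest-cost available model instead of erroring out.
    if spent >= cap:
        return min(prices, key=prices.__getitem__)
    return preferred
```

Requests keep flowing under a cap; only their quality tier drops.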


How It Works

Each request runs through three local signals. The router first classifies task complexity, then picks the best model from your configured upstream.

| Signal     | What it looks at | Typical overhead |
|------------|------------------|------------------|
| Metadata   | Conversation structure, tool use, context depth | <1ms |
| Embedding  | BGE classifier over the request, recent agent state, and metadata; KNN fallback when uncertain | ~25-35ms |
| Structural | Text and conversation complexity; active only when needed, shadow-tracked otherwise | <1ms |

The signals vote, and the ensemble decides the complexity class. The router then weighs capabilities, transport, upstream availability, and price. From the matching candidates, it picks the lowest-cost option. Unknown upstream pricing is handled conservatively.
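A minimal sketch of the voting step, assuming a simple majority with ties breaking toward the higher tier so uncertainty escalates (illustrative only; the real ensemble also weighs per-signal confidence):

```python
from collections import Counter

TIERS = ["simple", "medium", "complex"]

def ensemble_decision(votes: list[str]) -> str:
    # Majority vote over signal outputs. The tie-break key prefers the
    # higher tier, so a split decision escalates rather than downgrades.
    counts = Counter(votes)
    return max(counts, key=lambda t: (counts[t], TIERS.index(t)))
```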

Routing is per request / per agent step. The session isn't pinned to one model. Protocol constraints, such as Anthropic thinking continuations, are still respected.

UncommonRoute also learns from local feedback: high-confidence agreement grows the embedding index, while low-confidence predictions escalate instead of silently sending complex work to an underpowered model.
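The confidence gate described above can be sketched like this. The function name, threshold, and index shape are hypothetical, not UncommonRoute's actual implementation:

```python
def classify_with_feedback(index: list, embedding, predicted: str,
                           confidence: float, threshold: float = 0.8) -> str:
    # High-confidence predictions grow the local KNN index; low-confidence
    # ones escalate to "complex" and are not stored, so uncertain routes
    # never poison the index or under-serve the request.
    if confidence >= threshold:
        index.append((embedding, predicted))
        return predicted
    return "complex"
```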


Who It's For

  • You use Claude Code, Cursor, Codex, or another coding agent every day.
  • Most of your spend goes to frontier models, but many requests don't need that tier.
  • You want lower API cost without sending prompts to an extra hosted router.
  • You need routing at request granularity, not one model choice for the entire session.
  • You want routing that is explainable, adjustable, and feedback-driven.

Who It's Not For

  • You only call LLMs occasionally and your bill is already small.
  • You expect a router to make low-cost models fundamentally more capable. UncommonRoute doesn't make that claim.
  • You want every request to use the strongest model, no matter what. You can use uncommon-route/best, but the savings will be smaller.

Advanced Configuration

Connect Providers

Commonstack (managed): one key gets you OpenAI, Anthropic, Google, xAI, MiniMax, Moonshot, and DeepSeek.

export UNCOMMON_ROUTE_UPSTREAM="https://api.commonstack.ai/v1"
export UNCOMMON_ROUTE_API_KEY="csk-your-key"
uncommon-route serve

BYOK provider keys: auto-routing only considers providers you've registered.

uncommon-route provider add openai     sk-...
uncommon-route provider add anthropic  sk-ant-...
uncommon-route provider add google     AIza...
uncommon-route serve

UncommonRoute doesn't automatically read OPENAI_API_KEY or ANTHROPIC_API_KEY. Use init, a saved connection, or one of the manual setup paths above.

Routing Modes

| Mode | Model ID | Behavior |
|------|----------|----------|
| auto | `uncommon-route/auto` | Default mode; optimizes for quality per dollar |
| fast | `uncommon-route/fast` | Cost-first; prefers lower-cost models when quality is acceptable |
| best | `uncommon-route/best` | Quality-first; prefers the strongest available model |
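Switching modes is purely a model-ID change on the client side; nothing else in the request changes. A tiny helper for illustration (not part of the package):

```python
def model_id(mode: str) -> str:
    # Map a routing mode to the model ID passed to your OpenAI-compatible
    # client; the proxy interprets the suffix as the routing policy.
    assert mode in {"auto", "fast", "best"}, f"unknown mode: {mode}"
    return f"uncommon-route/{mode}"
```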

Provider Management

uncommon-route provider list
uncommon-route provider add <name> <api-key>
uncommon-route provider remove <name>

Supported providers: commonstack, openai, anthropic, google, xai, minimax, moonshot, deepseek.

Environment variables
| Variable | Meaning |
|----------|---------|
| UNCOMMON_ROUTE_UPSTREAM | Upstream URL for the managed path, e.g. https://api.commonstack.ai/v1; ignored in BYOK mode |
| UNCOMMON_ROUTE_API_KEY | API key used with UNCOMMON_ROUTE_UPSTREAM; not a fallback for per-provider keys |
| UNCOMMON_ROUTE_PORT | Local proxy port, default 8403 |

Diagnostics

If you hit routing errors, upstream failures, or need to file an issue, export a redacted diagnostics bundle:

uncommon-route support bundle
uncommon-route support request <request_id>

The bundle includes recent traces, errors, stats, provider/config snapshots, and redacted local state. It's saved locally by default.


Stop and Uninstall

If it's running in the foreground, press Ctrl+C. If it's running as a daemon:

uncommon-route stop
uncommon-route logs --follow

To stop routing clients through UncommonRoute, remove the shell block added by init, then restart your terminal. Common locations include ~/.zshrc, ~/.bashrc, and ~/.config/fish/config.fish.

For the current shell only:

unset OPENAI_BASE_URL OPENAI_API_KEY ANTHROPIC_BASE_URL ANTHROPIC_AUTH_TOKEN ANTHROPIC_API_KEY

Uninstall:

pipx uninstall uncommon-route
# If installed inside a venv:
python3 -m pip uninstall uncommon-route

Remove local state, including connections, provider keys, logs, and traces:

rm -rf ~/.uncommon-route/

Development

git clone https://github.com/CommonstackAI/UncommonRoute.git
cd UncommonRoute
pip install -e ".[dev]"
python -m pytest tests -v

License

MIT. See LICENSE.
