2026-07-04

Brick tries to route each prompt to the right model

For Claude and Claude Code users, this is interesting because it attacks a very real pain point: once you have multiple models in play, the “which one should handle this?” problem becomes annoying and expensive fast. Brick is basically trying to sit in front of that mess and make the choice for you, with a strong emphasis on keeping the Claude Code workflow intact.

Key Points

Brick is a Mixture-of-Models routing gateway that decides, per query, which backend should handle the request.
It claims to read both capability and complexity from the prompt, then route to the best model in a pool of open- and closed-weight LLMs.
The project emphasizes a single forward decision per query, not cascade routing. That means no chain of fallback calls and no paying for failed attempts.
It exposes a drop-in model: "brick" style OpenAI-compatible endpoint.
The repo specifically calls out Claude Code as a use case: Brick can sit in front of Claude Code and route requests to haiku, sonnet, or opus depending on the task.
There’s also support for using Brick to unify other models like OpenAI models, GLM, DeepSeek, Kimi, and Qwen behind one endpoint.
The quickest path today is the CLI, which self-hosts the router and wires it into Claude Code.
The CLI flow uses brick claude on to set ANTHROPIC_BASE_URL in Claude Code settings and start the router.
The router offers five modes: eco, lite, mid, pro, and max, which map your cost/quality preference to different model tiers.
The source says the “thinking effort” control in Claude Code’s model picker maps to these modes, not to a hidden reasoning budget.
Selecting native opus, sonnet, or haiku bypasses Brick entirely.
Observability is part of the pitch: brick claude status gives a live dashboard with routing counts, effort distribution, difficulty mix, and estimated savings.
The README frames Brick as a way to use the cheapest model that can still do the job, especially for coding workflows.

My Take

What strikes me is that this is less about “smart routing” as a flashy idea and more about brutally practical cost control. If you’ve spent time around Claude Code, you know the temptation is to leave it on the strongest model and stop thinking about it. That works until the bill shows up, or until you realize half your prompts were simple enough for something cheaper.

I think the most compelling part here is the Claude Code integration. A lot of tooling in this space is technically clever but operationally annoying. Brick seems aimed at the opposite: keep the familiar UX, change the backend logic, and don’t force the user to become a routing expert. That’s the right instinct. Developers will tolerate a lot if the tool stays out of the way.

At the same time, I’d be curious whether the “capability and complexity extraction” is as reliable as the README makes it sound. This might be genuinely strong, or it might work well on the obvious cases and get fuzzy on borderline prompts. Routing is one of those problems where the demo can look magical and the long tail can be messy. I’d want to see how it behaves on real coding sessions: short follow-ups, ambiguous bug reports, mixed reasoning-and-implementation tasks, and prompts that need a strong model for one tiny part but not the rest.

The “no cascades” part is also interesting. I like the discipline of making one decision and living with it. Cascades sound elegant on paper, but in practice they can feel like token compost. Still, a single-shot router only helps if its judgment is good enough. If it misroutes too often, you save money in the wrong places and lose time in the right ones.

I’d actually try this in a private Claude Code setup if I were juggling multiple models, especially if I were paying for more than one frontier model and wanted a cleaner default. The observability angle matters too. If Brick can show me what it routed where, and whether “savings” are real rather than hand-wavy, that’s much more interesting than yet another routing abstraction with a glossy pitch.

The big takeaway: Brick is trying to make model selection boring, and that’s probably the right ambition. If it works as advertised, it could be one of those utilities you stop noticing because it quietly saves money and keeps you on the right model.

Reference: GitHub - regolo-ai/brick-SR1: brick is a smart AI Models router, based on complexity & capabilities extraction from the query to the models via proprietary spatial embedding algorythm