Here are a few key differentiators vs LiteLLM today:
- Nexus does MCP server aggregation and LLM routing - LiteLLM only does LLM routing
- The Nexus router is a standalone binary that can run with minimal TOML configuration and, optionally, Redis - LiteLLM is a whole package with a dashboard, database, etc.
- Nexus is written in Rust - LiteLLM is written in Python
That said, LiteLLM is an impressive project. We're just getting started with Nexus, so stay tuned for a steady barrage of feature launches in the coming months :)
There's also the tooling: you can use aider, which is OK, but Claude Code and Gemini CLI will always be superior, and they only work correctly with their respective models.
For well-defined tasks that Claude creates, I'll pass off execution to a locally run model (running in another Claude Code instance) and it works just fine. Not for every task, but more than you might think.
It's faster than I am, and it knows things like ffmpeg flags I don't care to memorize.
Even opencode running on a local model is decent at this.
We use feature flags. However, cleaning them up is rarely done. It typically takes me ~3 minutes to clean one up.
To clean up the flag:
1) delete the test where the flag is off
2) delete all the code that sets the flag to on
3) replace anything that reads the value of the flag with true
4) resolve all the "true" expressions, cleaning up ifs and now-constant parameters
5) prep a pull request and send it for review
This is all fully supported by the indexing and refactoring tooling in my IDE.
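To make steps 3 and 4 concrete, here is a rough sketch of that kind of mechanical rewrite using Python's standard `ast` module (Python 3.9+ for `ast.unparse`). It is not what the IDE does internally. The `is_enabled("...")` accessor, the `new_checkout` flag name, and the `checkout` function are all made up for illustration, and it only folds `not` and `if`, not `and`/`or` chains. Note that `ast.unparse` drops comments and original formatting, so a real tool would need a formatting-preserving approach.

```python
import ast


class ResolveFlag(ast.NodeTransformer):
    """Treat is_enabled("<flag>") as True and fold the resulting constants."""

    def __init__(self, flag_name):
        self.flag_name = flag_name

    def visit_Call(self, node):
        self.generic_visit(node)
        func = node.func
        # Accept both flags.is_enabled(...) and a bare is_enabled(...) (hypothetical API).
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        if (
            name == "is_enabled"
            and len(node.args) == 1
            and isinstance(node.args[0], ast.Constant)
            and node.args[0].value == self.flag_name
        ):
            # Step 3: anything reading the flag becomes the constant True.
            return ast.copy_location(ast.Constant(value=True), node)
        return node

    def visit_UnaryOp(self, node):
        self.generic_visit(node)
        operand = node.operand
        if (
            isinstance(node.op, ast.Not)
            and isinstance(operand, ast.Constant)
            and isinstance(operand.value, bool)
        ):
            # Step 4: "not True" folds to False (and vice versa).
            return ast.copy_location(ast.Constant(value=not operand.value), node)
        return node

    def visit_If(self, node):
        self.generic_visit(node)
        test = node.test
        if isinstance(test, ast.Constant) and isinstance(test.value, bool):
            if test.value:
                return node.body  # keep only the "flag on" branch
            # Keep the else branch, or a pass so the enclosing block is not left empty.
            return node.orelse or ast.Pass()
        return node


def resolve_flag(source, flag_name):
    """Return `source` with the given flag resolved to True."""
    tree = ResolveFlag(flag_name).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


before = '''
def checkout(cart):
    if flags.is_enabled("new_checkout"):
        return new_flow(cart)
    else:
        return old_flow(cart)
'''

print(resolve_flag(before, "new_checkout"))
```

Run on the sample, the `if`/`else` collapses to the single `return new_flow(cart)` line, and reads of any other flag are left untouched. The rewrite works on the syntax tree rather than on text, which is why no grep/find over the codebase is needed for this step.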
However, when I prompted the LLM with those steps (and examples), it failed. Over and over again. It would delete tests where the value was true, forget to resolve the expressions, and try to run grep/find across a ginormous codebase.
If this were an intern, I would only have to correct them once. I would correct the LLM, and then it would make a different mistake. It wouldn't follow the instructions, and it would use tools I told it not to use.
The LLM took 5-10 minutes to make the change, and then I'd have to spend a couple of minutes fixing things. At that point it wasn't saving me any time.
I've got a TONNE of low-hanging fruit that I can't give to an intern, but could easily sic a tool as capable as an intern on. This was not that.
For repeating patterns, I'll identify 1-3 commit hashes or PRs, reference them in a slash command, and keep the command up to date if/when edge cases occur.