Posted by u/AMeckes 5 months ago
Show HN: Any-LLM – Lightweight router to access any LLM Provider (github.com/mozilla-ai/any...)
We built any-llm because we needed a lightweight router for LLM providers with minimal overhead. Switching between models is just a string change: update "openai/gpt-4" to "anthropic/claude-3" and you're done.

It uses official provider SDKs when available, which helps since providers handle their own compatibility updates. No proxy or gateway service needed either, so getting started is pretty straightforward - just pip install and import.
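In practice it looks something like this (a minimal sketch; check the repo for the exact current API):

```python
# Minimal sketch of the string-change switching described above;
# exact import and parameter names may differ from the current any-llm API.
from any_llm import completion

# Same call shape for every provider; only the model string changes.
response = completion(
    model="openai/gpt-4",  # swap to "anthropic/claude-3" to change providers
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```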

Currently supports 20+ providers including OpenAI, Anthropic, Google, Mistral, and AWS Bedrock. Would love to hear what you think!

swyx · 5 months ago
> LiteLLM: While popular, it reimplements provider interfaces rather than leveraging official SDKs, which can lead to compatibility issues and unexpected behavior modifications

with no vested interest in litellm, i'll challenge you on this one. what compatibility issues have come up? (i expect text to have the least, and probably voice etc have more but for text i've had no issues)

you -want- to reimplement interfaces because you have to normalize api's. in fact without looking at any-llm code deeply i question how you do ANY router without reimplementing interfaces. that's basically the whole job of the router.

AMeckes · 5 months ago
Both approaches work well for standard text completion. Issues tend to be around edge cases like streaming behavior, timeout handling, or new features rolling out.

You're absolutely right that any router reimplements interfaces for normalization. The difference is what layer we reimplement at. We use SDKs where available for HTTP/auth/retries and reimplement normalization.

Bottom line is we both reimplement interfaces, just at different layers. Our bet on SDKs is mostly about maintenance preferences, not some fundamental flaw in LiteLLM's approach.
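To illustrate the layering (a toy sketch, not our actual internals): the official SDK owns transport, auth, and retries, while the router only maps request/response shapes:

```python
# Illustrative sketch of the layering described above, not any-llm's
# actual internals. The provider SDK handles HTTP/auth/retries; the
# router layer only normalizes the request/response shape.
from dataclasses import dataclass

@dataclass
class NormalizedResponse:
    text: str
    model: str

def complete(model: str, messages: list[dict]) -> NormalizedResponse:
    provider, _, model_name = model.partition("/")
    if provider == "openai":
        from openai import OpenAI  # official SDK: transport, auth, retries
        resp = OpenAI().chat.completions.create(model=model_name, messages=messages)
        return NormalizedResponse(resp.choices[0].message.content, model)
    if provider == "anthropic":
        import anthropic  # official SDK again; we only map the shapes
        resp = anthropic.Anthropic().messages.create(
            model=model_name, max_tokens=1024, messages=messages
        )
        return NormalizedResponse(resp.content[0].text, model)
    raise ValueError(f"unknown provider: {provider}")
```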

delijati · 5 months ago
there is nothing lite in litellm ... i was experimenting (using it as a lib) but ended up using https://llm.datasette.io/en/stable/index.html btw. thanks @simonw for llm
scosman · 5 months ago
Yeah, official SDKs are sometimes a problem too. Together's SDK included Apache Arrow, a ~60MB dependency, for a single feature (I patched it to make it optional). If they ever lock dependency versions it could conflict with your project.

I'd rather have a library that just uses OpenAPI/REST than one that pulls in a ton of dependencies.

chuckhend · 5 months ago
LiteLLM is quite battle tested at this point as well.

> it reimplements provider interfaces rather than leveraging official SDKs, which can lead to compatibility issues and unexpected behavior modifications

Leveraging official SDKs also does not solve compatibility issues. any_llm would still need to maintain compatibility with those official SDKs. I don't think one way is clearly better than the other here.

AMeckes · 5 months ago
That's true. We traded API compatibility work for SDK compatibility work. Our bet is that providers are better at maintaining their own SDKs than we are at reimplementing their APIs. SDKs break less often and more predictably than APIs, plus we get provider-implemented features (retries, auth refresh, etc) "for free." Not zero maintenance, but definitely less. We use this in production at Mozilla.ai, so it'll stay actively maintained.
amanda99 · 5 months ago
Being battle tested is the only good thing I can say about LiteLLM.
Szpadel · 5 months ago
I use litellm as my personal AI gateway, and from a user's point of view there is no difference whether the proxy uses the official SDK or not; that might be a benefit for the proxy's developers.

but I can give you one example: litellm recently had an issue with handling deepseek reasoning. they broke the implementation, and for a while reasoning was missing from both sync and streaming responses.

amanda99 · 5 months ago
I'm excited to see this. Have been using LiteLLM but it's honestly a huge mess once you peek under the hood, and it's being developed very iteratively and not very carefully. For example, for several months recently (haven't checked in ~a month though), their Ollama structured outputs were completely botched and just straight up broken. Docs are a hot mess, etc.
piker · 5 months ago
This looks awesome.

Why Python? Probably because most of the SDKs are Python, but something that could be ported across languages without requiring an interpreter would have been really amazing.

Shark1n4Suit · 5 months ago
That's the key question. It feels like many of these tools are trying to solve a systems-level problem (cross-language model execution) at the application layer (with a Python library).

A truly universal solution would likely need to exist at a lower level of abstraction, completely decoupling the application's language from the model's runtime. It's a much harder problem to solve there, but it would be a huge step forward.

pzo · 5 months ago
for js/ts you have vercel aisdk [0], for c++ you have [1], for flutter/reactnative/kotlin there is [2]

[0] https://github.com/vercel/ai

[1] https://github.com/ClickHouse/ai-sdk-cpp

[2] https://github.com/cactus-compute/cactus

retrovrv · 5 months ago
we essentially built the gateway as a service rather than an SDK: https://github.com/portkey-AI/gateway
pglevy · 5 months ago
How does this differ from this project? https://github.com/simonw/llm
peskypotato · 5 months ago
From my understanding of Simon's project it only supports OpenAI and OpenAI-compatible models in addition to local model support. For example, if I wanted to use a model on Amazon Bedrock I'd have to first deploy (and manage) a gateway/proxy layer[1] to make it OpenAI-compatible.

Mozilla's project already boasts a lot of provider interfaces, much like LiteLLM, which has the benefit of directly supporting a wider range of models.

> No Proxy or Gateway server required so you don't need to deal with setting up any other service to talk to whichever LLM provider you need.

As for how it compares to LiteLLM, I don't have enough experience with either to tell.

[1] https://github.com/aws-samples/bedrock-access-gateway

nexarithm · 5 months ago
I have also been working on a very similar open source project: a Python LLM abstraction layer. I needed one for my research job, and that inspired me to create one for more generic usage.

Github: https://github.com/proxai/proxai

Website: https://proxai.co/

sparacha · 5 months ago
There is liteLLM, OpenRouter, Arch (although that’s an edge/service proxy for agents) and now this. We all need a new problem to solve
CuriouslyC · 5 months ago
LiteLLM is kind of a mess TBH, I guess it's ok if you just want a docker container to proxy to for personal projects, but actually using it in production isn't great.
tom_usher · 5 months ago
I definitely appreciate all the work that has gone in to LiteLLM but it doesn't take much browsing through the 7000+ line `utils.py` to see where using it could become problematic (https://github.com/BerriAI/litellm/blob/main/litellm/utils.p...)
dlojudice · 5 months ago
> but actually using it in production isn't great.

I only use it in development. Could you elaborate on why you don't recommend using it in production?

honorable_coder · 5 months ago
the people behind envoy proxy built https://github.com/katanemo/archgw - it has the learnings of Envoy but is natively designed to process/route prompts to agents and LLMs. Would be curious about your thoughts
wongarsu · 5 months ago
And all of them despite 80% of model providers offering an OpenAI compatible endpoint
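For those providers, pointing the official OpenAI SDK at a different base URL is all it takes (sketch; the URL and model name below are placeholders):

```python
# Using the official OpenAI SDK against an OpenAI-compatible endpoint;
# the base_url and model name are illustrative placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example-provider.com/v1",  # provider's compatible endpoint
    api_key="PROVIDER_API_KEY",
)
resp = client.chat.completions.create(
    model="provider-model-name",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```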
troyvit · 5 months ago
I think Mozilla of all people would understand why standardizing on one private organization's way of doing things might not be best for the overall ecosystem. Building a tool that meets LLM providers where they are instead of relying on them to homogenize on OpenAI's choices seems like a great reason for this project.
swyx · 5 months ago
portkey as well which is both js and open source https://www.latent.space/p/gateway
pzo · 5 months ago
why provide the link if there isn't a single mention of portkey there?
ieuanking · 5 months ago
we are trying to apply model-routing to academic work and pdf chat with ubik.studio -- def lmk what you think
omneity · 5 months ago
Crazy timing!

I shipped a similar abstraction for llms a bit over a week ago:

https://github.com/omarkamali/borgllm

pip install borgllm

I focused on making it Langchain compatible so you could drop it in as a replacement. And it offers virtual providers for automatic fallback when you reach rate limits and so on.
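The fallback idea reduces to something like this (a generic sketch, not borgllm's actual API or implementation):

```python
# Generic sketch of rate-limit fallback across "virtual providers";
# not borgllm's actual internals.
class RateLimitError(Exception):
    """Raised by a provider callable when its rate limit is hit."""

def complete_with_fallback(providers, prompt):
    # providers: ordered list of callables, one per concrete provider
    last_err = None
    for call in providers:
        try:
            return call(prompt)
        except RateLimitError as err:
            last_err = err  # fall through to the next provider in the chain
    raise last_err
```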

nodesocket · 5 months ago
This is awesome, will give it a try tonight.

I’ve been looking for something a bit different though, related to Ollama. I’d like a load balancing reverse proxy that supports queuing requests to multiple Ollama servers and sending requests only when an Ollama server is up and idle (not processing). Anything exist?
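Roughly the behavior in mind, as a hypothetical, untested sketch: a shared queue with one worker per Ollama server, so each server takes a request only when idle (server URLs and model name are placeholders):

```python
# Hypothetical sketch: one worker per Ollama server pulls from a shared
# queue, so each server handles at most one request at a time.
import asyncio
import httpx

SERVERS = ["http://ollama-1:11434", "http://ollama-2:11434"]  # assumed hosts

async def worker(base_url: str, queue: asyncio.Queue) -> None:
    async with httpx.AsyncClient(base_url=base_url, timeout=None) as client:
        while True:
            payload, future = await queue.get()
            try:
                # /api/generate is Ollama's standard generation endpoint
                resp = await client.post("/api/generate", json=payload)
                future.set_result(resp.json())
            except httpx.HTTPError as err:
                future.set_exception(err)  # server down or request failed
            finally:
                queue.task_done()

async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue()
    for url in SERVERS:
        asyncio.create_task(worker(url, queue))
    # Enqueue one request; whichever idle server's worker grabs it, runs it.
    fut = asyncio.get_running_loop().create_future()
    await queue.put(({"model": "llama3", "prompt": "hi", "stream": False}, fut))
    print(await fut)

asyncio.run(main())
```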