The Python API example looks like it has been written by an LLM. You don't need to import json, you don't need to set the content type and it is good practice to use context managers ("with" statement) to release the connection in case of exceptions. Also, you don't gain anything by commenting variables with the name of the variable.
The following sample (probably) does the same thing and is almost half the length. I have not tested it because there is no signup (EDIT: I was mistaken, there actually is a "signup" behind the login link, which is Google or GitHub login, so the naming makes sense. I confused it with a previously more prominent waitlist link.)
import requests

# Your Hypermode Workspace API key
api_key = "<YOUR_HYP_WKS_KEY>"

# Use the Hypermode Model Router API endpoint
url = "https://models.hypermode.host/v1/chat/completions"
headers = {"Authorization": f"Bearer {api_key}"}

payload = {
    "model": "meta-llama/llama-4-scout-17b-16e-instruct",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is Dgraph?"},
    ],
    "max_tokens": 150,
    "temperature": 0.7,
}

# Make the API request
with requests.post(url, headers=headers, json=payload) as response:
    response.raise_for_status()
    print(response.json()["choices"][0]["message"]["content"])
To explain in the clearest terms: unlike the SS insignia, the lightning bolt in the logo has tapering at the bottom. The second element in the logo, the slash, does not have tapering at the bottom. The general shape of the logo is the same as the SS insignia: two diagonal elements side-by-side (which would be all good on its own). The mind tends to see repetition, so it has a tendency to "mix up" the two elements of the logo. The mind also has a tendency to remember similar things. Putting it all together, the logo has a chance to evoke the SS insignia.
I may just be reading too much Theweleit and W. Reich nowadays, but I think you'll catch some flak for this logo if it becomes recognizable outside the tech milieu.
Thanks for the feedback -- I can say, emphatically, that's not our intention in the least. We chose a lightning bolt to evoke speed, i.e., the "hyper" in Hypermode. I've asked design to take another look at the "H" logo.
What I'm seeing with Brokk (https://brokk.ai) is that models are not really interchangeable for code authoring. Even with frontier models like Gemini 2.5 Pro and Sonnet 3.7, Sonnet is significantly better about following instructions ("don't add redundant comments") while Gemini 2.5 Pro has more raw intelligence. So we're using litellm to create a unified API to consume, but the premise of "route your requests to whatever model is responding fastest" doesn't seem that attractive.
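For reference, the litellm side of that looks roughly like this (a minimal sketch; the model identifiers are from memory, so check litellm's docs before copying):

import os
from litellm import completion

# litellm picks up provider keys from the environment, e.g. ANTHROPIC_API_KEY / GEMINI_API_KEY.
messages = [{"role": "user", "content": "Refactor this function. Don't add redundant comments."}]

# Same call shape for both providers; only the model string changes.
sonnet = completion(model="anthropic/claude-3-7-sonnet-latest", messages=messages)
gemini = completion(model="gemini/gemini-2.5-pro", messages=messages)

print(sonnet.choices[0].message.content)
print(gemini.choices[0].message.content)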
But OpenRouter is ridiculously popular so it must be very useful for other use cases!
Agreed that swapping models for code-gen doesn't make sense. We're mostly indexed on GPT-4.1 for our AgentBuilder product. I haven't found moving between models for code to be super effective.
The most popular use case we've seen from folks is on the iteration/experimentation phase of building an agent/tool. We made ModelRouter originally as an internal service for our "prompt to agent" product, where folks are trying a few dozen models/MCPs/tools/data/etc really quickly as they try to find a local maximum for some automation or job.
I think the value here is being able to have a unified API to access hosted open source models and proprietary models. And then being able to switch between models without changing any code. Model optionality was one of the factors Hypermode called out in the 12 Factor Agentic App: https://hypermode.com/blog/the-twelve-factor-agentic-app
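A rough sketch of what "without changing any code" looks like in practice (the env-var names are made up for illustration; the endpoint and model id are the ones from the sample earlier in the thread):

import os
import requests

# Pick the model from configuration, so swapping models is a config change, not a code change.
model = os.environ.get("HYPERMODE_MODEL", "meta-llama/llama-4-scout-17b-16e-instruct")

with requests.post(
    "https://models.hypermode.host/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['HYP_WKS_KEY']}"},
    json={
        "model": model,
        "messages": [{"role": "user", "content": "What is Dgraph?"}],
    },
) as response:
    response.raise_for_status()
    print(response.json()["choices"][0]["message"]["content"])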
Also, being able to use models from multiple services and open source models without signing up for another service or bringing your own API key is a big accelerator for folks getting started with Hypermode agents.
Are there any of these tools which will use your evals to automatically recommend a model to use? Imagine if you didn't need to follow model releases anymore, and you just had a heuristic that would automatically select the right price/performance tradeoff. Maybe there's even a way to route queries differently to more expensive models depending on how tricky they are.
(This would be more for using models at scale in production as opposed to individual use for code authoring etc.)
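A naive version of what I mean, just to make the shape concrete (model names, the self-grading heuristic, and the escalation rule are all hypothetical; the endpoint is the one from the sample above):

import requests

url = "https://models.hypermode.host/v1/chat/completions"
headers = {"Authorization": "Bearer <YOUR_HYP_WKS_KEY>"}

CHEAP_MODEL = "meta-llama/llama-4-scout-17b-16e-instruct"  # placeholder
STRONG_MODEL = "anthropic/claude-3-7-sonnet"               # placeholder

def ask(model, messages):
    # Same OpenAI-compatible chat completion call as in the earlier sample.
    with requests.post(url, headers=headers, json={"model": model, "messages": messages}) as r:
        r.raise_for_status()
        return r.json()["choices"][0]["message"]["content"]

def route(question):
    # Let the cheap model grade the question first, and escalate only when it says "hard".
    grade = ask(CHEAP_MODEL, [
        {"role": "system", "content": "Reply with exactly one word: easy or hard."},
        {"role": "user", "content": question},
    ])
    model = STRONG_MODEL if "hard" in grade.lower() else CHEAP_MODEL
    return ask(model, [{"role": "user", "content": question}])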
We've been playing with that in the background. I can try to shoot you a preview in a few weeks. It works pretty well for reasoning tasks/NLP workloads but for workloads that need a "correct" answer, it's really tough to maintain accuracy when swapping models.
What we've seen work best is making recommendations in the agent creation process for a given tool/workload and then leaving them somewhat static after creation.
Regarding the "with" statement: the post function calls the request function, which uses its own context manager that will call the close function of the connection object.
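From memory, the relevant part of requests looks roughly like this (paraphrased, not an exact copy of the library source):

# requests/api.py (approximate): post(), get(), etc. all delegate to request(),
# which creates a throwaway Session and closes it when the with-block exits,
# so the underlying connections are released even if the request raises.
def request(method, url, **kwargs):
    with sessions.Session() as session:
        return session.request(method=method, url=url, **kwargs)

def post(url, data=None, json=None, **kwargs):
    return request("post", url, data=data, json=json, **kwargs)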
There's a waitlist for our prompt to agent product in the banner. That's a good call to update it to be clearer.
Feels a bit halting-problem-ish: can you tell if a problem is too hard for model A without being smarter than model A yourself?