hommes-r (u/hommes-r)

hommes-r commented on Ask HN: Who is hiring? (December 2025) · Posted by u/whoishiring

hommes-r · 12 days ago

About the company:

• Technical founding team with relevant industry experience

• Backed by well-known European VCs (SpeedInvest + Galion.exe)

• Cloud native & freedom to shape our tech stack (TypeScript + Python)

Ideal candidate:

• Previous experience at a start-up building tech at scale

• Thinks in terms of product functionality and customer demands, not just features

• Familiar with API-first practices and frameworks

• Bonus points if you are an ex-founder or have been a first hire before

Moyai is an AI-powered agent monitoring tool for AI engineers looking to catch agent failures in production. Reach out to the founder directly: https://www.linkedin.com/in/rhommes/ or visit our website https://moyai.ai

*No agencies or recruiters, and we are unable to provide visa sponsorship

hommes-r commented on Agent design is still hard lucumr.pocoo.org/2025/11/... · Posted by u/the_mitsuhiko

ReDeiPirati · 22 days ago

> We find testing and evals to be the hardest problem here. This is not entirely surprising, but the agentic nature makes it even harder. Unlike prompts, you cannot just do the evals in some external system because there’s too much you need to feed into it. This means you want to do evals based on observability data or instrumenting your actual test runs. So far none of the solutions we have tried have convinced us that they found the right approach here.

I'm curious about the solutions the OP has tried so far here.

hommes-r · 21 days ago

"Because there’s too much you need to feed into it" - what does the author mean by this? If it is the amount of data, then I would say sampling needs to be implemented. If it is the amount of information required from the agent builder, I agree that an LLM-as-a-judge e2e eval setup is necessary.

In general, a more generic eval setup is needed, with minimal requirements from AI engineers, if we want to move on from vibes-based reliability engineering practices as a sector.
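To make the sampling point concrete, here is a rough sketch of evaluating only a random subset of production traces. Everything in it is an assumption for illustration: the `Trace` shape, the function names, and the `judge` callable, which stands in for whatever LLM-as-a-judge call you would actually make.

```python
import random
from typing import Callable, Dict, List, Sequence

# Hypothetical trace record, e.g. {"input": "...", "output": "..."}
Trace = Dict[str, str]


def sample_traces(traces: Sequence[Trace], rate: float, seed: int = 0) -> List[Trace]:
    """Return a reproducible random subset of traces (at least one if any exist)."""
    if not 0.0 < rate <= 1.0:
        raise ValueError("rate must be in (0, 1]")
    if not traces:
        return []
    k = max(1, round(len(traces) * rate))
    # A seeded Random instance keeps the sample stable across eval runs.
    return random.Random(seed).sample(list(traces), k)


def pass_rate(traces: Sequence[Trace], judge: Callable[[Trace], bool]) -> float:
    """Fraction of traces the judge accepts; the judge stands in for an LLM call."""
    if not traces:
        return 0.0
    return sum(1 for t in traces if judge(t)) / len(traces)
```

Running `pass_rate(sample_traces(all_traces, 0.1), judge)` would then score roughly 10% of the traffic instead of all of it, which keeps judge costs bounded at the price of a noisier estimate.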

hommes-r commented on LangChain Cost Optimization with Model Cascading github.com/lemony-ai/casc... · Posted by u/saschabuehrle

saschabuehrle · 22 days ago

The Hidden ROI Problem with LangChain Agents

After analyzing hundreds of production agent workflows, we discovered something: 40-70% of agent tool calls and text prompts don't need expensive flagship models. Yet most implementations route everything through their selected flagship model.

Here's what that looks like in practice:

A customer support agent handling 1,000 queries/day:

- Current cost: ~$225/month

- Actual need: 60% could use smaller or domain specific models (faster, cheaper)

- Wasted spend: $135/month per agent

A data analysis agent making 5,000 tool calls/day:

- Current cost: ~$1,125/month

- Actual need: 70% are simple operations

- Wasted spend: $787/month

Multiply this across multiple agents, and you're looking at hundreds of dollars in unnecessary costs per month.
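The arithmetic behind the two examples above can be checked with a few lines. This is only a sketch: the function and the `cheap_cost_ratio` parameter are hypothetical and not part of the post. The quoted figures correspond to `cheap_cost_ratio=0.0`, i.e. they treat the rerouted calls as costing nothing on the cheaper model, so they read as an upper bound on the savings.

```python
def wasted_spend(monthly_cost: float, reroutable_pct: int, cheap_cost_ratio: float = 0.0) -> float:
    """Monthly spend that could be avoided by rerouting simple calls.

    reroutable_pct: share of calls (in percent) that do not need the flagship model.
    cheap_cost_ratio: hypothetical price of the cheaper model relative to the
    flagship (0.0 means "free", which is what the quoted figures imply).
    """
    reroutable = monthly_cost * reroutable_pct / 100
    return reroutable * (1.0 - cheap_cost_ratio)


support = wasted_spend(225, 60)    # 135.0, matching the $135/month above
analysis = wasted_spend(1125, 70)  # 787.5, quoted above as $787/month
```

With a non-zero ratio the number shrinks accordingly, for example `wasted_spend(225, 60, 0.5)` gives 67.5.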

The root cause? Agent frameworks don't differentiate between "check database status" and "analyze complex business logic" - they treat every call the same.

The Solution: Intelligent Model Cascading

We built CascadeFlow's LangChain integration as a drop-in replacement that:

1. Tries fast, cheap models first

2. Validates response quality automatically

3. Escalates to flagship models only when needed

4. Tracks costs per query in real-time
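The four steps above can be sketched generically as follows. This is not CascadeFlow's API or implementation: `Tier`, `cascade`, the stub models, the prices, and the `non_empty` validator are all made up for illustration, and a real quality check would be far more involved than "the answer is not empty".

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Tier:
    name: str
    cost_per_call: float            # hypothetical flat price in dollars
    generate: Callable[[str], str]  # stand-in for a real model call


def cascade(
    prompt: str,
    tiers: List[Tier],
    is_good_enough: Callable[[str, str], bool],
) -> Tuple[str, str, float]:
    """Try tiers from cheapest to most capable; return (answer, tier name, total cost)."""
    if not tiers:
        raise ValueError("at least one tier is required")
    spent = 0.0
    for tier in tiers:
        answer = tier.generate(prompt)
        spent += tier.cost_per_call  # every attempt is paid for, including failed ones
        if is_good_enough(prompt, answer):
            return answer, tier.name, spent
    # No tier passed validation: fall back to the last (most capable) answer.
    return answer, tier.name, spent


# Stub models for illustration only.
def small_model(prompt: str) -> str:
    return "status: ok" if "status" in prompt else ""


def flagship_model(prompt: str) -> str:
    return "detailed answer to: " + prompt


def non_empty(prompt: str, answer: str) -> bool:
    return bool(answer.strip())


TIERS = [Tier("small", 0.001, small_model), Tier("flagship", 0.03, flagship_model)]
```

Calling `cascade("check database status", TIERS, non_empty)` stops at the small tier, while `cascade("analyze complex business logic", TIERS, non_empty)` escalates and pays for both attempts. That second case is worth keeping in mind: an escalated query costs more than going to the flagship model directly, so the savings depend on how often the cheap tier passes validation.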

The integration is dead simple - it works exactly like any LangChain chat model. No architecture changes. Just swap your chat model for CascadeFlow.

What you get:

- Full LCEL chain support

- Streaming and tool calling

- LangSmith tracing out of the box

- 40-85% cost reduction

- 2-10x faster responses for simple queries

- Zero quality loss

Real production results from teams already using it.

Open source, MIT licensed. Takes 5 minutes to integrate.

hommes-r · 21 days ago

Just another example of trying to scale your way out of a problem with money. What you don't understand is hard to optimize. I like how you have solved this by acting as a smart router in between that first understands what to optimize and then actually implements that optimization.

hommes-r commented on Ask HN: Who is hiring? (November 2025) · Posted by u/whoishiring

hommes-r · a month ago

About the company:

• Technical founding team with relevant industry experience (part of Nvidia Inception Program)

• Backed by well-known European VCs (SpeedInvest + Galion.exe)

• Building infrastructure-less agentic evals and agent-as-a-judge monitoring

Ideal candidate:

• Demonstrated shipping ability with past projects & roles

• "Young and hungry" mindset, prioritising ability to learn with agency over experience

• Familiar with fine-tuning algorithms and frameworks, transformers/trl, ART, verl and unsloth

• Bonus points for experience in contributing to open-source projects, startups, AI agents, & similar technologies

Reach out to the founder directly: https://www.linkedin.com/in/rhommes/ or visit our website https://moyai.ai

hommes-r commented on The case for the return of fine-tuning welovesota.com/article/th... · Posted by u/nanark

deepsquirrelnet · 2 months ago

I go back and forth on this. A year ago, I was optimistic and I have had 1 case where RL fine tuning a model made sense. But while there are pockets of that, there is a clash with existing industry skills. I work with a lot of machine learning engineers and data scientists and here’s what I observe.

- many, if not most, MLEs that got started after LLMs do not generally know anything about machine learning. For lack of clearer industry titles, they are really AI developers or AI devops

- machine learning as a trade is moving toward the same fate as data engineering and analytics. Big companies only want people using platform tools. Some ai products, even in cloud platforms like azure, don’t even give you the evaluation metrics that would be required to properly build ml solutions. Few people seem to have an issue with it.

- fine tuning, especially RL, is packed with nuance and details… lots to monitor, a lot of training signals that need interpretation and data refinement. It’s a much bigger gap than training simpler ML models, which people are also not doing/learning very often.

- The limited number of good use cases means people are not learning those skills from more senior engineers.

- companies have gotten stingy with sme-time and labeling

What confidence do companies have in supporting these solutions in the future? How long will you be around and who will take up the mantle after you leave?

AutoML never really panned out, so I’m less confident that platforming RL will go any better. The unfortunate reality is that companies are almost always willing to pay more for inferior products because it scales. Industry “skills” are mostly experience with proprietary platform products. Sure they might list “pytorch” as a required skill, but 99% of the time, there is hardly anyone at the company that has spent any meaningful time with it. Worse, you can’t use it, because it would be too hard to support.

hommes-r · 2 months ago

My personal opinion is that true engineering, which revolves around turning complex theory into working practice, has fallen from grace. Why spend a lot of time trying to master the art of engineering if you can ride the wave of engineering services and get away with it?

In true hacker spirit, I don't think trying to train a model on a wonky GPU is something that needs an ROI for the individual engineer. It's something they do because they yearn to acquire knowledge.

hommes-r commented on Ask HN: Who is hiring? (October 2025) · Posted by u/whoishiring

hommes-r · 2 months ago

About the company:

• Technical founding team with relevant industry experience (part of Nvidia Inception Program)

• Backed by well-known European VCs (SpeedInvest + Galion.exe)

• Building agentic advanced analytics and fine-tuning analytical reasoning models

Ideal candidate:

• Demonstrated shipping ability with past projects & roles

• "Young and hungry" mindset, prioritising ability to learn with agency over experience

• Familiar with fine-tuning algorithms and frameworks, transformers/trl, ART, verl and unsloth

• Bonus points for experience in contributing to open-source projects, startups, AI agents, & similar technologies

Reach out to the founder directly: https://www.linkedin.com/in/rhommes/

hommes-r commented on Launch HN: Airweave (YC X25) – Let agents search any app github.com/airweave-ai/ai... · Posted by u/lennertjansen

hommes-r · 2 months ago

Awesome to see the Cursor Airweave example!

hommes-r commented on GitHub Community Discussions: Past year's top 2 requests are to disable Copilot github.com/orgs/community... · Posted by u/carodgers

hommes-r · 3 months ago

Sad to see the "change world GDP" mantra didn't trickle down to the people doing the actual plumbing.