Readit News
eevmanu commented on What fact do you wish everyone understood?    · Posted by u/iambateman
eevmanu · a month ago
understand digital advertising incentives and modern marketing tactics and techniques

https://news.ycombinator.com/item?id=40675527

https://news.ycombinator.com/item?id=43187603

maybe this could help people be more thoughtful about how they invest their attention

eevmanu commented on Ask HN: Claude Code–style agent, but Aider-like and model-agnostic?    · Posted by u/dsrtslnd23
dkhenry · a month ago
Using this project you can hook Claude Code up to different models

https://github.com/musistudio/claude-code-router

eevmanu · a month ago
claude code router is a great alternative

another one could be https://github.com/opencode-ai/opencode

eevmanu commented on Coding with LLMs in the summer of 2025 – an update   antirez.com/news/154... · Posted by u/antirez
dakiol · a month ago
I see a big difference: I do use JetBrains IDEs (they are nice), but I can switch to vim (or VS Code) at any time if I need to (e.g., if JetBrains increases prices to a point that doesn't make sense, or introduces a pervasive feature that cannot be disabled). The problem with paid LLMs is that one cannot easily switch to open-source ones (because they are not as good as the paid ones). So it's a dependency that cannot be avoided, and that's imho something that shouldn't be overlooked.
eevmanu · a month ago
Open-weight and open-source LLMs are improving as well. While there will likely always be a gap between closed, proprietary models and open models, at the current pace the capabilities of open models could match today’s closed models within months.
eevmanu commented on Building Rq: A Fast Parallel File Search Tool for Windows in Modern C   github.com/seeyebe/rq... · Posted by u/seeyebe
eevmanu · a month ago
This tool looks great. Congrats on building it. I would like to know if there is any internal file that could be used to create a benchmark, in order to compare how fast it returns results against Everything (by voidtools), which is the most common app for Windows users who want faster file search.
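For what it's worth, a rough timing harness for such a comparison could look like this. The commented-out invocations are hypothetical placeholders, not rq's or Everything's actual CLI syntax:

```python
import statistics
import subprocess
import time

def time_command(cmd, runs=5):
    """Run a search command several times and return the median wall-clock time."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, capture_output=True)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# Hypothetical invocations -- substitute each tool's real CLI syntax:
# rq_time = time_command(["rq", "report.pdf"])
# es_time = time_command(["es.exe", "report.pdf"])  # Everything's CLI companion
```

Median over several runs smooths out filesystem-cache effects, which dominate a cold first search on Windows.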
eevmanu commented on Ask HN: What makes you keep coming back to Hacker News?    · Posted by u/FerkiHN
eevmanu · 2 months ago
I get nerd sniped every day.
eevmanu commented on Show HN: Spegel, a Terminal Browser That Uses LLMs to Rewrite Webpages   simedw.com/2025/06/23/int... · Posted by u/simedw
eevmanu · 2 months ago
great POC

looks very similar to a Chrome extension I use for a similar goal: Reader View - https://chromewebstore.google.com/detail/ecabifbgmdmgdllomnf...

eevmanu commented on Jepsen: TigerBeetle 0.16.11   jepsen.io/analyses/tigerb... · Posted by u/aphyr
aphyr · 3 months ago
Yeah, TigerBeetle's blog post goes into more detail here, but in short, the tests that were running in Antithesis (which were remarkably thorough) didn't happen to generate the precise combination of intersecting queries and out-of-order values that were necessary to find the index bug, whereas the Jepsen generator did hit that combination.

There are almost certainly blind spots in the Jepsen test generators too--that's part of why designing different generators is so helpful!
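As a toy illustration of the point about generator blind spots (entirely my own sketch, not Jepsen's or Antithesis's actual code): two random operation generators can draw from the same operations yet have wildly different odds of producing a particular interleaving, such as an out-of-order write later covered by a range query.

```python
import random

def generator_a(rng, n):
    """Writes values in increasing order, with occasional range queries."""
    ops = []
    for i in range(n):
        ops.append(("write", i))
        if rng.random() < 0.2:
            ops.append(("query", (0, i)))
    return ops

def generator_b(rng, n):
    """Writes values out of order, interleaved with overlapping range queries."""
    values = list(range(n))
    rng.shuffle(values)
    ops = []
    for v in values:
        ops.append(("write", v))
        if rng.random() < 0.5:
            lo = rng.randrange(n)
            ops.append(("query", (lo, rng.randrange(lo, n))))
    return ops

def hits_suspect_interleaving(ops):
    """True if some write arrives below the max value seen so far (out of
    order) and a later range query covers it -- the kind of interleaving a
    buggy index might mishandle."""
    max_seen = None
    for i, (op, arg) in enumerate(ops):
        if op == "write":
            if max_seen is not None and arg < max_seen:
                for op2, arg2 in ops[i + 1:]:
                    if op2 == "query" and arg2[0] <= arg <= arg2[1]:
                        return True
            max_seen = arg if max_seen is None else max(max_seen, arg)
    return False

rng = random.Random(0)
a_hits = sum(hits_suspect_interleaving(generator_a(rng, 50)) for _ in range(100))
b_hits = sum(hits_suspect_interleaving(generator_b(rng, 50)) for _ in range(100))
print(a_hits, b_hits)  # generator_a, writing monotonically, never produces it
```

Both generators are "thorough" in the sense of covering every operation, but only one ever reaches the suspect interleaving, which is why running several differently biased generators pays off.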

eevmanu · 3 months ago
Thanks for your answer, aphyr, and for this amazing analysis.
eevmanu commented on Jepsen: TigerBeetle 0.16.11   jepsen.io/analyses/tigerb... · Posted by u/aphyr
eevmanu · 3 months ago
I have a question that I hope is not misinterpreted, as I'm asking purely out of a desire to learn. I am new to distributed systems and fascinated by deterministic simulation testing.

After reading the Jepsen report on TigerBeetle, the related blog post, and briefly reviewing the Antithesis integration code in the GitHub workflow, I'm trying to better understand the testing scope.

My core question is: could these bugs detected by the Jepsen test suite have also been found by the Antithesis integration?

This question comes from a few assumptions I made, which may be incorrect:

- I thought TigerBeetle was already comprehensively tested by its internal test suite and the Antithesis product.

- I had the impression that the Antithesis test suite was more robust than Jepsen's, so I was surprised that Jepsen found an issue that Antithesis apparently did not.

I'm wondering if my understanding is flawed. For instance:

1. Was the Antithesis test suite not fully capable of detecting this specific class of bug?

2. Was this particular part of the system not yet covered by the Antithesis tests?

3. Am I fundamentally comparing apples and oranges, misunderstanding the different strengths and goals of the Jepsen and Antithesis testing suites?

I would greatly appreciate any insights that could help me understand this better. I want to be clear that my goal is to educate myself on these topics, not to make incorrect assumptions or assign responsibility.

eevmanu commented on O4-mini vs. Claude 3.7 vs. Gemini 2.5 Pro on code generation   wandb.ai/byyoung3/Generat... · Posted by u/byyoung3
eevmanu · 4 months ago
great article!

is it possible to mention in the article the per-model cost of running that benchmark?

eevmanu commented on Ask HN: How to unit test AI responses?    · Posted by u/bikamonki
senordevnyc · 5 months ago
You need evals. I found this post extremely helpful in building out a set of evals for my AI product: https://hamel.dev/blog/posts/evals/
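The core idea is turning "does the model answer well?" into programmatic checks that can run in CI. A minimal sketch of that shape (my own illustration with a stand-in model; the post's actual tooling and names differ):

```python
import json

def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM API call, so the harness runs offline."""
    if "JSON" in prompt:
        return '{"city": "Lima", "country": "Peru"}'
    return "Lima is the capital of Peru."

# Each eval case pairs a prompt with programmatic checks on the response,
# so regressions show up as failed checks instead of vibes.
EVALS = [
    {
        "name": "capital_fact",
        "prompt": "What is the capital of Peru?",
        "checks": [lambda r: "Lima" in r],
    },
    {
        "name": "json_output",
        "prompt": "Answer in JSON with keys city and country: capital of Peru?",
        "checks": [lambda r: json.loads(r)["city"] == "Lima"],
    },
]

def run_evals(model, evals):
    """Return a pass/fail map: eval name -> all checks passed."""
    return {
        case["name"]: all(check(model(case["prompt"])) for check in case["checks"])
        for case in evals
    }

results = run_evals(fake_model, EVALS)
print(results)
```

Swap `fake_model` for a real API call and the same harness doubles as a unit-test suite for AI responses.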
