So what use case does this test setup reflect? Is there a relevant commercial use case here?
For example, spending the time to label a few examples yourself instead of just blindly sending them out for labeling.
(Not always the case, but another thing to keep in mind besides total time saved and value of learning)
The overall rate of participation in the labor force is falling. I expect this trend to continue as AI makes the economy more and more dynamic and sets a higher and higher bar for participation.
Overall GDP is rising while the labor participation rate is falling, which points to more productivity with fewer people participating. At this point one of the main factors is clearly technological advancement, and within that, I believe if you were to survey CEOs and ask what technological change has allowed them to get more done with fewer people, the resounding consensus would be AI.
The question is what you will/should learn in your limited time alive. Society needs well-educated (I include things like "street smarts" and apprenticeship in "educated" here) people in many different subjects. Some subjects are important enough that everyone needs to learn them (reading, writing, arithmetic). Some subjects are nearly useless but fun (tinplate film photography) and so worth knowing.
Things like basic computer skills are rising to the level where the majority of people today need them. However, I'm not sure that scripting itself is quite at that level (though it is important enough that a significant minority should have it).
I’m talking about a general trend I see in the use of this term, not claiming that it’s always a bad thing to say “I’m not technical, so someone else should write the script.”
I agree with everything you said!
Both things are happening in the world: people using this terminology to throw work at others needlessly, and people doing good division of labor.
Since this is HN, some disclaimers:
- no, that’s not always what’s happening when “not technical” is thrown around
- no, it’s not always appropriate to use AI instead of asking an expert
It seems like, if they did in fact distill, then what we have found is that you can create a worse copy of the model for ~$5M in compute by training on its outputs.
EDIT: Here's a better treatment, and it is the case that they give the exact same orderings: https://ajayp.app/posts/2020/05/relationship-between-cosine-...
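A quick sketch of why the orderings agree, assuming we're comparing cosine similarity against Euclidean distance on unit-normalized vectors (the usual setting for embedding retrieval; variable names here are mine): for unit vectors, ||a - b||² = 2 - 2·cos(a, b), so distance is a monotone decreasing function of cosine similarity and ranking by one is the same as ranking by the other.

```python
import numpy as np

rng = np.random.default_rng(0)

# A query vector and a few "document" vectors, all unit-normalized.
query = rng.normal(size=8)
query /= np.linalg.norm(query)
docs = rng.normal(size=(5, 8))
docs /= np.linalg.norm(docs, axis=1, keepdims=True)

# Cosine similarity of unit vectors is just the dot product.
cos_sim = docs @ query
eucl = np.linalg.norm(docs - query, axis=1)

# Identity: squared distance = 2 - 2 * cosine similarity.
assert np.allclose(eucl**2, 2 - 2 * cos_sim)

# Ranking by highest similarity equals ranking by smallest distance.
order_by_cos = np.argsort(-cos_sim)
order_by_dist = np.argsort(eucl)
assert (order_by_cos == order_by_dist).all()
```

This only holds after normalization; for un-normalized vectors the two measures can rank candidates differently.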
The rumour/reasoning I’ve heard is that most advances are being made in synthetic-data experiments happening after post-training. It’s a lot easier and faster to iterate on these with smaller models.
Eventually a lot of these learnings/setups/synthetic-data-generation pipelines will be applied to larger models, but it’s very unwieldy to experiment with the best approach using the largest model you could possibly train. You just get way fewer experiments done per day.
The models the bigger labs are playing with seem to be converging on roughly the largest size at which a researcher can still run an experiment overnight.
We can catch things early; it shouldn’t be limited to smokers only.