paladin314159 (u/paladin314159)

paladin314159 commented on I miss thinking hard jernesto.com/articles/thi... · Posted by u/jernestomg

topspin · 8 days ago

I'm using LLMs to code and I'm still thinking hard. I'm not doing it wrong; I think about design choices: risks, constraints, technical debt, alternatives, possibilities... I'm thinking as hard as I ever have.

paladin314159 · 7 days ago

I echo this sentiment. Even though I'm having Claude Code write 100% of the code for a personal project as an experiment, the need for thinking hard is very present.

In fact, since I don't need to do low-thinking tasks like writing boilerplate or repetitive tests, I find my thinking ratio is actually higher than when I write code normally.

paladin314159 commented on Terence Tao: At the Erdos problem website, AI assistance now becoming routine mathstodon.xyz/@tao/11559... · Posted by u/dwohnitmok

paulryanrogers · 3 months ago

Isn't there a risk that you're engaging with an inaccurate summarization? At some point inaccurate information is worse than no information.

Perhaps in low stakes situations it could at least guarantee some entertainment value. Though I worry that folks will get into high stakes situations without the tools to distinguish facts from smoothly worded slop.

paladin314159 · 3 months ago

I've been doing this a fair amount recently, and the way I manage it is: first, give the LLM the PDF and ask it to summarize + provide high-level reading points. Then read the paper with that context to verify details, and while doing so, ask the LLM follow-up questions (very helpful for topics I'm less familiar with). Typically, everything is either directly in the original paper or verifiable on the internet, so if something feels off then I'll dig into it. Through the course of ~20 papers, I've run into one or two erroneous statements made by the LLM.

To your point, it would be easy to accidentally accept things as true (especially the more subjective "why" things), but the hit rate is good enough that I'm still getting tons of value through this approach. With respect to mistakes, it's honestly not that different from learning something wrong from a friend or a teacher, which, frankly, happens all the time. So it pretty much comes down to the individual person's skepticism and desire for deep understanding, which usually will reveal such falsehoods.

paladin314159 commented on DeepMind and OpenAI win gold at ICPC codeforces.com/blog/entry... · Posted by u/notemap

amluto · 5 months ago

I've contemplated this a bit, and I think I have a bit of an unconventional take:

First, this is really impressive.

Second, with that out of the way, these models are not playing the same game as the human contestants, in at least two major regards. First, and quite obviously, they have massive amounts of compute power, which is kind of like giving a human team a week instead of five hours. Beyond that, the models that are competing have absolutely massive memorization capacity, whereas the teams are allowed to bring a 25-page PDF with them and they need to manually transcribe anything from that PDF that they actually want to use in a submission.

I think that, if you gave me the ability to search the pre-contest Internet and a week to prepare my submissions, I would be kind of embarrassed if I didn't get gold, and I'd find the contest to be rather less interesting than I would find the real thing.

paladin314159 · 5 months ago

> I think that, if you gave me the ability to search the pre-contest Internet and a week to prepare my submissions, I would be kind of embarrassed if I didn't get gold, and I'd find the contest to be rather less interesting than I would find the real thing.

I don't know what your personal experience with competitive programming is, so your statement may be true for yourself, but I can confidently state that this is not true for the VAST majority of programmers and software engineers.

Much like trying to do IMO problems without tons of training/practice, the mid-to-hard problems in the ICPC are completely unapproachable to the average computer science student (who already has a better chance than the average software engineer) in the course of a week.

In the same way that LLMs have memorized tons of stuff, the top competitors capable of achieving a gold medal at the ICPC know algorithms, data structures, and how to pattern match them to problems to an extreme degree.

paladin314159 commented on GPT-5: Key characteristics, pricing and system card simonwillison.net/2025/Au... · Posted by u/Philpax

makeramen · 6 months ago

But given the option, do you choose bigger models or more reasoning? Or a medium of both?

paladin314159 · 6 months ago

If you need world knowledge, then bigger models. If you need problem-solving, then more reasoning.

But the specific nuance of picking nano/mini/main and minimal/low/medium/high comes down to experimentation and what your cost/latency constraints are.