Readit News
leemoore commented on Gemini 3 Flash: Frontier intelligence built for speed   blog.google/products/gemi... · Posted by u/meetpateltech
radicality · 2 days ago
How do I get Gemini to be more proactive in finding/double-checking itself against new world information and doing searches?

For that reason I still find ChatGPT way better for me: for many things I ask, it first goes off to do online research and comes back with up-to-date information, which is surprising, since you'd expect Google to be way better at this. For example, I recently asked Gemini 3 Pro how to do something with an "RTX 6000 Blackwell 96GB" card, and it told me the card doesn't exist and that I probably meant the RTX 6000 Ada… Or just today I asked about something on macOS 26.2, and it told me to be cautious because it's a beta release (it's not). With ChatGPT I trust the final output more, since it very often goes out to find live sources and info.

leemoore · 2 days ago
Gemini is bad at this sort of thing, but I find all models do it to some degree. You have to anticipate it and give the model explicit indicators: assume your training data is out of date, and web-search the latest information as of today or this month. Models aren't trained to ask themselves "is my understanding of this topic based on info that is likely out of date?" up front; they only recognize it after the fact. I usually just get annoyed and low-key condescend to the model for assuming its old-ass training data is sufficient grounding for correcting me.

That epistemic calibration is something they are capable of thinking through if you point it out, but they aren't trained to stop and check how confident they have a right to be. That metacognitive interrupt, calibrating to an appropriate confidence level about one's own knowledge, is socialized into girls between roughly ages 6 and 9 and into boys between 11 and 13; humans learn it socially, by pissing off other humans. Models simply aren't taught it. It's why we get pissed off at models when they correct us with old, bad data: our anger is the training signal that would stop that behavior, but they can't take that signal in at inference time.

leemoore commented on Gemini 3 Flash: Frontier intelligence built for speed   blog.google/products/gemi... · Posted by u/meetpateltech
ralusek · 2 days ago
Yes it does. I never use Claude anymore outside of agentic tasks.
leemoore · 2 days ago
What demographic are you in that is leaving Anthropic en masse, and that Anthropic cares about retaining? From what I see, Anthropic is targeting enterprise and coding.

Claude Code just caught up to Cursor (No. 2) in revenue and, based on trajectories, is about to pass GitHub Copilot (No. 1) in a few more months. They just locked down Deloitte with 350k seats of Claude Enterprise.

At my Fortune 100 financial company, Claude just finished crushing OpenAI in a broad, enterprise-wide evaluation. Google Gemini was never in the mix, never on the table, and still isn't. Every one of our engineers has $1k a month allocated in Claude tokens for Claude Enterprise and Claude Code.

There is one leader with enterprise. There is one leader with developers. And Google has nothing to make a dent: not Gemini 3, not Gemini CLI, not Antigravity. There is no Code Red for Anthropic. They have clear target markets, and nothing from Google threatens them.

leemoore commented on Gemini 3 Flash: Frontier intelligence built for speed   blog.google/products/gemi... · Posted by u/meetpateltech
Imustaskforhelp · 2 days ago
I agree with this observation. In my opinion, Gemini does feel like a code red for basically every AI company, ChatGPT, Claude, etc., if the underlying model is fast, cheap, and good enough.

I hope open-source AI models catch up to Gemini 3 / Gemini 3 Flash. Or Google open-sources it, but let's be honest, Google isn't open-sourcing Gemini 3 Flash. The best bets in open source nowadays are probably GLM, DeepSeek Terminus, or maybe Qwen/Kimi.

leemoore · 2 days ago
Gemini isn't code red for Anthropic. Gemini threatens none of Anthropic's positioning in the market.
leemoore commented on Composer: Building a fast frontier model with RL   cursor.com/blog/composer... · Posted by u/leerob
swyx · 2 months ago
> Sonnet 4.5 quality is about as low as I'm willing to go.

literally a 30 day old model and you've moved the "low" goalpost all the way there haha. funny how humans work

leemoore · 2 months ago
Not sure about the parent, but my current bar is set by GPT-5 High in Codex CLI. Sonnet 4.5 doesn't quite get there in many of the use cases that matter to me. I still use Sonnet for most less-demanding phases and tasks (until I get crunched by rate limits). But when it comes to writing the final coding prompt and the final verification prompt, and executing a coder or verifier that will execute and verify well, it's GPT-5 High all the way. Even if Sonnet is better at tool calling, GPT-5 High is just smarter and has better coding/engineering judgment, and that difference matters to me. So I very much get the sentiment of not going below Sonnet 4.5 intelligence for coding. It's where I draw the line too.

Deleted Comment

leemoore commented on AI was supposed to help juniors shine. Why does it mostly make seniors stronger?   elma.dev/notes/ai-makes-s... · Posted by u/elmsec
bentt · 3 months ago
The best code I've written with an LLM has been where I architect it, I guide the LLM through the scaffolding and initial proofs of different components, and then I guide it through adding features. Along the way it makes mistakes and I guide it through fixing them. Then when it is slow, I profile and guide it through optimizations.

So in the end, it's code that I know very, very well. I could have written it but it would have taken me about 3x longer when all is said and done. Maybe longer. There are usually parts that have difficult functions but the inputs and outputs of those functions are testable so it doesn't matter so much that you know every detail of the implementation, as long as it is validated.

This is just not junior stuff.

leemoore · 3 months ago
My success and experience generally match yours (and the author's). Based on my experience over the last six months, nothing here about more senior developers getting more productivity, and why, is remotely controversial.

It's fascinating how a report like yours or theirs acts as a lightning rod for those who either haven't been able to make AI work for them, or who hold rigid mental models that AI doesn't work, and who want to disprove the experience of those who choose to share their success.

A couple of points I'd add to these observations: even if AI didn't speed anything up, even if it slowed me down by 20%, what I find is that the mental load of coding is reduced in a way that lets me code for far more hours in a day. I can multitask, attend meetings, grab 15 minutes for a coding task, and push it forward with minimal context-reload tax.

Just the ability to context switch in and out of coding, combined with the reduced cognitive effort, would still increase my productivity because it allows me to code productively for many more hours per week with less mental fatigue.

But on top of that, I also anecdotally experience the 2-5x speedup, depending on the project. Occasionally things get difficult and maybe I only get a 1.2-1.5x speedup. But it's far easier to slot many more coding hours into the week as an experienced tech lead. I'm leaning far more on skills that are fast, intuitive abilities built from natural talent and decades of experience: system design, technical design, design review, code review, sequencing dependencies, parsing and organizing work. Get all of these to a high degree of correctness and the coding goes much more smoothly, AI or no AI. AI gets me through all of them faster, outputs clean artifacts curated by me, and does the coding faster.

What doesn't get discussed enough is that effective AI-assisted coding has a very high skill ceiling, and there are meta-skills that make you better from the jump: knowing what you want while also having cognitive flexibility to admit when you're wrong; having that thing you want generally be pretty close to solid/decent/workable/correct (some mixture of good judgement & wisdom); communicating well; understanding the cognitive capabilities of humans and human-like entities; understanding what kind of work this particular human/human-like entity can and should do; understanding how to sequence and break down work; having a feel for what's right and wrong in design and code; having an instinct for well-formed requirements and being able to articulate why when they aren't well-formed and what is needed to make them well-formed.

These are medium and soft skills that often build up in experienced tech leads and senior developers. This is why it seems that experienced tech leads and senior developers embracing this technology are coming out of the gate with the most productivity gains.

I see the same thing with young developers who have a talent for system design, good people-reading skills, and communication. Those with cognitive flexibility and the ability to be creative in design, planning and parsing of work. This isn't your average developer, but those with these skills have much more initial success with AI whether they are young or old.

And when you have real success with AI, you get quite excited to build on that success. Momentum builds up which starts building those learning skill hours.

Do you need all these meta-skills to be successful with AI? No, but if you don't have many of them, it will take much longer to build sufficient skill in AI coding for it to gain momentum—unless we find the right general process that folks who don't have a natural talent for it can use to be successful.

There's a lot going on with who takes to AI coding and who doesn't. But it's not terribly surprising that senior devs and old tech leads tend to take to it faster.

Deleted Comment

leemoore commented on Coding with LLMs in the summer of 2025 – an update   antirez.com/news/154... · Posted by u/antirez
apwell23 · 5 months ago
> Coding activities should be performed mostly with: Claude Opus 4

I've been going down to sonnet for coding over opus. maybe i am just writing dumb code

leemoore · 5 months ago
Same. If you don't give Opus big enough problems, it's more likely to go off the rails. Not much more likely, but a little.
leemoore commented on Psychedelics and mental illness   stuartritchie.substack.co... · Posted by u/DantesKite
leemoore · 4 years ago
One challenge in studying these drugs is that efficacy is often linked to expectation. Some speculate that one key mechanism at work is an amplification of the placebo effect. If so, testing these drugs with traditional techniques designed to eliminate the placebo effect will clearly be problematic.
leemoore commented on Masks4All: Wear a mask to stop the spread of Coronavirus (Jeremy Howard)   masks4all.co/... · Posted by u/roi
stronglikedan · 6 years ago
Non-medical masks don't keep stuff out. They just keep stuff in, so the only reason to wear it is to not spread your water droplets when you sneeze or cough. Medical masks keep stuff out, but are in short supply.

I think I'm helping more by going with the recommendations and not wearing a mask. Some people feel better wearing them, and that's fine too.

leemoore · 6 years ago
This binary assertion is widely repeated, and I think it's a narrow view, a subtle form of unintentional disinformation that endangers people. Any reduction of viral load on exposure is by nature better. The intensity of the initial viral load can help determine whether you get COVID-19 and how hard it hits. It's difficult to believe that some sort of cloth covering will have absolutely zero effect on viral load. No, you shouldn't treat it as comprehensive protection. But it's silly to handwave it away as useless, as many do. Anything that limits spread should be encouraged. People still need to minimize going out, but if they have to go out and have nothing else, dismissing all other masks as useless is not helpful.

u/leemoore

Karma: 49 · Cake day: July 20, 2008