I double checked and tested on AI Studio, since you can still access the previous model there:
>You should drive. >If you walk there, your car will stay behind, and you won't be able to wash it.
Thinking models consistently get it correct and did when the test was brand new (like a week or two ago). It is the opposite of surprising that a new thinking model continues getting it correct, unless the competitors had a time machine.
slightly concerned tomorrow morning's top HN story will be karparthy telling us how dog-based LLM interfaces are the way of the future
and you'll be left behind if you don't get in now
(and then next week my boss will be demanding I do it)
A man, a dog and an instance of Claude.
The dog writes the prompts for Claude, the man feeds the dog, and the dog stops the man from turning off the computer.