No wall yet, and I think we might have already crossed the threshold of models being as good as or better than most engineers.
GDPval will be an interesting benchmark and I'll happily use the new model to test spreadsheet (and other office work) capabilities. If they can keep going like this just a little bit further, many office workers will stop being useful... I don't know yet how to feel about this.
Great for humanity probably, but for the individuals?
Yes, it's down from 40h/week to 3-5h/week on Max plan, effectively. A real bummer. See my comment here [1] regarding [2].
Speed of model just isn't the bottleneck for me.
Before it I used Opus 4.1, before that Opus 4.0, and before that Sonnet 4.0 - each of which was slightly better than the last. It's not like Sonnet 4.5 is some crazy step-function improvement (but the speed over Opus is definitely nice).
Hopefully they’ll be able to fix this data corruption, although a backup on the same host isn’t really a backup. The whole system can have issues
1) battery warning above tabs in browser with no x to close it
2) WebKit bugs that make inputs and visuals diverge, so you have to click under the input to hit it
3) flickering email app when it’s opened