That’s especially encouraging to me because those are all about generalization.
5 and 5.1 both felt overfit and would break down and get stubborn when you took them outside their lane, as opposed to Opus 4.5, which is lovely at self-correcting.
It’s one of those things you really feel in the model: not whether it can tackle a harder problem, but whether I can go back and forth with this thing, learning and correcting together.
This whole release makes me insanely optimistic. If they can push this much improvement WITHOUT the new huge data centers and without a new scaled base model, that’s incredibly encouraging for what comes next.
Remember, the next big data centers have 20-30x the chip count, with 6-8x the efficiency on the new chip.
I expect they can saturate the benchmarks WITHOUT any novel research or algorithmic gains. But at this point it’s clear they’re capable of pushing research qualitatively as well.
We need a well-managed set of immigration policies or countries WILL take advantage of US. These are our military rivals, and we sell our most advanced math, physics and engineering seats to the highest bidder. It’s a self-destructing disaster, and it’s not just on us to treat people better.
Look at the rate of Indian asylum seekers in Canada to see the most extreme case. It happens anywhere you extend naivety and boundless good will.