GPT-5 non-thinking is labeled 52.8% accuracy, but o3 is shown as a much shorter bar, yet it's labeled 69.1%. And 4o is an identical bar to o3, but it's labeled 30.8%...
Creativity is taken from us at exponential rate. And I don't buy argument from people who are saying they are excited to live in this age. I can get that if that technology stopped at current state and remained to be just tools for our creative endeavours, but it doesn't seem to be an endgame here. Instead it aims to be a complete replacement.
Granted, you can say "you still can play musical instruments/paint pictures/etc for yourself", but I don't think there was ever a period of time where creative works were just created for sake of itself rather for sharing it with others at masse.
So what is final state here for us? Return to menial not-yet-automated work? And when this would be eventually automated, what's left? Plug our brains to personalized autogenerated worlds that are tailored to trigger related neuronal circuitry for producing ever increasing dopamine levels and finally burn our brains out (which is arguably already happening with tiktok-style leasure)? And how you are supposed to pay for that, if all work is automated? How economics of that is supposed to work?
Looks like a pretty decent explanation of Fermi paradox. No-one would know how technology works, there are no easily available resources left to make use of simpler tech and planet is littered to the point of no return.
How to even find the value in living given all of that?
One problem that would probably still linger is model agreeableness, i.e. despite preparation, models have a tendency to say yes to whatever you ask for, and everybody knows a good DM needs to know when to say no.
Loading parent story...
Loading comment...
[1] https://transformer-circuits.pub/2024/scaling-monosemanticit...
I'm not sure how these new models compare to the biggest and baddest models, but if price, speed, and reliability are a concern for your use cases I cannot recommend Mistral enough.
Very excited to try out these new models! To be fair, mistral-3-medium-0525 still occasionally produces gibberish ~0.1% of my use cases (vs gpt-5's 15% failure rate). Will report back if that goes up or down with these new models