Nobody has built a human so we don’t know how they work
We know exactly how LLM technology works
> We were often surprised by what we saw in the model
https://www.anthropic.com/research/tracing-thoughts-language...
That said, I do like having an LLM that I can treat like the crappy bosses on TV treat their employees. When it gets something totally wrong I can yell at it and it'll magically figure out the right solution, but still keep a chipper personality. That doesn't work with humans.
I bet you’ve taken a shortcut to save less than an hour, for example.
A saying I've heard: if the punishment for a crime is financial, it's only a deterrent for those who lack the means to pay. If a small business gets caught doing bad stuff, a $30k fine could mean shutting down; when Meta gets caught doing bad stuff, a billion-dollar fine is almost a rounding error in their operational expenses.
Maybe I just answered my own question.
You now have Apple Fitness+, Apple TV, News, Music, Arcade. None of them come close to the quality Apple used to be known for. It is really sad.
Oh, and the most ironic thing? Apple was the one that tried to kill internet ads between 2017 and 2020.
When such benchmarks aren’t available, what you often get instead is teams creating their own benchmark datasets and then testing both their model and existing models against them. Which is even worse, because they probably still test against the dataset multiple times (there’s simply no way to hold others accountable on this front), but on top of that they often hyperparameter-tune their own model for the dataset while reusing previously published hyperparameters for the other models. That gives them an unfair advantage, because those hyperparameters were tuned to a different dataset and may not even have been optimizing for the same task. The fair version is sketched below.
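To make the fair version concrete, here's a minimal sketch (hypothetical stand-in models and search grids, scikit-learn style, not anyone's actual benchmark): every model gets the same hyperparameter search on the same split of the new dataset, and the held-out test set is only touched once at the end.

    # Sketch: run the SAME hyperparameter search for every model on the
    # SAME dev split, then evaluate each exactly once on the test split.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import GridSearchCV, train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=500, random_state=0)  # toy data
    X_dev, X_test, y_dev, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)

    # Hypothetical "our model" vs. "baseline"; each gets its own grid,
    # but both grids are searched on the same dev data.
    candidates = {
        "ours":     (LogisticRegression(max_iter=1000), {"C": [0.1, 1, 10]}),
        "baseline": (RandomForestClassifier(),          {"n_estimators": [100, 300]}),
    }

    for name, (model, grid) in candidates.items():
        search = GridSearchCV(model, grid, cv=5).fit(X_dev, y_dev)
        print(name, search.best_params_, search.score(X_test, y_test))

Reusing someone else's published hyperparameters on your new dataset skips the search step above for their model only, which is exactly the asymmetry being complained about.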
But even an imperfect yardstick is better than no yardstick at all. You’ve just got to remember to maintain a healthy level of skepticism is all.
This doesn't make corruption OK. But he tore out a lifeline for some people without giving them an alternative way to get aid.
Everything you said holds equally true for chemical engineering and biomedical engineering, so you really need to get some experience.
That doesn't mean complex systems never behaved unexpectedly, but the engineering goal was explicit determinism wherever possible: predictable execution, bounded failure modes, reproducible debugging. That tradition carried through operating systems, compilers, finance software, avionics, etc.
What is newer is our comfort with probabilistic or emergent systems, especially in AI/ML. LLMs are mathematically deterministic (the forward pass computes a fixed distribution over next tokens), but in practice they behave probabilistically from a user's perspective, because outputs are sampled from that distribution, which makes them feel different from classical algorithms.
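A minimal sketch of that distinction (toy logits and numpy only, not any particular model's API): the same fixed scores look deterministic under greedy decoding and stochastic under temperature sampling.

    # The "model" here is just a fixed vector of next-token scores.
    import numpy as np

    logits = np.array([2.0, 1.0, 0.5, 0.1])  # toy next-token logits

    def decode(logits, temperature):
        if temperature == 0:                  # greedy: same token every time
            return int(np.argmax(logits))
        probs = np.exp(logits / temperature)  # softmax with temperature
        probs /= probs.sum()
        return int(np.random.choice(len(logits), p=probs))

    print([decode(logits, 0) for _ in range(5)])    # always [0, 0, 0, 0, 0]
    print([decode(logits, 1.0) for _ in range(5)])  # varies run to run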
So I'd frame it less as "determinism is new" and more as "we're now building more systems where strict determinism isn't always the primary goal."
Going back to the original point: getting educated on LLMs will help you demystify some of the non-determinism, but as I mentioned in a previous comment, even the people who literally built the LLMs get surprised by the behavior of their own software.