For example, muons produced in the upper atmosphere should decay before they reach the ground, but they don't, because of time dilation. We see the muons' clocks running slow, but the muons don't see their own clocks slowed, so you might think that in our frame the muons make it to the ground while in their own frame they would decay too soon. However, in the muon's frame the atmosphere is length-contracted, so from its viewpoint it makes it to the ground as well.
So cause and effect are preserved, even though we and the muon would disagree about which relativistic effect is responsible.
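A rough back-of-the-envelope sketch of this (with illustrative numbers only: ~10 km production altitude, v ≈ 0.995c, proper lifetime 2.2 µs, none of which come from the comment above) showing that both frames predict the same elapsed proper time, and therefore the same survival fraction:

```python
import math

c   = 3.0e8      # speed of light, m/s
tau = 2.2e-6     # muon proper (rest-frame) lifetime, s
v   = 0.995 * c  # illustrative muon speed (assumption)
L   = 10e3       # illustrative production altitude, m (assumption)

gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)   # ~10 for these numbers

# Ground frame: the trip takes L / v of our time, but the muon's
# internal clock is dilated and only advances by (L / v) / gamma.
proper_time_ground_view = (L / v) / gamma

# Muon frame: the atmosphere is length-contracted to L / gamma, so the
# trip takes only (L / gamma) / v of the muon's own time.
proper_time_muon_view = (L / gamma) / v

# Same elapsed proper time either way, hence the same survival fraction.
survival = math.exp(-proper_time_ground_view / tau)

print(f"gamma             = {gamma:.1f}")
print(f"proper time (us)  = {proper_time_ground_view * 1e6:.2f} "
      f"vs {proper_time_muon_view * 1e6:.2f}")
print(f"survival fraction ~ {survival:.2f}")
```

Either way the muon's own clock advances by the same few microseconds, so both observers agree on how many muons survive; they just attribute it to different effects.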
That doesn’t mean we won’t end up approximating one eventually, but it’s going to take a lot of real human thinking first. For example, ChatGPT writes code to solve some questions rather than reasoning about them in text; the LLM is not doing the heavy lifting in that case.
Give it (some) 3D questions, or anything for which there aren’t massive textual datasets, and you often need to break out to specialised code.
Another thought I find useful is that it considers its job done when it has produced enough reasonable-looking tokens, not when it has actually solved a problem. You and I would continue to ponder the edge cases; it’s happy as long as there are 1000 tokens that look approximately like its dataset. Agents make that a bit smarter, but they’re still limited by the same goal: each is satisfied once it has produced its token quota, missing, e.g., implications that we’d see instantly. Obviously we’re smart enough to keep filling those gaps.
I've been doing this as well; mentally, I think of LLMs as the librarians of the internet.