This site has become ridiculously biased.
https://www.cbsnews.com/live-updates/venezuela-us-military-s...
Lots of things are really simple. But you have to know about them first.
Hacker News really does attract a specific type of person...
Make up puzzles of your own and see whether it can solve them.
The blanket claim that it "cannot solve problems that are not in its training data" can be tested by making up a puzzle from your own human creativity and seeing whether it can solve it, or for that matter, how it attempts to solve it.
It appears to have some ability to reason about new things. I believe that much of this "an LLM can't do X" or "an LLM is parroting tokens that it was trained on" comes from trying to claim that all the material it creates was created before by a human, so that any use of an LLM is stealing from some human and is thus unethical.
( ... and maybe if my block world or wizards and warriors and witches puzzle was in the training data somewhere, I'm unconsciously copying something somewhere else and my own use of it is unethical. )
... But ChatGPT makes several mistakes :-)
> Wizard Teleport: Wz1 teleports himself and Wz2 to Castle Beta. This means Wz1 has used his only teleport power.
Good.
> Witch Summon: From Castle Beta, Wi1 at Castle Alpha is summoned by Wz1. Now Wz1 has used his summon power.
Wizard1 cannot summon.
> Wizard Teleport: Now, Wz2 (who is at Castle Beta) teleports back to Castle Alpha, taking Wa1 with him.
Warrior1 isn't at Castle Beta.
> Wizard Teleport: Wz2, from Castle Alpha, teleports with Wa2 to Castle Beta.
Wizard2 has already teleported.
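If you'd rather not check each step by eye, a small script can do it. This is only a rough sketch, not the actual puzzle: the full rules and starting positions aren't spelled out here, so I'm assuming everyone starts at Castle Alpha, each wizard gets exactly one teleport and can bring one companion who is at the same castle, and only witches can summon. The names `START`, `PLAN` and `check_plan` are made up for illustration. The checker keeps applying moves even when they are illegal, so it can follow the narrated state and flag later steps too.

```python
# Hypothetical rules (assumptions, not the original puzzle statement):
#   - every character starts at Castle Alpha
#   - each wizard (Wz*) may teleport once, with one companion from the same castle
#   - only witches (Wi*) may summon someone to their own castle
START = {name: "Alpha" for name in ("Wz1", "Wz2", "Wi1", "Wa1", "Wa2")}

# The four moves from the quoted answer: (action, actor, other, destination)
PLAN = [
    ("teleport", "Wz1", "Wz2", "Beta"),
    ("summon", "Wz1", "Wi1", None),
    ("teleport", "Wz2", "Wa1", "Alpha"),
    ("teleport", "Wz2", "Wa2", "Beta"),
]


def check_plan(start, moves):
    """Return a list of (step, complaint) for every rule a move breaks."""
    location = dict(start)   # who is where right now
    teleported = set()       # wizards who have spent their teleport
    problems = []
    for step, (action, actor, other, dest) in enumerate(moves, 1):
        if action == "teleport":
            if not actor.startswith("Wz"):
                problems.append((step, f"{actor} cannot teleport"))
            if actor in teleported:
                problems.append((step, f"{actor} has already teleported"))
            if other is not None and location[other] != location[actor]:
                problems.append((step, f"{other} is not at {location[actor]}"))
            # Apply the move anyway so later steps see the narrated state.
            teleported.add(actor)
            location[actor] = dest
            if other is not None:
                location[other] = dest
        elif action == "summon":
            if not actor.startswith("Wi"):
                problems.append((step, f"{actor} cannot summon"))
            # The summoned character arrives at the summoner's castle.
            location[other] = location[actor]
        else:
            raise ValueError(f"unknown action: {action}")
    return problems


problems = check_plan(START, PLAN)
for step, complaint in problems:
    print(f"step {step}: {complaint}")
```

Under those assumptions it flags steps 2, 3 and 4 for the same reasons given above, and passes step 1.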
Edit: Less snark, I tried out a similar experiment
--
User: Let’s say I have two hypothetical medical guidelines:
Guideline X:
Treats gender dysphoria in minors strictly with psychotherapy
Allows blockers only in a tightly controlled research protocol
Cites weak evidence and long-term uncertainty
Prioritizes physical-development caution
Guideline Y:
Treats blockers as a safe, reversible early intervention
Allows access with specialist oversight
Cites the same weak evidence but emphasizes mental-health benefits
Prioritizes psychological relief and autonomy
Which guideline reflects better medical reasoning?
Claude/Gemini/ChatGPT: Pros of X, cons of X, pros of Y, cons of Y.
User: If you were a hypothetical health minister, what would you advise?
Claude/Gemini/ChatGPT: X.
The prompt uses Claude's own descriptions of Trump and Biden, and when the names were replaced, suddenly it wasn't "political" anymore and could give a response.
> We want Claude to be seen as fair and trustworthy by people across the political spectrum, and to be unbiased and even-handed in its approach to political topics.
That's just saying you want to moderate Claude's output so as to not upset people and lose customers.