Problem is, LLMs pull answers out of their behind, just like a lazy student on an exam. "Hallucinations" is the word people use to describe this.
Those are extremely hard to spot unless you happen to know the right answer already, at which point, why ask? And they are everywhere.
One example: recently there was quite a discussion about LLMs being able to understand (and answer) base16 (aka "hex") encoded prompts on the fly, so I went on to try base64, gzipped base64, zstd-compressed base64, and so on.
To my surprise, the LLM got most of those encodings/compressions right, decoded/decompressed the question, and answered it flawlessly.
But with a few encodings, the LLM detected base64 correctly, identified the compression algorithm correctly, and then... instead of decompressing, made up a completely different payload and proceeded to answer that, without any hint that anything sinister was going on.
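
For anyone wanting to poke at this themselves, here is a minimal sketch of how such payloads can be produced (assuming the third-party zstandard package; the question string is just a placeholder, not my actual prompt):

    import base64, gzip
    import zstandard  # third-party: pip install zstandard

    question = "placeholder question to encode"  # hypothetical prompt

    # base64 only
    b64 = base64.b64encode(question.encode()).decode()

    # gzip, then base64
    gz_b64 = base64.b64encode(gzip.compress(question.encode())).decode()

    # zstd, then base64
    zstd_b64 = base64.b64encode(
        zstandard.ZstdCompressor().compress(question.encode())
    ).decode()

    print(b64)
    print(gz_b64)
    print(zstd_b64)

You then paste the resulting string into the chat and ask the model to decode/decompress it and answer the question inside.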
We really need LLMs to reliably calculate and express confidence. Otherwise they will remain mere toys.