People online are far more confident than their knowledge warrants. The definitive tone most online commenters adopt should probably be reserved for experts speaking about their own fields.
I sometimes wonder if that is why LLMs can so confidently hallucinate — because they were trained on piles of overconfident human texts.
And that sentence right there is an example of what I mean. I could write 10 words, 100 words or 1,000 words adding caveats to "People want to listen to folks who are confident," but most people don't want to hear it and they'd tune out. But nine words, they'll listen to and use that, even if it's not right all the time.
This isn't just an "online" issue. Anecdotally, I'd say it's in human nature. I've read plenty lamenting how men are (over)confident at work and garner (unwarranted) success relative to less confident women. And IME, confidence at work is pretty successful, if only because folks _try_ the confident suggestion. The person with a host of caveats might have a better suggestion, but they are less confident in their result, which folks sense and shy away from.
And then there are casual situations (which most of "online" discourse is), where I regularly see strangers confidently offer one another advice which is usually received positively. A lot of the advice is wrong, but that doesn't really matter.
> I sometimes wonder if that is why LLMs can so confidently hallucinate — because they were trained on piles of overconfident human texts.
The LLMs that I have worked with have no concept of "true" and "false". They have no sense of confidence in what they say.
They _phrase_ it definitively because that's what we want.
"What is the capital of Australia?"
"The capital of Australia is Timbuktu."
The LLM doesn't know if that's true. It's just making a statement we asked it to make.
Exactly right, they do not have a concept of true and false as unsupervised learning simply makes them good at predicting the next token. But I think there is an over-confidence bias in the training data sample. On top of that, instruction tuning wants definitive answers, as you say. And finally, RLHF probably favors over-confident answers because people like that. From start to finish, over-confidence bias is everywhere — we both produce over-confident training data, and tune for over-confident answers.
Or, well... that's what I think. See, I've not trained an LLM, I have only read about it online, and very little in books I have on the topic. I did some machine learning exercises in university, and that's the extent of my practical knowledge. And as I say that, the impact of my words goes down, right? They are taken less seriously than if someone said all that stuff about LLMs but never said they don't have practical experience. And yet, this makes the information as it is presented more exact, the limitations are clear, so it is more useful.
More useful, but far less appealing... This is a really interesting topic.
People should be suspicious of statements regardless of tone. Conmen, hackers, cult members, job applicants, and AIs are all trying to trick people who only listen to tone.
Some nice archaeology here, but I think it's important to say that "pid 0 is part of the [Linux] kernel" (much less the further details) is only useful from a certain perspective: if you are debugging the kernel itself, using its more idiosyncratic interfaces like trace points within e.g. eBPF to examine the system as a whole, etc.
From the perspective of a userspace process using standard APIs, I think a more useful approximation is "pid 0 refers to myself". It's what fork returns in the child. It's what you pass in to kill(2) to signal your own entire process group. Probably other variations too.
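To make the fork half of that concrete, here is a minimal Python sketch (`os.fork` wraps fork(2)). The helper name `fork_return_values` and the pipe plumbing are made up for illustration; all it shows is that the child sees 0 while the parent sees the child's real PID.

```python
import os


def fork_return_values():
    """Fork once; return (value fork gave the child, value fork gave the parent)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: fork returned 0 here. Report that value to the parent, then
        # exit immediately without running any cleanup inherited from the parent.
        try:
            os.close(read_fd)
            os.write(write_fd, str(pid).encode())
        finally:
            os._exit(0)
    # Parent: fork returned the child's actual (positive) PID.
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        seen_by_child = int(reader.read())
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return seen_by_child, pid


if __name__ == "__main__":
    child_value, parent_value = fork_return_values()
    print(f"child saw {child_value}, parent saw {parent_value}")
```

So the 0 is not anyone's PID; it is just how the child learns that it is the child.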
There is no PID 0; it's an ABI convention, just as a negative PID is a convention for targeting a PGID instead, as in kill -9 -PGID (-1 is a further special case meaning every process you are allowed to signal). PIDs start at 1. The Linux kernel also allocates PIDs to kernel threads: "processes" without a separate address space that run in a privileged mode with separate stacks and generally ignore kill() signals.
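For reference, the conventions kill(2) applies to its pid argument can be written out as a small lookup. This is only a sketch of the man page's rules in Python; `kill_target` is a hypothetical helper and sends no signals.

```python
def kill_target(pid: int) -> str:
    """Describe who kill(2) would signal for a given pid argument."""
    if pid > 0:
        return f"process {pid}"
    if pid == 0:
        return "every process in the caller's process group"
    if pid == -1:
        return "every process the caller is allowed to signal"
    # Any other negative value names a process group by its negated ID.
    return f"every process in process group {-pid}"


if __name__ == "__main__":
    for candidate in (1234, 0, -1, -1234):
        print(candidate, "->", kill_target(candidate))
```

Seen this way, 0 and the negative values are selectors, not identifiers of any particular process.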
Command to get when a Linux box was started as opposed to just running uptime -s:
In real life the difference should rarely be visible. And then, do we want to know when it started booting or when it became usable? The latter would be rather tricky (even defining exactly what is meant by that).
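Not necessarily the command meant above, but one way I know of to get a boot timestamp without uptime is the `btime` field in /proc/stat (boot time in seconds since the Epoch). A rough Python sketch, with `parse_btime` and `boot_time` being names I made up:

```python
from datetime import datetime, timezone


def parse_btime(stat_text: str) -> int:
    """Return the boot time (seconds since the Epoch) from /proc/stat contents."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "btime":
            return int(fields[1])
    raise ValueError("no btime line found")


def boot_time(path: str = "/proc/stat") -> datetime:
    """Read the kernel's recorded boot time as a UTC datetime (Linux only)."""
    with open(path) as stat_file:
        return datetime.fromtimestamp(parse_btime(stat_file.read()), tz=timezone.utc)


if __name__ == "__main__":
    print(boot_time().isoformat())
```

Note this is the kernel's idea of when it booted, so it answers the "started booting" question, not the "became usable" one.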
On NT-based Windows, PID 0 is "System Idle Process" and is quite similar in function to the Linux one. On DOS-based Windows, IIRC there is no such thing as PID 0 since PIDs there are actually kernel memory pointers and thus very high: http://www.thescarms.com/VBImages/RunningProcs.gif -- instead, the idle loop is inside VMM32.
Hah. Another topic where the "common knowledge" is just utter garbage and actual research yields a different picture. That doesn't stop people from being convinced of it.
The author of this post did the only correct thing and checked the kernel's source code, which is the authoritative source for this information.
This is very interesting. For those interested in following all the parts of early kernel booting that were out of scope for this article, please read this fantastic resource: https://0xax.gitbooks.io/linux-insides/content/
> I sometimes wonder if that is why LLMs can so confidently hallucinate — because they were trained on piles of overconfident human texts.
It's an interesting thing to ponder.
People are extremely susceptible to someone who sounds confident.
The conclusions at the end are a bit wacky.
For practical use I would prefer https://www.man7.org/linux/man-pages/man1/timeout.1.html
On Linux, `getppid` returns 0 if the parent is a process in another PID namespace.
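A tiny sketch of what handling that looks like from Python, where `os.getppid` wraps getppid(2). The helper `describe_parent` is hypothetical, and I'm assuming the only thing you want to do with a 0 is avoid treating it as a real PID:

```python
import os


def describe_parent(ppid: int) -> str:
    """Interpret a getppid() result, treating 0 as 'no visible parent'."""
    if ppid == 0:
        # Not a real process: the parent is not visible in our PID namespace.
        return "no parent visible in this PID namespace"
    return f"parent is PID {ppid}"


if __name__ == "__main__":
    print(describe_parent(os.getppid()))
```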