He is comparing the energy spent during inference in humans with the energy spent during training in LLMs.
Humans spend their lifetimes training their brains, so one would have to sum up the energy over that whole training period if you are going to compare it to the training of LLMs.
At age 30, the total energy use of the brain adds up to about 5000 Wh, which would make it 1440 times more efficient.
But by age 30 we haven't learned good representations for most of the stuff on the internet, so one could argue that, given the knowledge learned, LLMs outperform the brain on energy consumption.
That said, LLMs have it easier: they learn from an abstract layer (language) that already contains a lot of good representations, while humans first have to learn to parse it through imagery.
Half the human brain is dedicated to processing imagery, so one could argue the human brain only spent 2500 Wh on equivalent tasks, which makes it about 3000x more efficient.
Liked the article though, didn't know about HNSWs.
Edit: made some quick comparisons for inference
Assuming a human spends 20 minutes answering in a well-thought-out fashion:
Human watt-hours: 0.00646
GPT-4 watt-hours (OpenAI data): 0.833
That makes our brains still 128x more energy efficient, but people spend a lot more time generating the answer.
Edit: numbers are off by 1000 as I used calories instead of kilocalories to calculate brain energy expense.
Corrected:
Human brains are 1.44x more efficient during training and 0.128x as efficient (i.e. about 8x less efficient) during inference.
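To make the corrected arithmetic easy to check, here is a rough Python sketch. The intake, brain share, and conversion factor are my own ballpark assumptions, not numbers from the article; only the 20 minutes and the 0.833 Wh per GPT-4 answer come from the comparison above.

    # Back-of-the-envelope check of the corrected inference comparison.
    # Assumptions (mine, not the article's): ~2000 kcal/day intake,
    # ~20% of it used by the brain, 1 kcal = 1.163 Wh.
    KCAL_PER_DAY = 2000          # rough daily calorie intake
    BRAIN_SHARE = 0.20           # fraction of energy used by the brain
    WH_PER_KCAL = 1.163          # 1 kcal = 4184 J ≈ 1.163 Wh
    ANSWER_MINUTES = 20          # time for a well-thought-out human answer
    GPT4_WH = 0.833              # per-answer figure quoted above

    brain_kcal = KCAL_PER_DAY * BRAIN_SHARE * ANSWER_MINUTES / (24 * 60)
    human_wh = brain_kcal * WH_PER_KCAL      # ≈ 6.5 Wh (not 0.00646 -- the kcal fix)
    ratio = human_wh / GPT4_WH               # ≈ 8, i.e. the brain uses ~8x more energy
    print(f"human ≈ {human_wh:.2f} Wh, GPT-4 = {GPT4_WH} Wh, ratio ≈ {ratio:.1f}x")

With these assumptions it prints roughly 6.46 Wh vs 0.833 Wh, which lines up with the ~8x-less-efficient figure above.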
Not just that: the brain of a newborn comes pretrained by billions of years of evolution. There is an energy cost associated with that which must be taken into account.
Then you must also take that cost into account when calculating the cost of training LLMs, as well as the cost of the humans operating the devices and their respective individual brain development.
LLMs are always an additional cost, never more efficient because they add to the calculation, if you look at it that way.
Also take into consideration the speed of evolution. LLM training might be much faster because a lot of computing power is used for it. Maybe if it ran at the same speed as evolution, it would take billions of years too?
Personally, I don't think this is true; ideally, as children we spend our time having fun, and learning about the world is a side effect. This Borg-like thinking applied to intelligence because we have LLMs is unusual to me.
I learned surfing through play and enjoyment, not through training like a robot.
We can train for something with intention, but I think that is mostly a waste of energy, albeit necessary on occasion.
> we spend our time having fun and learning about the world is a side effect
What do you think "play" is? Animals play to learn about themselves and the world; you see most intelligent animals play as kids, with the play being a simplification of what they do as adults. Human kids similarly play-fight, play at building things, cooking food, taking care of babies, etc. It is all to make you ready for adult life.
Playing is fun because playing helps us learn; otherwise we wouldn't have evolved to play, we would have evolved to be like ants that just work all day long, if that were more efficient. So the humans who played around beat those who worked their asses off; otherwise we would all be hard workers.
> we spend our time having fun and learning about the world is a side effect
I think the part of this that resonates most as true to me is how it reframes learning in a way that tracks truth more closely. It's not all the time, 100% of the time; it's in fits and starts, it's opportunistic, and there are long intervals that are not active learning.
But the big part where I would phrase things differently is in the insistence that play in and of itself is not a form of learning. It certainly is, or certainly can be, and while you're right that it's something other than Borg-like accumulation I think there's still learning happening there.
Humans who spend a long time on inference have not fully learned the thing being inferred; unlike LLMs, when we are undertrained we don't get a huge spike in error rate, we just go slower.
When humans are well trained, human inference absolutely destroys LLMs.
> When humans are well trained, human inference absolutely destroys LLMs.
This isn't an apt comparison. You are comparing a human trained in a specific field to an LLM trained on everything. When an LLM is trained with a narrow focus as well, the human brain cannot compete. See Garry Kasparov vs Deep Blue, and Deep Blue is very old tech.
The article is a bit of a stretch, but this is even more of a stretch. Humans can do way more than an LLM; humans are never in learning-only mode; our brains are always at least running our bodies as well, etc.
Exactly right - we are obviously not persistently in all-out training mode over the course of our lifetimes.
I suppose they intended that as a back-of-the-envelope starting point rather than a strict claim, though. But even so, you've got to be accountable to your starting assumptions, and I think a lot changes when this one is reconsidered.
Also, the human brain comes pre-trained by billions of years of evolution. It doesn't start as a randomly-connected structure. It already knows how to breathe, how to swallow, how to learn new things.
If we’re going to exclude the cortical areas associated with vision, you also need to exclude areas involved in motor control and planning. Those also account for a huge percent of the total brain volume.
We probably need to exclude the cerebellum as well (which contains the majority of the neurons in the brain), as it's used for error correction in movement.
Realistically you probably just need a few parts of the limbic system: hippocampus, amygdala, and a few of the deep-brain dopamine centers.
A lot of our cognition is mapped to areas that are used for something else, so excluding areas simply because they are used for something else is not valid. They can still be used for higher-level cognition. For example, we use the same area of the brain to process the taste of disgusting food as we do for moral disgust.
Thanks. So after your corrected energy estimate and more reasonable assumptions, it appears that the clickbaity title of the article is off by more than 7 orders of magnitude. With the upcoming Nvidia inference chips later this year it will be off by another log unit. It is hard for biomatter to compete with electrons in silicon and copper.
We can clone humans at the current level of technology; otherwise there wouldn't be agreements about not doing it due to the ethical implications. Of course, it's just reproducing the initial hardware and not the memory contents or the changes in connections that happen at runtime.
How about the fact that LLMs don't work unless humans generate all that data in the first place? I'd say an LLM's energy usage is the amount it takes to train plus the amount it took to generate all that data. Humans are more efficient at learning from less data.
Humans also learn from other humans (we stand on the shoulders of giants), so we would need to account for all the energy that has gone into generating all of human knowledge in the 'human' scenario too.
i.e. not many humans invent calculus or relativity from scratch.
I think OP's point stands - these comparisons end up being overly hand-wavey and very dependent on your assumptions and view.
For every calorie a human consumes, hundreds or thousands more are used by external support systems.
So yeah, you do use 2000 calories a day, but unless you live in an isolated jungle tribe, vast amounts of energy are consumed on delivering you food, climate control, electricity, water, education, protection, entertainment and so on.
By that metric, the electricity is only part of it. The cost of building the hardware, the cost of building the roof and walls of the datacentre, the cost of clearing the land, the cost of the humans maintaining the hardware, the cost of all the labour behind the Linux kernel, libc6, etc., etc. Lots of additional costs here too.
Are you going to include all the externalities to build and power the datacenters behind LLMs then? Because I guarantee those far outweigh what it takes to feed one human.
Including support from ChatGPT. It really is a comparison of calories without ChatGPT and calories with, and that gets to the real issue of whether ChatGPT justifies its energy intensity or not. History suggests we won't know until the technology exits the startup phase.
I've come to the conclusion that gpt and gemini and all the others are nothing but conversational search engines. They can give me ideas or point me in the right direction but so do regular search engines.
I like the conversation ability but, in the end, I cannot trust their results and still have to research further to decide for myself if their results are valid.
I’m a local-LLM elitist who has stopped using chat mode altogether.
I just go into the notebook tab (with an empty textarea) and start writing about a topic I’m interested in, then hit generate. It’s not a conversation, just an article in passive form. The “chat” is just a protocol in the form of an article, with a system prompt at the top and “AI: …\nUser: …\n” turns afterwards, all wrapped into a chat UI.
As long as the article is interesting, I just keep reading (it generates forever). When it goes sideways, I stop it and modify the text so it fits my needs, at a recent point or maybe earlier, and then hit generate again.
I find this mode superior to complaining to a bot, since wrong info/direction doesn’t spoil the content. Also you don’t have to wait or interrupt, it’s just a single coherent flow that you can edit when necessary. Sometimes I stop it at “it’s important to remember …” and replace it with a short disclaimer like “We talked about safety already. Anyway, back to <topic>” and hit generate.
Fundamentally, LLMs generate texts, not conversations. Conversations just happen to be texts. It’s something people forget / aren’t aware of behind these stupid chat interfaces.
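For anyone who hasn't used a raw completion endpoint, here is a minimal sketch of what I mean. The `generate` call is a placeholder for whatever completion backend you use; the point is that the "conversation" is just one editable text document.

    # A "chat" is just one long text document that the model keeps extending.
    # `generate` is a placeholder for any raw text-completion backend.
    SYSTEM = "The following is an article-style discussion with an assistant."

    def as_single_text(turns):
        """Flatten (speaker, text) turns into the one document the model sees."""
        lines = [SYSTEM]
        for speaker, text in turns:
            lines.append(f"{speaker}: {text}")
        lines.append("AI:")          # generation simply continues from here
        return "\n".join(lines)

    doc = as_single_text([
        ("User", "Explain HNSW indexes briefly."),
        ("AI", "HNSW is a layered graph for approximate nearest-neighbour search..."),
        ("User", "How do they compare to brute-force search?"),
    ])
    # completion = generate(doc)     # edit `doc` anywhere and regenerate: that's notebook mode
    print(doc)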
One amusing way to put this is that LLMs' energy requirements aren't self-contained, since they use the energy of the human prompter to both prompt and verify the output.
Reminds me of a similar argument about correctly pricing renewable power: since it isn't always-on (etc.), it requires a variety of alternative systems to augment it which aren't priced in. I.e., converting entirely to renewables isn't possible at the advertised price.
In this sense, we cannot "convert entirely to LLMs" for our tasks, since there's still vast amounts of labour in prompt/verify/use/etc.
I can ask ChatGPT extremely specific programming questions and get working code that solves them. This is not something I can do with a search engine.
Another thing a search engine cannot do, and that I use ChatGPT for on a daily basis, is taking unstructured text and converting it into a specified JSON format.
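As a concrete (if toy) example of that second use, the pattern is just: describe the target shape, paste the text, parse the reply. The `llm` client call is left as a placeholder since it depends on which API you use; the hardcoded reply stands in for a real response.

    import json

    SCHEMA_HINT = (
        "Extract the fields below from the text and reply with JSON only:\n"
        '{"name": string, "company": string, "start_date": "YYYY-MM-DD"}'
    )
    raw_text = "Maria joined Acme as a data engineer on the 3rd of March, 2021."
    prompt = f"{SCHEMA_HINT}\n\nText:\n{raw_text}"

    # reply = llm(prompt)            # placeholder for whatever client you use
    reply = '{"name": "Maria", "company": "Acme", "start_date": "2021-03-03"}'  # example reply

    record = json.loads(reply)       # downstream code can now rely on fixed keys
    print(record["start_date"])      # -> 2021-03-03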
I wish someone could explain Bing to me. If you search on Bing, the first result appears BELOW the ChatGPT auto-generated message, and this message takes 10 seconds to be "typed" out.
I can click the first result 1 billion times faster.
I do agree; I rarely use Google now. I search in a chat to get a summary, and that saves a lot of aggregating across different sites.
The same goes for Stack Overflow: no use for it if I find the answer quicker this way.
It’s exactly that for me: a conversational search engine. And the article explains it right; it’s just words organized in very specific ways so they can be retrieved with statistical accuracy, and the transformer is the cherry on top that makes it coherent.
Replace "gpt and gemini and all the others" with "people" and funny enough your statement is still perfectly accurate.
You have a rough mathematical approximation of what's already a famously unreliable system. Expecting complete accuracy instead of about-rightness from it seems mad to me. And there are tons of applications where that's fine, otherwise our civilization wouldn't be here today at all.
These anthropomorphizations are increasingly absurd. There's a difference between a human making a mistake, and an AI arbitrarily and completely confidently creating entirely new code APIs, legal cases, or whatever that have absolutely no basis in reality whatsoever, beyond being what it thinks would be an appropriate next token based on what you're searching for. These error modes are simply in no way, whatsoever, comparable.
And then you tell it such an API/case/etc. doesn't exist. And it'll immediately acknowledge its mistake and assure you it will work to avoid such things in the future. And then, literally the next sentence in the conversation, it's back to inventing the same nonsense again. This is not like a human, because even with the most idiotic human there's at least a general trend of moving forward; LLMs are just coasting back and forth on their preexisting training, with absolutely zero ability to move forward until somebody gives them a new training set to coast back and forth on, and repeat.
I feel the author is comparing an abstract representation of the brain to a mechanical representation of a computer. This is not a fair or useful comparison.
If a computer does not understand words, neither does your brain. While electromagnetic charge in the brain does not at all correspond with electromagnetic charge in a GPU, they do share an abstraction level, unlike words vs bits.
Computers right now do not understand language, but that does not mean that they cannot. We don't know what it takes to bridge the gap from stochastic parrot to understanding in computers, however from the mistakes LLMs make right now, it appears we have not found it yet.
It is possible that silicon-based computer architecture cannot support the processing and information storage density/latency needed to support understanding. It's hard to gauge the likelihood that this is true given how little we know about how understanding works in the brain.
The brain translates words into a matrix of cortical-column neuron activations. So there are similarities to our naive implementation of such "thinking".
A brain is an electrochemical network made of cells; artificial neural networks are a toy model of these.
Each neurone is itself a complex combination of chemical cycles; these can be, and have been, simulated.
The most complex chemicals in biology are proteins; these can be directly simulated with great difficulty, and we've now got AI that have learned to predict them much faster than the direct simulations on a classical computer ever could.
Those direct simulations are based on quantum mechanics, or at least computationally tractable approximations of it; QM is lots of linear algebra and either a random number generator or superdeterminism, either of which is still a thing a computer can do (even if the former requires a connection to a quantum-random source).
The open question is not "can computers think?", but rather "how detailed does the simulation have to be in order for it to think?"
And what gives brains this unique power? Do brains of lesser animals also have this unique “thinking” property? Is this “thinking” a result of how the brain is architected out of atoms and if so why can’t other machines emulate it?
Our brains are the product of the same dumb evolutionary process that made every other plant and animal and fungus and virus. We evolved from animals capable of only the most basic form of pattern recognition. Humans in the absence of education are not capable of even the most basic reasoning. It took us untold thousands of years to figure out that “try things and measure if it works” is a good way to learn about the world. An intelligent species would be able to figure things out by itself; our ancestors, who had the same brain architecture we do, were not able to figure anything out for generation after generation. So much for our ability to do original independent thinking.
You’re fighting a losing battle. We are biological computers. Maybe there’s something deeper behind it, like what some call a soul, but that’s hard or impossible to prove.
There is an immensely strong dogma that, to the best of my knowledge, is not founded in any science or philosophy:
> First we must lay down certain axioms (smart word for the common sense/ground rules we all agree upon and accept as true).
> One of such would be the fact that currently computers do not really understand words. ...
The author is at least honest about his assumptions, which I can appreciate. Most other people just have it as a latent assumption.
For articles like this to be interesting, this cannot be accepted as an axiom. Its justification is what's interesting.
It’s a reasonable axiom, because for many people understanding involves qualia. If you believe LLMs have qualia, you also believe a very large Excel sheet with the right numbers has an experience of consciousness and feels pain or something when the document is closed.
As I wrote, I appreciate that the author wrote it out as they did. It might be reasonable in the context of the article. But fixing it as an axiom just makes the discussion boring (for me).
> If you believe LLMs have qualia, you also believe a ...
You use the word believe twice here. I am actively not talking about beliefs.
I just realised that the author indeed gave themselves an out:
> ... currently computers do not really understand words.
The author might believe that future computers can understand words. This is interesting. The questions being: _what_ needs to be there in order for them to understand? Could that be an emergent feature of current architectures? That would also contradict large parts of the article.
Yeah, for axioms like the above my next question is: define 'understand'. Does my dog understand words when it completes specific actions because of what I say? I'm also learning a new language; do I understand a word when I attach a meaning (often a bunch of other words) to it? Turns out computers can do this pretty well.
Oh please, enough with the semantics. It reminds me of a postmodernist asking me to define what "is" is. The LLM does not understand words in the way a human understands them, and that's obvious. Even the creators of LLMs implicitly take this as a given and would rarely openly say they think otherwise, no matter how strong the urge to create a more interesting narrative.
Yes, we attach meaning to certain words based on previous experience, but we do so in the context of a conscious awareness of the world around us and our experiences within it. An LLM doesn't even have a notion of self, much less a mechanism for attaching meaning to words and phrases based on conscious reasoning.
Computers can imitate understanding "pretty well", but they have nothing resembling a good, bad, or any other kind of comprehension of what they're saying.
It's the most incredible coincidence. Three million paying OpenAI customers spend $20 per month (compare: NetFlix standard: $15.49/month) thinking they're chatting with something in natural language that actually understands what they're saying, but it's just statistics and they're only getting high-probability responses without any understanding behind it! Can you imagine spending a full year showing up to talk to a brick wall that definitely doesn't understand a word you say? What are the chances of three million people doing that! It's the biggest fraud since Theranos!! We should make this illegal! OpenAI should put at the bottom of every one of the millions of responses it sends each day: "ChatGPT does not actually understand words. When it appears to show understanding, it's just a coincidence."
You have kids talking to this thing, asking it to teach them stuff, without knowing that it doesn't understand shit! "How did you become a doctor?" "I was scammed. I asked ChatGPT to teach me how to make a doctor pepper at home, and based on simple keyword matching it got me into medical school (based on the word doctor), and when I protested that I just wanted to make a doctor pepper it taught me how to make salsa (based on the word pepper)! Next thing you know I'm in medical school and it's answering all my organic chemistry questions, my grades are good, the salsa is delicious, but dammit, I still can't make my own doctor pepper. This thing is useless!"
Maps are useful, but they don't understand the geography they describe. LLMs are maps of semantic structures and as such, can absolutely be useful without having an understanding of that which they map.
If LLMs were capable of understanding, they wouldn't be so easy to trick on novel problems.
I am not sure where this comment fits as an answer to mine.
Firstly, do understand that I am not saying that LLMs (or ChatGPT) do understand.
I am merely saying that we don't have any sound frameworks to assess it.
For the rest of your rant: I definitely see that you don't derive any value from ChatGPT. As such, I really hope you are not paying for it or wasting your time on it. What other people decide to spend their money on is really their business. I don't think any normally functioning person expects that a real person is answering them when they use ChatGPT, so it is hardly a fraud.
I was expecting a simple, trivial calculation comparing the energy demand of LLMs with the energy demand of the brain, and lots of blabla around it...
But it rather seems to be a good general introduction to the field, aimed at beginners. Not sure if it gets everything right, and the author clearly states he is not an expert and would like corrections where he is wrong, but it seems worth checking out if one is interested in understanding a bit of the magic behind it.
That's a whole lot of hand-waving. Also, field-effect transistors are controlled by potential, not current; current consumption stems mostly from charging and discharging parasitic capacitance. Also, computers do not really process individual bits; they operate on whole words. Pun intended.
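For anyone curious, the standard back-of-the-envelope for that switching cost is dynamic power ≈ alpha * C * V^2 * f. The numbers below are made-up ballpark values, just to show the shape of the calculation.

    # Dynamic (switching) power in CMOS: energy goes into charging and
    # discharging parasitic capacitance, not into "moving bits" per se.
    alpha = 0.1        # activity factor: fraction of nodes switching per cycle
    C = 100e-9         # total switched capacitance in farads (illustrative)
    V = 0.8            # supply voltage in volts
    f = 3e9            # clock frequency in hertz

    dynamic_power = alpha * C * V**2 * f     # watts
    print(f"~{dynamic_power:.0f} W of dynamic power for these toy numbers")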
Call me lazy, but I couldn’t get through the wall of text to learn what on earth a vector database is. Way too much effort is spent talking about binary and how ASCII works and whatnot; such basics that it feels like the article is for someone with zero knowledge about computers.
Indeed. It's condescending and word-vomity. I would flag it except that it doesn't break any rules; it is just badly written. As the author acknowledges, it is a 4-hour stream-of-consciousness word dump. The title is clickbait relative to what it is: a vector DB review piece with a long preamble to puff himself up.
Genuinely curious who upvoted this and why. The title is clickbait, the writing is long and rambling, and it seems to me like the author doesn't have a profound understanding of the concepts either, all just to recommend Qdrant as a vector database.
Yeah, it seems almost insulting that the author expects countless people to spend time reading their post while they haven't spent much time editing and streamlining it, all with the excuse "these are just my ramblings".
To paraphrase: I will not excuse such a long letter, for you had the time to write a shorter one.
> LLMs are always an additional cost, never more efficient because they add to the calculation, if you look at it that way.
ChatGPT has to deal with the languages we already created; it doesn't get to co-adapt.
We don't know how to fully operate a human brain when it's fully disconnected from eyes, a mouth, limbs, ears and a human heart.
> At age 30, the total energy use of the brain adds up to about 5000 Wh
That doesn't sound right... 30 years * 20 watts = 1.9E10 joules = 5300 kWh.
My number is based on calorie usage.
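To reconcile the two estimates, here's a quick sketch using my own ballpark assumptions (~20 W average brain power, or ~400 kcal/day of brain energy from food): both land around 5000 kWh over 30 years, i.e. the original figure was in kWh, not Wh.

    # Two ways to estimate ~30 years of brain energy; they agree once kcal is used.
    HOURS_PER_YEAR = 365.25 * 24

    wattage_based = 20 * HOURS_PER_YEAR * 30 / 1000        # kWh: ≈ 5260
    calorie_based = 400 * 1.163 * 365.25 * 30 / 1000       # kWh: ≈ 5097

    print(f"{wattage_based:.0f} kWh vs {calorie_based:.0f} kWh -- kWh, not Wh")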
Yes we have learnt far more complex stuff, ffs.
> taking unstructured text and converting it into a specified JSON format
I can do the opposite.
> I can click the first result 1 billion times faster.
At this point it's just wasting people's time.
> So much for our ability to do original independent thinking.
It's a combination of what you have already seen, read about or heard of, isn't it?
> First we must lay down certain axioms (smart word for the common sense/ground rules we all agree upon and accept as true).
While in practice axioms are often statements that we all agree on and accept as true, that isn't necessarily so, and it isn't the core of the word's meaning.
Axioms are something we postulate as true, without providing an argument for their truth, for the purposes of making an argument.
In this case the assertion isn't really used as part of an argument, but to bootstrap an explanation of how words are represented in LLMs.
Edit: I find this so amusing because it is an example of learning a word without understanding it.
Uhm… no?
They are literally things that can't be proven but allow us to prove a lot of other things.
Clickholes get too many votes.