> defining AGI as matching the cognitive versatility and proficiency of a well-educated adult
I don't think people really realize how extraordinary an accomplishment it would be to have an artificial system matching the cognitive versatility and proficiency of an uneducated child, much less a well-educated adult. Hell, AI matching the intelligence of some nonhuman animals would be an epoch-defining accomplishment.
I think the bigger issue is people confusing impressive but comparatively simpler achievements (everything current LLMs do) with anything remotely near the cognitive versatility of any human.
But the big crisis right now is that for an astonishing number of tasks that a normal person could come up with, chatgpt.com is actually as good as or better than a typical human.
If you took the current state of affairs back to the 90s you’d quickly convince most people that we’re there. Given that we’re actually not, we now have to come up with new goalposts.
Was thinking about this today. I had to do a simple wedding planning task - setting up my wedding website with an FAQ, cobbling the guest list together (from texts, photos of my father’s address book, and Excel spreadsheets), directions and advice for lodging, conjuring up a scheme to get people to use the on-site cabins, and a few other mundane tasks. No phone calls, no “deep research”, just rote browser-jockeying. Not even any code; the off-the-rack system just makes that for you (though I know for a fact an LLM would love to try to code this for me).
I know without a single doubt that I could not simply ask an “AI” “agent” to do this today and expect any sort of a functional result, especially when some of these were (very simple) judgement calls or workarounds for absolutely filthy data and a janky wedding planning website UI.
The tests for AGI that keep getting made, including the ones in this paper, always feel like they're (probably unintentionally) constructed in a way that covers up AI's lack of cognitive versatility. AI functions much better when you do something like you see here, where you break down tasks into small restricted benchmarks and then see if they can perform well.
But when we say AGI, we want something that will function in the real world like a human would. We want to be able to say, "Here's 500 dollars. Take the car to get the materials, then build me a doghouse, then train my dog. Then go to the store, get the ingredients, and make dinner."
If the robotics aren't reliable enough to test that, then have it be a remote employee for 6 months. Not "have someone call up AI to write sections of code" - have a group of remote employees, make 10% of them AI, give them all the same jobs with the same responsibilities, and see if anyone notices a difference after 6 months. Give an AI an account on Upwork, and tell it to make money any way it can.
Of course, AI is nowhere near that level yet. So we're stuck manufacturing toy "AGI" benchmarks that current AI can at least have some success with. But these types of benchmarks only broadcast the fact that we know that current and near future AI would fail horribly at any actual AGI task we threw at it.
Or even to come up with a definition of cognitive versatility and proficiency that is good enough to not get argued away once we have an AI which technically passes that specific definition.
The Turing Test was great until something that passed it (with an average human as interrogator) turned out to also not be able to count letters in a word — because only a special kind of human interrogator (the "scientist or QA" kind) could even think to ask that kind of question.
Can you point to an LLM passing the turing test where they didn't invalidate the test by limiting the time or the topics?
I've seen claims of passing but it's always things like "with only 3 questions" or "with only 3 minutes of interrogation" or "with only questions about topic X". Those aren't Turing Tests. As an example, if you limit the test to short exchanges then anything will pass: "limit to one word, one question". User types "Hello", LLM responds "Hi". PASS! (not!)
Note that the Turing test allows a lot of leeway in the test settings, i.e. who interrogates it, how much they know about the weaknesses of current SOTA models, whether they are allowed to use tools (I'm thinking of something like ARC-AGI but in a format that allows chat-based testing), how long a chat is allowed, etc. Therefore there can be multiple interpretations of whether the current models pass the test or not.
One could say that if there is a maximally hard Turing test and a "sloppy" Turing test, we are somewhere where the current models pass the sloppy version but not the maximally hard version.
I think the Turing test suffers a bit from "when a measurement becomes a target, it ceases to be a good measurement."
An AI that happened to be able to pass the Turing test would be pretty notable, because it probably implies many more capabilities behind the scenes. The problem with LLMs, for example, is that they're essentially optimized Turing test takers. That's about all they can do.
Plus, I don't think any LLM will pass the Turing test in the long term. Once something organically comes up that they aren't good at, it'll be fairly obvious they aren't human, and the limits of context will also become apparent eventually.
The Turing test is long outdated. Modern models can fool humans, but fooling isn’t understanding. Maybe we should flip the perspective: AGI isn’t about imitation, it’s about discovering patterns autonomously in open environments.
If a human learned only on tokenized representations of words, I don't know that they would be as good at inferring the number of letters in the words underlying the tokens as LLMs are.
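For a concrete sense of what a model actually "sees", here's a minimal sketch (assuming the tiktoken package and its cl100k_base vocabulary as a stand-in for whatever tokenizer a given model uses):

    import tiktoken

    # Tokenize a word the way a GPT-style model would receive it.
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode("strawberry")

    # Each token is an opaque integer ID mapping to a multi-character chunk,
    # not a letter, so "how many r's?" isn't directly readable off the input.
    for t in tokens:
        print(t, enc.decode_single_token_bytes(t))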
Or that this system would fail to adapt in any way to changes of circumstance. The adaptive intelligence of a live human is truly incredible. Even in cases where the weights are updatable, we watch AI make the same mistake thousands of times in an RL loop before attempting a different strategy.
Absolute definitions are weak. They won't settle anything.
We know what we need right now, the next step. That step is a machine that, when it fails, fails in a human way.
Humans also make mistakes, and hallucinate. But we do it as humans. When a human fails, you think "damn, that's a mistake I or my friend could have made".
LLMs on the other hand, fail in a weird way. When they hallucinate, they demonstrate how non-human they are.
It has nothing to do with some special kind of interrogator. We must assume the best human interrogator possible. This next step I described works even with the most skeptical human interrogator possible. It also synergizes with the idea of alignment in ways other tests don't.
When that step is reached, humans will or will not figure out another characteristic that makes it evident that "subject X" is a machine and not a human, and a way to test it.
Moving the goalpost is the only way forward. Not all goalpost moves are valid, but the valid next move is a goalpost move. It's kind of obvious.
People are specialists, not generalists; creating an AI that is a generalist and claiming it has cognitive abilities the same as a "well-educated" adult is an oxymoron. And if such a system could ever be made, my guess is it won't be more than a few (under 5) billion parameter model that is very good at looking up stuff online, forgetting stuff when not in use, and planning and creating or expanding the knowledge in its nodes - much like a human adult would. It will be highly sample efficient. It won't know 30 languages (although it has been seen that models generalize better with more languages), it won't know the entire Wikipedia by heart, it won't even remember minor details of programming languages and stuff. Now that is my definition of an AGI.
For me, it would be because the term AGI gets bandied about a lot more frequently in discussions involving Gen AI, as if that path takes us any closer to AGI than other threads in the AI field have.
Have any benchmarks been made that use this paper’s definition? I follow the ARC prize and Humanity’s Last Exam, but I don’t know how closely they would map to this paper’s methods.
Edit: Probably not, since it was published less than a week ago :-) I’ll be watching for benchmarks.
I always laugh at these. Why are people always jumping to defining AGI when they clearly don't have a functional definition for the I part yet? More to the point, once you have the I part you get the G part; it is a fundamental part of it.
I’m more surprised, and equally concerned, by the majority of people’s understanding of intelligence and their definition of AGI. Not only does the definition “… matching the cognitive versatility and proficiency of a well-educated adult” violate the “general” in AGI by the “well-educated” part; it also implies that only the “well-educated” (presumably by a specific curriculum) qualify as intelligent, and, by extension, that once you depart from the “well” in “educated” you exponentially diverge from “intelligent”. It all seems like a rather unimpressive notion of intelligence.
In other words, in one question: is the current AI not already well beyond the “…cognitive versatility and proficiency of an uneducated child”? And when you consider that in many places, like Africa, they didn’t even have a written language until European evangelists created it and taught it to them in the late 19th century, and they have far less “education” than even some of the most “uneducated” avg. European and even many American children, does that not mean that AI is well beyond them at least?
Frankly, as things seem to be going, there is at the very least going to be a very stark shift in “intelligence” that exceeds even the one of the last 50 or so years, which brought us stark drops in memory, literary knowledge, mathematics, and even general literacy, not to mention the ability to write. What does it mean that kids now will not even have to feign acting like they’re seeking out sources, vetting them, contradicting a story or logical sequence, forming ideas, messages, and stories, etc.? I’m not trying to be bleak, but I don’t see this simply resulting in net positive outcomes, and most of the negative impacts will also be happening below the surface, to the point that people won’t realize what is being lost.
What I think is being skipped in the current conversation is that the "versatility" keyword is hiding a lot of unknowns - even now. We don't seem to have a true understanding of the breadth or depth of our own unconscious thought processes, therefore we don't have much that is concrete to start with.
I'll simultaneously call all current ML models "stupid" and also say that SOTA LLMs can operate at junior (software) engineer level.
This is because I use "stupidity" as the number of examples some intelligence needs in order to learn from, while performance is limited to the quality of the output.
LLMs *partially* make up for being too stupid to live (literally: no living thing could survive if it needed so many examples) by going through each example faster than any living thing ever could — by as many orders of magnitude as there are between jogging and continental drift.
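A rough back-of-envelope check on that comparison (the speeds are ballpark assumptions, not measurements):

    import math

    jogging = 3.0                              # m/s, a comfortable jog
    drift = 0.05 / (365.25 * 24 * 3600)        # ~5 cm/year of continental drift, in m/s
    print(round(math.log10(jogging / drift)))  # roughly 9 orders of magnitude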
AI is highly educated. It's a different sort of artifact we're dealing with where it can't tell truth from fiction.
What's going on is AI fatigue. We see it everywhere, we use it all the time. It's becoming generic and annoying and we're getting bored of it EVEN though the accomplishment is through the fucking roof.
If Elon Musk makes an interstellar car that can reach the nearest star in 1 second and prices it at $1k, I guarantee within a year people will be bored of it and finding some angle to criticize it.
So what happens is we get fatigued, and then we have such negative emotions about it that we can't possibly classify it as the same thing as human intelligence. We magnify the flaws until they take up all the space, and we demand a redefinition of what AGI is because it doesn't "feel" right.
We already had a definition of AGI. We hit it. We moved the goal posts because we weren't satisfied. This cycle is endless. The definition of AGI will always be changing.
Take LLMs as they exist now and only allow 10% of the population to access it. Then the opposite effect will happen. The good parts will be over magnified and the bad parts will be acknowledged and then subsequently dismissed.
Think about it. All the AI slop we see on social media is freaking masterpieces, works of art produced in minutes that most humans can't even hope to come close to. Yet we're annoyed and unimpressed by them. That's how it's always going to go down.
Pretty much. Capabilities we now consider mundane were science fiction just three years ago, as far as anyone not employed by OpenAI was concerned.
> We already had a definition of AGI. We hit it.
Are you sure about that? Which definition are you referring to? From what I can tell with Google and Grok, every proposed definition has been that AGI strictly matches or exceeds human cognitive capabilities across the board.
Generative AI is great, but it's not like you could just assign an arbitrary job to a present-day LLM, give it access to an expense account, and check in quarterly with reasonable expectations of useful progress.
I'm curious when and what you consider to have been the moment.
To me, the general in AGI means I should be able to teach it something it's never seen before. I don't think I can even teach an LLM something it's seen a million times before. Long division, for example.
I don't think a model that is solid state until it's "trained" again has a very good chance of being AGI (unless that training is built into it and the model can decide to train itself).
I'm not an expert, but my layman's understanding of AI was that AGI meant the ability to learn in an abstract way.
Give me a dumb robot that can learn and I should be able to teach it how to drive, argue in court, write poetry, pull weeds in a field, or fold laundry the same way I could teach a person to do those things.
(1) AI isn't educated. It has access to a lot of information. That's two different things.
(2) I was rebutting the paper's standard that AGI should be achieving the status of a well-educated adult, which is probably far, far too high a standard. Even something measured to a much lower standard--which we aren't at yet--would change the world. Or, going back to my example, an AI that was as intelligent as a labrador in terms of its ability to synthesize and act on information would be truly extraordinary.
> EVEN though the accomplishment is through the fucking roof.
I agree with this but also, the output is almost entirely worthless if you can’t vet it with your own knowledge and experience because it routinely gives you large swaths of incorrect info. Enough that you can’t really use the output unless you can find the inevitable issues. If I had to put a number to it, I would say 30% of what an LLM spits out at any given time to me is completely bullshit or at best irrelevant. 70% is very impressive, but still, it presents major issues. That’s not boredom, that’s just acknowledging the limitations.
It’s like designing an engine or power source that has incredible efficiency but doesn’t actually move or affect anything (not saying LLMs are worthless, but bear with me). It just outputs with no productive result. I can be impressed with the achievement while also acknowledging it has severe limitations.
> If Elon Musk makes an interstellar car that can reach the nearest star in 1 second and prices it at $1k, I guarantee within a year people will be bored of it and finding some angle to criticize it.
Americans were glued to their seats watching Apollo 11 land. Most were back to watching I Dream of Jeannie reruns when Apollo 17 touched down.
The problem with these methods, I guess, is that they consider human intelligence as something detached from human biology. I think this is incorrect. Everything that goes on in the human mind is firmly rooted in the biological state of that human, and in the biological cycles that evolved through millennia.
Things like chess-playing skill of a machine could be bench-marked against that of a human, but the abstract feelings that drive reasoning and correlations inside a human mind are more biological than logical.
Yup, I feel like the biggest limitation with current AI is that they don't have desire (nor actual agency to act upon it). They don't have to worry about hunger, death, or feelings, and so they don't really have desires to further explore space or make life more efficient because they're on limited time like humans are. Their improvement isn't coming from the inside out like humans'; it's just externally driven (someone pressing a training epoch). This is why I don't think LLMs will reach AGI, if AGI somehow ties back to "human-ness." And maybe that's a good thing for Skynet reasons, but anyways.
There's equally no reason to believe that a machine can be conscious. The fact is, we can't say anything about what is required for consciousness because we don't understand what it is or how to measure or define it.
There is exactly one good reason, at least for consciousness and sentience, and that reason is anthropism: those concepts are so vaguely defined (or rather defined by prototypes, a la Wittgenstein [or JavaScript before classes]).
We only have one good example of consciousness and sentience, and that is our own. We have good reason to suspect other entities (particularly other human individuals, but also other animals) have it as well, but we cannot access it, and cannot even confirm its existence. As a result, using these terms for non-human beings becomes confusing at best, and it will never be actually helpful.
Emotions are another thing: we can define them outside of our own experience, using behavioral states and their connection with patterns of stimuli. On that basis we can certainly observe and describe the behavior of a non-biological entity as emotional. But given that emotion is something which regulates behavior and has evolved over millions of years, whether such a description would be useful is a whole other matter. I would be inclined to use a more general description of behavior patterns which includes emotion but also other kinds of behavior regulators.
They do not, but the same argument can hold because true human nature is not really known, and thus any attempt to define what human-like intelligence consists of can only be incomplete.
There are many parts of human cognition, psychology, etc., especially those related to consciousness, that are known unknowns and/or completely unknown.
A mitigation for this issue would be to call it generally applicable intelligence or something, rather than human-like intelligence, implying it's not specialized AI but also not human-like. (I don't see why it would need to be human-like, because even with all the right logic and intelligence a human can still do something counter to all of that. Humans do this every day: intuitive action, irrational action, etc.)
what we want is generally applicable intelligence, not human like intelligence.
What if our definition of those concepts is biological to begin with?
How does a computer with full AGI experience the feeling of butterflies in your stomach when your first love is requited?
How does a computer experience the tightening of your chest when you have a panic attack?
How does a computer experience the effects of chemicals like adrenaline or dopamine?
The A in AGI stands for “artificial” for good reason, IMO. A computer system can understand these concepts by description or recognize some of them by computer vision, audio, or other sensors, but it seems as though it will always lack sufficient biological context to experience true consciousness.
Perhaps humans are just biological computers, but the “biological” part could be the most important part of that equation.
That sounds correct, though more fundamentally we don’t know what intelligence or consciousness are. It’s almost a religious question, in that our current understanding of the universe does not explain them, but we know they exist. So regardless of embodied intelligence, we don’t even understand the basic building blocks of intelligence; we just have some descriptive study of it, which IMO LLMs can get arbitrarily close to without ever being intelligent, because if you can describe it, you can fit to it.
The current AI buildup is based on an almost metaphysical bet that intelligence can be simulated in software and straightforwardly scaled by increasing complexity and energy usage.
Personally, I remain skeptical that is the case.
What does seem likely is that “intelligence” will eventually be redefined to mean whatever we got out of the AI buildup.
What about aliens? When little green critters finally arrive on this planet, having travelled across space and time, will you reject their intelligence because they lack human biology? What if their biology is silicon based, rather than carbon?
There's really no reason to believe intelligence is tied to being human. Most of us accept the possibility (even the likelihood) of intelligent life in the universe, that isn't.
I think I need to point out some obvious issues with the paper.
Definition of artificial:
>Made by humans, especially in imitation of something natural.
>Not arising from natural or necessary causes; contrived or arbitrary.
Thus artificial intelligence must be the same as natural intelligence; only the process of coming up with it doesn't have to be natural.
What this means: we need to consider the substrate that makes natural intelligence. They cannot be separated willy nilly without actual scientific proof. As in, we cannot imply a roll of cheese can manifest intelligence based on the fact that it recognizes how many fingers are in an image.
The problem arises from a potential conflict of interests between hardware manufacturer companies and definition of AGI. The way I understand it, human like intelligence cannot come from algorithms running on GPUs. It will come from some kind of neuromorphic hardware.
And the whole point of neuromorphic hardware is that it operates (closely) on human brain principles.
Thus, the definition of AGI MUST include some hardware limitations. Just because I can make a contraption "fool" the tests doesn't mean it has human like cognition/awareness. That must arise from the form, from the way the atoms are arranged in the human brain. Any separation must be scientifically proven. Like if anyone implies GPUs can generate human like self awareness that has to be somehow proven.
Lacking a logical way to prove it, the best course of action is to closely follow the way the human brain operates (at least SNN hardware).
>The resulting AGI scores (e.g., GPT-4 at 27%, GPT-5 at 57%) concretely quantify both rapid progress and the substantial gap remaining before AGI.
This is nonsense. GPT scores cannot decide AGI level. They are the wrong algorithm running on the wrong hardware.
I have also seen no disclosure of conflicts of interest in the paper.
After reading the paper I’m struck by the lack of any discussion of awareness. Cognition requires at its basis awareness, which due to its entirely non verbal and unconstructed basis, is profoundly difficult to describe, measure, quantify, or label. This makes it to my mind impossible to train a model to be aware, let alone for humans to concretely describe it or evaluate it. Philosophy, especially Buddhism, has tried for thousands of years and psychology has all but abandoned attempting so. Hence papers like this that define AGI on psychometric dimensions that have the advantage of being easily measured but the disadvantage of being incomplete. My father is an emeritus professor of psychometrics and he agrees this is the biggest hurdle to AGI - that our ability to measure the dimensions of intelligence is woefully insufficient to the task of replicating intelligence. We scratch the surface and his opinion is language is sufficient to capture the knowledge of man, but not the spark of awareness required to be intelligent.
This isn’t meant to be a mystical statement that it’s magic that makes humans intelligent or some exotic process impossible to compute. But that the nature of our mind is not observable in its entirety to us sufficient that the current learned reinforcement techniques can’t achieve it.
Try this exercise. Do not think and let your mind clear. Ideas will surface. By what process did they surface? Or clear your mind entirely then try to perform some complex task. You will be able to. How did you do this without thought? We’ve all had sudden insights without deliberation or thought. Where did these come from? By what process did you arrive at them? Most of the things we do or think are not deliberative and definitely not structured with language. This process is unobservable and not measurable, and the only way we have to do so is through imperfect verbalizations that hint out some vague outline of a subconscious mind. But without being able to train a model on that subconscious process, one that can’t be expressed in language with any meaningful sufficiency, how will language models demonstrate it? Their very nature of autoregressive inference prohibits such a process from emerging at any scale. We might very well be able to fake it to an extent that it fools us, but awareness isn’t there - and I’d assert that awareness is all you need.
Awareness is just continuous propagation of the neural network, be that artificial or biological. The reason thoughts just "appear" is because the brain is continuously propagating signal through the neural network. LLMs also do this during their decoding phase, where they reason continuously with every token that they generate. There is no difference here.
Then you say "we don't think most of the times using language exclusively" , but neither do LLMs. What most people fail to realise is that in between each token being generated, black magic is happening in between the transformer layers. The same type of magic you describe. High dimensional. Based on complex concepts. Merging of ideas. Fusion of vectors to form a combined concept. Smart compression. Application of abstract rules. An LLM does all of these things, and more, and you can prove this by how complex their output is. Or, you can read studies by Anthropic on interpretability, and how LLMs do math underneath the transformer layers. How they manipulate information.
AGI is not here with LLMs, but its not because they lack reasoning ability. It's due to something different. Here is what I think is truly missing: continuous learning, long term memory, and infinite and efficient context/operation. All of these are tied together deeply, and thus I believe we are but a simple breakthrough away from AGI.
There are very significant differences between biological and artificial neural networks. Artificial neural networks are mathematical attempts at replicating how the brain’s neurons work. They are not, and were never meant to be, 1 to 1 replications. There is the difference in scale, where the “parameters” of human neural networks absolutely dwarf those of the current LLMs we have today. There is also the fact that they are materially different. The underlying biology and cell structure affect biological neural networks in ways that artificial neural networks simply don’t have access to.
The idea of awareness being propagation through the NN is an interesting concept though. I wonder if this idea could be proven by monitoring the electrical signals within the brain.
> Awareness is just continuous propagation of the neural network, be that artificial or biological. The reason thoughts just "appear" is because the brain is continuously propagating signal through the neural network.
This is just a claim you are making, without evidence.
The way you understand awareness is not through "this is like that" comparisons. These comparisons fall over almost immediately as soon as you turn your attention to the mind itself, by observing it for any length of time. Try it. Go observe your mind in silence for months. You will observe for yourself it is not what you've declared it to be.
> An LLM does all of these things, and more, and you can prove this by how complex their output is.
Complex output does not prove anything. You are again just making claims.
It is astoundingly easy to push an LLM over to collapse into ungrounded nonsense. Humans don't function this way because the two modes of reasoning are not alike. It's up to those making extraordinary claims to prove otherwise. As it is, the evidence does not exist that they behave comparably.
> What most people fail to realise is that in between each token being generated, black magic is happening in between the transformer layers.
Thank you for saying that. I think most people have an incomplete mental model of how LLMs work, and it's very misleading for understanding what they really do and can achieve. "Next token prediction" is done only at the output layer; it's not what really happens internally. The secret sauce is in the hidden layers of a very deep neural network. There are no words or tokens inside the network. A transformer is not the simple token estimator that most people imagine.
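To make that concrete, here's a minimal sketch (assuming the Hugging Face transformers and torch packages, with small GPT-2 as a stand-in for the frontier models being discussed):

    import torch
    from transformers import GPT2Tokenizer, GPT2Model

    tok = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2Model.from_pretrained("gpt2")

    inputs = tok("The capital of France is", return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)

    # One tensor per layer, each of shape [batch, seq_len, hidden_dim]:
    # internally the model works on stacks of high-dimensional vectors,
    # not on words or token IDs.
    print(len(out.hidden_states), out.hidden_states[-1].shape)

Tokens only reappear at the very end, when the final hidden state is projected onto the vocabulary.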
I so completely agree. In virtually every conversation I have heard about AI, it only ever talks about one of the multiple intelligences as theorized in Howard Gardner's book Frames of Mind: The Theory of Multiple Intelligences (1983)[1]
There is little discussion of how AI will enhance (or destroy) our emotional intelligence, or our naturalistic, intrapersonal or interpersonal intelligences.
Most religions, spiritual practices and even forms of meditation highlight the value of transcending mind and having awareness be present in the body. The way AGI is described, it would seem transcendence may be treated as a malfunction or bug.
There is no way to measure awareness. We can only know we are aware ourselves. For all we know trees or rocks might have awareness. Or I could be the only being aware of itself in the universe. We have no way to prove anything about it. Therefore it is not a useful descriptor of intelligence (be it human, animal or artificial).
Agreed. Everything that looks like intelligence to ME is intelligent.
My measurement of outside intelligence is limited by my own intelligence. So I can tell when something is stupider than me. For example, an industrial machine vs a human worker: the human worker is infinitely more intelligent compared to the machine, because this human worker can do all kinds of interesting stuff. This metaphorical "human worker" did everything, from laying a brick to launching a man to the Moon.
....
Imagine a Super-future, where humanity created nanobots and they ate everything around. And now instead of Earth there is just a cloud of them.
These nanobots were clever and could adapt, and they had all the knowledge that humans had and even more (as they were eating the Earth, the swarm was running global science experiments to understand as much as possible before the energy runs out).
Once they ate the last bite of our Earth (an important note here: they left an optimal amount of matter to keep running experiments; humans were kept in a controlled state and were studied to increase the Swarm's intelligence), they launched the next stage: a project the grand architect named "Optimise Energy Capture from the Sun".
The nanobots re-created the most efficient way of capturing the Sun's energy - ancient plants, which the Swarm had studied for centuries. The Swarm added some upgrades on top of what nature came up with, but it was still built on top of what nature had figured out by itself. A perfect plant to capture the Sun's energy, every one a perfect copy of the others, plus adaptive movements based on geolocation and time (which makes all of them unique).
For the plants the nanobots needed water, so they created efficient oceans to feed the plants. They added clouds and rains as a transport mechanism between oceans and plants... etc etc.
One night the human you already know by the name "Ivan the Liberator" (back then everyone called him just Ivan) didn't sleep at his usual hour. Suddenly all the lights went off and he saw a spark on the horizon - a horizon that was strictly prohibited to approach. He took his rifle, jumped in a truck and raced to the shore - the closest point to the spark's vector.
Once he approached, there was no horizon or water. A wall of dark glass-like material, edges barely noticeable, just 30 cm wide. To the left and to the right of the 30 cm wide wall - an image as real as his hands, of water and sky. At the top of the wall - a hole. He used his gun to hit the wall with the light - it wasn't very thick, but once he hit it, it regenerated very quickly. But when he hit the black wall - it shattered, and he saw a different world - a world of plants.
He stepped into the forest, but these plants were behaving differently. This part of the swarm wasn't supposed to face a human, so these nanobots had never seen one and didn't have optimised instructions for what to do in that case. They started reporting new values back to the main computer and performing default behaviour until updated software arrived from the intelligence center of the Swarm.
The human was observing a strange thing - the plants were smoothly flowing around him to keep a safe distance, like water stepping away from your hands in a pond.
"That's different," thought Ivan, extended his hand in a friendly gesture and said:
- Nice to meet you. I'm Ivan.
....
In this story a human sees a forest of plants and has no clue that it is a swarm of intelligence far greater than him. To him it looks like repetitive, simple action that doesn't look random -> let's test how intelligent the outside entity is -> if the entity wants to show its intelligence, it answers communication -> if the entity wants to hide its intelligence, it pretends not to be intelligent.
If the Swarm decides to show you that it is intelligent, it can show you that it is intelligent up to your level. It won't be able to explain everything that it knows or understands to you, because you will be limited by your hardware. The limit for the Swarm is only the computation power it can get.
We don't want awareness because it begets individuals by means of agency and we'd need to give them rights. This is industry's nightmare scenario.
People want autonomy, self-learning, consistent memory and perhaps individuality (in the discernability/quirkiness sense), but still morally unencumbered slaves.
Any definition of AGI that doesn't include awareness is wrongly co-opting the term, in my opinion. I do think some people are doing that, on purpose. That way they can get people who are passionate about actual-AGI to jump on board on working with/for unaware-AGI.
So because LLMs don't have this special quality that you call "awareness", they cannot have "cognition" - neither of which you defined? This is along the lines of "there must be something special about my mind that LLMs don't have, I can just feel it" special pleading, whether you call it awareness, consciousness, qualia, etc.
As long as you cannot define it clearly or even show that you yourself have this quality, I think the burden of proof is on you to show why this has any real world implications rather than just being word play. We can build thinking, reasoning machines just fine without waiting for philosophers to finally answer what consciousness is.
> Try this exercise. Do not think and let your mind clear. Ideas will surface. By what process did they surface? Or clear your mind entirely then try to perform some complex task.
I do not have any even remotely practical definition for this, but this has to somehow involve the system being in a closed loop. It has to "run" in a sense that an operating system runs. Even if there is nothing coming on certain inputs it still has to run. And probably hallucinate (hehe) like humans do in an absence of a signal or infer patterns where there are none, yet be able to self-reflect that it is in fact a hallucination
This seems like an unsupported assertion. LLMs already exhibit good functional understanding of and ability in many domains, and so it's not at all clear that they require any more "awareness" (are you referring to consciousness?) than they already have.
> the spark of awareness required to be intelligent.
Again, this seems like an assumption - that there's some quality of awareness (again, consciousness?), that LLMs don't have, that they need in order to be "intelligent". But why do you believe that?
> We’ve all had sudden insights without deliberation or thought.
Highly doubtful. What you mean is, "without conscious thought". Your conscious awareness of your cognition is not the entirety of your cognition. It's worth reading a bit of Dennett's work about this - he's good at pointing out the biases we tend to have about these kinds of issues.
> We might very well be able to fake it to an extent that it fools us
This leads to claiming that there are unobservable, undetectable differences. Which there may be - we might succeed in building LLMs that meet whatever the prevailing arbitrary definition of intelligence is, but that don't possess consciousness. At that point, though, how meaningful is it to say they're not intelligent because they're not conscious? They would be functionally intelligent. Arguably, they already are, in many significant ways.
Anything that is not measurable (i.e. awareness, consciousness) is not very useful in practice as a metric. I don't think there is even an agreed definition what consciousness is, partially because it is not really observable outside of our own mind.
Therefore I think it makes perfect sense that awareness is not discussed in the paper.
Consciousness is observable in others! Our communication and empathy and indeed language depend on the awareness that others share our perceived reality but not our mind. As gp says, this is hard to describe or quantify, but that doesn't mean it's not a necessary trait for general intelligence.
> Try this exercise. Do not think and let your mind clear. Ideas will surface. By what process did they surface? Or clear your mind entirely then try to perform some complex task. You will be able to. How did you do this without thought? We’ve all had sudden insights without deliberation or thought. Where did these come from? By what process did you arrive at them? Most of the things we do or think are not deliberative and definitely not structured with language.
Not to pile on, but isn't this actually a distinct example of _lack_ of awareness? As in, our brains have sparks of creativity without understanding the inception of those sparks?
Perhaps I'm conflating some definition of "aware" with another definition of "awareness"?
Language is a set of communication contracts. LLMs leverage these contracts to communicate data structures (shapes) that emerge when evaluating input. They are so good at prediction that when you give them a clue of a shape they will create something passable, and they keep getting better with training.
I hear there's work being done on getting the world models out, distilling the 'cortex-core' (aka the thinking without data), to perhaps see if they're capable of more, but so far we're looking at holograms of wishful thinking that increase in resolution, but still lack any essence.
This begs a question - can true intelligence even be artificial?
I'd argue the biggest issue with concretely defining intelligence is that any attempts end up falling in two buckets:
1. "Too" Broad, which raises uncomfortable questions about non-human intelligence and how we as humans treat them (see: whales, elephants, octopuses/cephalopods)
2. Too narrow, which again raises very uncomfortable issues about who does and who does not qualify as human, and what we do with them.
Put in other words, it's more an issue of ethics and morals than it is definitional.
Awareness doesn't seem that hard for AI systems though - if you look at the screen on a self driving Tesla you can see if it's aware of pedestrians, cyclists etc. because it draws boxes around them on the screen as it becomes aware of them.
I guess by 'AGI' most people mean human level or above so I guess you'd want human level awareness which Teslas and the like don't have yet.
Can't "awareness" in both examples be approximated by a random seed generator? Both the human mind and autoregressive model just need any initial thought to iterate and improve off of, influenced by unique design + experienced priors.
Yep, computers execute code, they are tools. Humans have the capacity to spontaneously generate new thoughts out of nothing, solve problems never before solved and not just by brute force number crunching.
Does any of that argument really matter? And frankly, this statement:
>This makes it to my mind impossible to train a model to be aware
feels wrong. If you're arguing that humans are aware, then it is apparent that it is possible to train something to be aware. Nature doesn't have any formal definition of intelligence, or awareness, yet here we are.
From a practical perspective, it might be implausibly difficult to recreate that on computers, but theoretically, no reason why not.
Have we shown what the human brain does at a “hardware” level? Or are you just assuming that the basic building block of a computer is that same as the basic building block of a human brain?
> Does any of that argument really matter? And frankly, this statement.
My definition of a complete AGI is: an AI that can read JIRA tickets, talk with non-programmers, do all of my job, get me and all/most software engineers fired, and prove sustainable.
But in general, it's an AI that can do any remote-work just as good as humans.
Agreed. There is no way to tell if someone is aware or not; we rely on brain activity to say whether someone is alive or not. There is currently no way to tell whether someone or something is conscious.
Does general intelligence require awareness though? I think you are talking about consciousness, not intelligence. Though to be frank consciousness and intelligence are not well defined terms either.
There’s already a vague definition that AGI is an AI with all the cognitive capabilities of a human. Yes, it’s vague - people differ.
This paper promises to fix "the lack of a concrete definition for Artificial General Intelligence", yet it still relies on the vague notion of a "well-educated adult". That’s especially peculiar, since in many fields AI is already beyond the level of an adult.
You might say this is about "jaggedness", because AI clearly lacks quite a few skills:
> Application of this framework reveals a highly “jagged” cognitive profile in contemporary models.
But all intelligence, of any sort, is "jagged" when measured against a different set of problems or environments.
So, if that’s the case, this isn’t really a framework for AGI; it’s a framework for measuring AI along a particular set of dimensions. A more honest title might be: "A Framework for Measuring the Jaggedness of AI Against the Cattell–Horn–Carroll Theory". It wouldn't be nearly as sexy, though.
Huh. I haven’t read the paper yet. But, it seems like a weird idea—wouldn’t the standard of “well educated (I assume, modern) adult” preclude the vast majority of humans who ever lived from being considered general intelligences?
And this is indeed a huge problem with a lot of the attacks on LLM even as more limited AI - a lot of them are based on applying arbitrary standards without even trying to benchmark against people, and without people being willing to discuss where they draw the line for stating that a given subset of people do not possess general intelligence...
I think people get really uncomfortable trying to even tackle that, and realistically for a huge set of AI tasks we need AI that are more intelligent than a huge subset of humans for it to be useful. But there are also a lot of tasks where AI that is not needed, and we "just" need "more human failure modes".
You can't measure intelligence directly. Instead, the idea is to measure performance in various tasks and use that as a proxy for intelligence. But human performance depends on other aspects beyond intelligence, including education, opportunities, and motivation, and most humans are far from reaching their true potential.
If you compare the performance of the average human to a state-of-the-art AI model trained by top experts with a big budget, you can't make any conclusions about intelligence. For the comparison to make sense, the human should also be trained as well as reasonably possible.
I read this as a hypothetical well-educated adult. As in, given the same level of knowledge, the intelligence performs equally well.
I do agree that it’s a weird standard though. Many of our AI implementations exceed the level of knowledge of a well-educated adult (and still underperform with that advantage in many contexts).
Personally, I don’t think defining AGI is particularly useful. It is just a marketing term. Rather, it’s more useful to just speak about features/capabilities. Shorthand for a specific set of capabilities will arise naturally.
>But all intelligence, of any sort, is "jagged" when measured against a different set of problems or environments.
On the other hand, research on "common intelligence" AFAIK shows that most measures of different types of intelligence have a very high correlation and some (apologies, I don't know the literature) have posited that we should think about some "general common intelligence" to understand this.
The surprising thing about AI so far is how much more jagged it is compared to human intelligence.
I think you are talking about the correlation in humans of, say, verbal and mathematical intelligence. Still, it is a correlation, not equality - there are many world-acknowledged writers who suck at math, and mathematical prodigies who are not the best at writing.
If you go beyond human species (and well, computers are not even living organisms), it gets tricky. Adaptability (which is arguably a broader concept than intelligence) is very different for, say octopodes, corvids and slime molds.
It is certainly not a single line of proficiency or progress. Things look like lines only if we zoom a lot.
Human intelligence has had hundreds of thousands of years of evolution that removes any 'fatal' variance from our intelligence. Too dumb is obvious on how it's culled, but 'too smart' can get culled by social creatures too, really 'too different' in any way.
Current AI is in its infancy and we're just throwing data at it in the same way evolution throws random change at our DNA and sees what sticks.
The fundamental premise of this paper seems flawed -- take a measure specifically designed for the nuances of how human performance on a benchmark correlates with intelligence in the real world, and then pretend as if it makes sense to judge a machine's intelligence on that same basis, when machines do best on these kinds of benchmarks in a way that falls apart when it comes to the messiness of the real world.
This paper, for example, uses the 'dual N-back test' as part of its evaluation. In humans this relates to variation in our ability to use working memory, which in humans relates to 'g'; but it seems pretty meaningless when applied to transformers -- because the task itself has nothing intrinsically to do with intelligence, and of course 'dual N-back' should be easy for transformers -- they should have complete recall over their large context window.
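For reference, the task itself is tiny - a minimal sketch of what a dual 2-back trial reduces to (illustrative only; the stream contents and scoring helper are my assumptions, not the paper's protocol):

    # A "match" response is correct whenever the current item equals the one
    # n steps back in that stream; dual N-back just runs two streams at once.
    def n_back_targets(stream, n=2):
        return [i for i in range(n, len(stream)) if stream[i] == stream[i - n]]

    positions = [3, 1, 3, 1, 7, 1, 7]                 # hypothetical spatial stream
    letters = ["c", "a", "c", "b", "a", "b", "a"]     # hypothetical auditory stream

    print(n_back_targets(positions))  # indices where a spatial match occurs
    print(n_back_targets(letters))    # indices where a letter match occurs

Hard for humans because working memory decays; trivial for anything with exact recall over its context.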
Human intelligence tests are designed to measure variation in human intelligence -- it's silly to take those same isolated benchmarks and pretend they mean the same thing when applied to machines. Obviously a machine doing well on an IQ test doesn't mean that it will be able to do what a high IQ person could do in the messy real world; it's a benchmark, and it's only a meaningful benchmark because in humans IQ measures are designed to correlate with long-term outcomes and abilities.
That is, in humans, performance on these isolated benchmarks is correlated with our ability to exist in the messy real-world, but for AI, that correlation doesn't exist -- because the tests weren't designed to measure 'intelligence' per se, but human intelligence in the context of human lives.
Don't get me wrong, I am super excited about what AI is doing for technology. But this endless conversation about "what is AGI" is so boring.
It makes me think of every single public discussion that's ever been had about quantum, where you can't start the conversation unless you go through a quick 101 on what a qubit is.
As with any technology, there's not really a destination. There is only the process of improvement. The only real definitive point is when a technology becomes obsolete, though it is still kept alive through a celebration of its nostalgia.
AI will continue to improve. More workflows will become automated. And from our perception, no matter what the rapidness of advancement is, we're still frogs in water.
I agree. It's an interesting discussion for those who have never taken college level philosophy classes, I suppose. What consciousness/thought is remains a massively open question. Seeing people in the comments offer what they think is a novel solution that was already posited like 400 years ago... Honestly it's kind of sad seeing this stuff on a forum like this. These posts are for sure the worst of Hacker News.
There are a bunch of these topics that everyone feels qualified to say something about. Consciousness, intelligence, education methods, nutrition, men vs women, economic systems etc.
It's a very emotional topic because people feel their self image threatened. It's a topic related to what is the meaning of being human. Yeah sure it should be a separate question, but emotionally it is connected to it in a deep level. The prospect of job replacement and social transformation is quite a threatening one.
So I'm somewhat understanding of this. It's not merely an academic topic, because these things will be adopted in the real world among real people. So you can't simply make everyone shut up who is an outsider or just heard about this stuff incidentally in the news and has superficial points to make.
> there's not really a destination. There is only the process of improvement
Surely you can appreciate that if the next stop on the journey of technology can take over the process of improvement itself that would make it an awfully notable stop? Maybe not "destination", but maybe worth the "endless conversation"?
I think it's not only the potential for self-improvement of AGI that is revolutionary. Even having an AGI that one could clone for a reasonable cost and have it work nonstop with its clones on any number of economically-valuable problems would be very revolutionary.
We have SAGI: Stupid Artificial General Intelligence. It's actually quite general, but works differently. In some areas it can be better or faster than a human, and in others it's more stupid.
Just like an airplane doesn't work exactly like a bird, but both can fly.
I find the concept of low floor/high ceiling quite helpful, as for instance recently discussed in "When Will AI Transform the Economy?" [1] - actually more helpful than "jagged" intelligence used in TFA.
Would propose to use the term Naive Artificial General Intelligence, in analogy to the widely used (by working mathematicians) and reasonably successful Naive Set Theory …
I was doing some naïve set theory the other day, and I found a proof of the Riemann hypothesis, by contradiction.
Assume the Riemann hypothesis is false. Then, consider the proposition "{a|a∉a}∈{a|a∉a}". By the law of the excluded middle, it suffices to consider each case separately. Assuming {a|a∉a}∈{a|a∉a}, we find {a|a∉a}∉{a|a∉a}, for a contradiction. Instead, assuming {a|a∉a}∉{a|a∉a}, we find {a|a∉a}∈{a|a∉a}, for a contradiction. Therefore, "the Riemann hypothesis is false" is false. By the law of the excluded middle, we have shown the Riemann hypothesis is true.
Naïve AGI is an apt analogy, in this regard, but I feel these systems aren't simple nor elegant enough to deserve the name naïve.
Try to reconcile that with your ideas (which, for that matter, I think are correct).
This is because I use "stupidity" to mean the number of examples an intelligence needs to learn from, while performance is limited to the quality of the output.
LLMs *partially* make up for being too stupid to live (literally: no living thing could survive if it needed so many examples) by going through each example faster than any living thing ever could — by as many orders of magnitude as there are between jogging and continental drift.
What's going on is AI fatigue. We see it everywhere, we use it all the time. It's becoming generic and annoying and we're getting bored of it, even though the accomplishment is through the fucking roof.
If Elon Musk made an interstellar car that could reach the nearest star in 1 second and priced it at $1k, I guarantee that within a year people would be bored of it and finding some angle to criticize it.
So what happens is we get fatigued, and then we have such negative emotions about it that we can't possibly classify it as the same thing as human intelligence. We magnify the flaws until they take up all the space, and we demand a redefinition of what AGI is because it doesn't "feel" right.
We already had a definition of AGI. We hit it. We moved the goal posts because we weren't satisfied. This cycle is endless. The definition of AGI will always be changing.
Take LLMs as they exist now and only allow 10% of the population to access it. Then the opposite effect will happen. The good parts will be over magnified and the bad parts will be acknowledged and then subsequently dismissed.
Think about it. The AI slop we see on social media consists of freaking masterpieces: works of art produced in minutes that most humans can't even hope to come close to. Yet we're annoyed and unimpressed by them. That's how it's always going to go down.
> We already had a definition of AGI. We hit it.
Are you sure about that? Which definition are you referring to? From what I can tell with Google and Grok, every proposed definition has been that AGI strictly matches or exceeds human cognitive capabilities across the board.
Generative AI is great, but it's not like you could just assign an arbitrary job to a present-day LLM, give it access to an expense account, and check in quarterly with reasonable expectations of useful progress.
I'm curious when and what you consider to have been the moment.
To me, the general in AGI means I should be able to teach it something it's never seen before. I don't think I can even teach an LLM something it's seen a million times before. Long division, for example.
I don't think a model that is solid state until it's "trained" again has a very good chance of being AGI (unless that training is built into it and the model can decide to train itself).
I'm not an expert, but my layman's understanding of AI was that AGI meant the ability to learn in an abstract way.
Give me a dumb robot that can learn and I should be able to teach it how to drive, argue in court, write poetry, pull weeds in a field, or fold laundry the same way I could teach a person to do those things.
(2) I was rebutting the paper's standard that AGI should be achieving the status of a well-educated adult, which is probably far, far too high a standard. Even something measured to a much lower standard--which we aren't at yet--would change the world. Or, going back to my example, an AI that was as intelligent as a labrador in terms of its ability to synthesize and act on information would be truly extraordinary.
I agree with this, but also: the output is almost entirely worthless if you can't vet it with your own knowledge and experience, because it routinely gives you large swaths of incorrect info. Enough that you can't really use the output unless you can find the inevitable issues. If I had to put a number on it, I would say 30% of what an LLM spits out at any given time is complete bullshit or at best irrelevant. 70% is very impressive, but it still presents major issues. That's not boredom, that's just acknowledging the limitations.
It's like designing an engine or power source that has incredible efficiency but doesn't actually move or affect anything (not saying LLMs are worthless, but bear with me). It just outputs with no productive result. I can be impressed with the achievement while also acknowledging it has severe limitations.
Any definition of AGI that allows for this is utterly useless:
> Me: Does adding salt and yeast together in pizza dough kill the yeast?
> ChatGPT: No, adding salt and yeast together in pizza dough doesn't kill the yeast.
(new chat)
> Me: My pizza dough didn't rise. Did adding salt and yeast together kill the yeast?
> ChatGPT: It's possible, what order did you add them in?
> Me: Water, yeast, salt, flour
> ChatGPT: Okay, that explains it! Adding the salt right after the yeast is definitely the issue.
(It is not the issue)
Americans were glued to their seats watching Apollo 11 land. Most were back to watching I Dream of Jeannie reruns when Apollo 17 touched down.
Things like the chess-playing skill of a machine can be benchmarked against that of a human, but the abstract feelings that drive reasoning and correlations inside a human mind are more biological than logical.
We can easily program them to have human desires instead.
My emotions are definitely a function of the chemical soup my brain is sitting in (or the opposite).
We only have one good example of consciousness and sentience, and that is our own. We have good reason to suspect other entities (particularly other human individuals, but also other animals) have it as well, but we cannot access it, nor even confirm its existence. As a result, applying these terms to non-human beings becomes confusing at best and will never be actually helpful.
Emotions are another matter: we can define them outside of our own experience, using behavioral states and their connection with patterns of stimuli. On that basis we can certainly observe and describe the behavior of a non-biological entity as emotional. But given that emotion is a regulator of behavior that evolved over millions of years, whether such a description would be useful is a whole other matter. I would be inclined to use a more general description of behavior patterns, one that includes emotion but also other kinds of behavior regulators.
There are many parts of human cognition, psychology, etc., especially related to consciousness, that are known unknowns and/or completely unknown.
A mitigation for this issue would be to call it generally applicable intelligence or something, rather than human-like intelligence, implying it's not specialized AI but also not human-like. (I don't see why it would need to be human-like, because even with all the right logic and intelligence a human can still do something counter to all of that. Humans do this every day: intuitive action, irrational action, etc.)
What we want is generally applicable intelligence, not human-like intelligence.
How does a computer with full AGI experience the feeling of butterflies in your stomach when your first love is requited?
How does a computer experience the tightening of your chest when you have a panic attack?
How does a computer experience the effects of chemicals like adrenaline or dopamine?
The A in AGI stands for "artificial" for good reason, IMO. A computer system can understand these concepts by description or recognize some of them by computer vision, audio, or other sensors, but it seems as though it will always lack sufficient biological context to experience true consciousness.
Perhaps humans are just biological computers, but the “biological” part could be the most important part of that equation.
Personally, I remain skeptical that is the case.
What does seem likely is that “intelligence” will eventually be redefined to mean whatever we got out of the AI buildup.
There's really no reason to believe intelligence is tied to being human. Most of us accept the possibility (even the likelihood) of intelligent life in the universe, that isn't.
> human intelligence as something detached from human biology.
I don't completely agree with the previous comment, but there is something to be considered to their statement.
Feels good so we want more so you arrange your whole life and outlook to make more feel good happen. Intelligence!
Definition of artificial:
> Made by humans, especially in imitation of something natural.
> Not arising from natural or necessary causes; contrived or arbitrary.
Thus artificial intelligence must be the same as natural intelligence; only the process of coming up with it doesn't have to be natural. What this means: we need to consider the substrate that makes natural intelligence. The two cannot be separated willy-nilly without actual scientific proof. As in, we cannot imply that a roll of cheese can manifest intelligence based on the fact that it recognizes how many fingers are in an image.
The problem arises from a potential conflict of interest between hardware manufacturers and the definition of AGI. The way I understand it, human-like intelligence cannot come from algorithms running on GPUs. It will come from some kind of neuromorphic hardware. And the whole point of neuromorphic hardware is that it operates (closely) on human-brain principles. Thus, the definition of AGI MUST include some hardware constraints. Just because I can make a contraption "fool" the tests doesn't mean it has human-like cognition/awareness. That must arise from the form, from the way the atoms are arranged in the human brain. Any separation must be scientifically proven. If anyone implies GPUs can generate human-like self-awareness, that has to be proven somehow. Lacking a logical way to prove it, the best course of action is to follow closely the way the human brain operates (at least SNN hardware).
>The resulting AGI scores (e.g., GPT-4 at 27%, GPT-5 at 57%) concretely quantify both rapid progress and the substantial gap remaining before AGI.
This is nonsense. GPT scores cannot decide AGI level. They are the wrong algorithm running on the wrong hardware.
I have also seen no disclosure of conflicts of interest in the paper.
Which is it??
This isn't meant to be a mystical statement that it's magic that makes humans intelligent, or some exotic process impossible to compute. Rather, it's that the nature of our mind is not observable to us in its entirety, and so the current learned reinforcement techniques can't achieve it.
Try this exercise. Do not think, and let your mind clear. Ideas will surface. By what process did they surface? Or clear your mind entirely, then try to perform some complex task. You will be able to. How did you do this without thought? We've all had sudden insights without deliberation or thought. Where did these come from? By what process did you arrive at them? Most of the things we do or think are not deliberative and definitely not structured with language. This process is unobservable and not measurable, and the only way we have to probe it is through imperfect verbalizations that hint at some vague outline of a subconscious mind. But without being able to train a model on that subconscious process, one that can't be expressed in language with any meaningful sufficiency, how will language models demonstrate it? Their very nature of autoregressive inference prohibits such a process from emerging at any scale. We might very well be able to fake it to an extent that it fools us, but the awareness isn't there, and I'd assert that awareness is all you need.
AGI is not here with LLMs, but it's not because they lack reasoning ability. It's due to something different. Here is what I think is truly missing: continuous learning, long-term memory, and effectively unlimited, efficient context/operation. All of these are tied together deeply, and thus I believe we are but a single breakthrough away from AGI.
The idea of awareness being propagations through the NN is an interesting concept, though. I wonder if this idea could be proven by monitoring the electrical signals within the brain.
This is just a claim you are making, without evidence.
The way you understand awareness is not through "this is like that" comparisons. These comparisons fall over almost immediately as soon as you turn your attention to the mind itself, by observing it for any length of time. Try it. Go observe your mind in silence for months. You will observe for yourself it is not what you've declared it to be.
> An LLM does all of these things, and more, and you can prove this by how complex their output is.
Complex output does not prove anything. You are again just making claims.
It is astoundingly easy to push an LLM into collapsing into ungrounded nonsense. Humans don't function this way, because the two modes of reasoning are not alike. It's up to those making extraordinary claims to prove otherwise. As it is, the evidence does not exist that they behave comparably.
Thank you for saying that. I think most people have an incomplete mental model of how LLMs work, and it's very misleading for understanding what they really do and can achieve. "Next token prediction" happens only at the output layer; it's not what really happens internally. The secret sauce is in the hidden layers of a very deep neural network. There are no words or tokens inside the network. A transformer is not the simple token estimator that most people imagine.
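To make that concrete, here is a minimal sketch in PyTorch (class name, sizes, and layer choices are invented for illustration, and the causal mask is omitted for brevity) of the shape of that argument: token IDs exist only at the embedding and at the final projection, while everything in between is continuous vectors.

    import torch
    import torch.nn as nn

    class TinyTransformerLM(nn.Module):
        def __init__(self, vocab_size=100, d_model=64, n_layers=4, n_heads=4):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, d_model)        # token IDs -> vectors
            block = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            self.blocks = nn.TransformerEncoder(block, n_layers)  # vectors -> vectors, no tokens in here
            self.unembed = nn.Linear(d_model, vocab_size)         # vectors -> next-token logits

        def forward(self, token_ids):
            h = self.embed(token_ids)   # the last place "tokens" exist
            h = self.blocks(h)          # hidden states: continuous and wordless
            return self.unembed(h)      # prediction only happens at this output layer

    logits = TinyTransformerLM()(torch.randint(0, 100, (1, 8)))
    print(logits.shape)  # torch.Size([1, 8, 100]): a distribution over the vocab per position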
There is little discussion of how AI will enhance (or destroy) our emotional intelligence, or our naturalistic, intrapersonal or interpersonal intelligences.
Most religions, spiritual practices and even forms of meditation highlight the value of transcending mind and having awareness be present in the body. The way AGI is described, it would seem transcendence may be treated as a malfunction or bug.
[1] https://en.wikipedia.org/wiki/Theory_of_multiple_intelligenc...
There are people who have a hard time recognizing/feeling/understanding other people as "aware". Even more so with animals.
My measurement of outside intelligence is limited by my own intelligence, so I can tell when something is stupider than me. For example, an industrial machine vs. a human worker: the human worker is infinitely more intelligent than the machine, because this human worker can do all kinds of interesting stuff. This metaphorical "human worker" has done everything around us, from laying a brick to launching a man to the Moon.
....
Imagine a super-future, where humanity created nanobots and they ate everything around them. Now, instead of Earth, there is just a cloud of them.
These nanobots were clever and could adapt, and they had all the knowledge humans had, and even more (as they were eating the Earth, the swarm was running global science experiments to understand as much as possible before the energy ran out).
Once they ate the last bite of our Earth (an important note here: they left an optimal amount of matter to keep running experiments; humans were kept in a controlled state and studied to increase the Swarm's intelligence), they launched the next stage: a project the grand architect named "Optimise Energy Capture from the Sun".
The nanobots re-created the most efficient way of capturing the Sun's energy: ancient plants, which the Swarm had studied for centuries. The Swarm added some upgrades on top of what nature came up with, but it was still built on what nature had figured out by itself. A perfect plant to capture the Sun's energy, each one a perfect copy of itself, plus adaptive movements based on geolocation and time (which makes each of them unique).
For the plants the nanobots needed water, so they created efficient oceans to feed the plants. They added clouds and rain as a transport mechanism between oceans and plants... etc., etc.
One night the human you already know by the name "Ivan the Liberator" (back then everyone called him just Ivan) didn't sleep at his usual hour. Suddenly all the lights went off and he saw a spark on the horizon, a horizon that was strictly prohibited to approach. He took his rifle, jumped in a truck, and raced to the shore, the closest point to the spark's vector.
Once he approached, there was no horizon or water. A wall of dark glass-like material, edges barely noticeable, just 30 cm wide. To the left and right of the 30-cm wall was an image, as real as his hands, of water and sky. At the top of the wall, a hole. He used his gun to hit the wall with the light, and it wasn't very thick, but once he hit it, it regenerated very quickly. But when he hit the black wall, it shattered and he saw a different world: a world of plants.
He stepped into the forest, but these plants were behaving differently. This part of the Swarm wasn't supposed to face a human, so these nanobots had never seen one and didn't have optimised instructions for what to do in that case. They started reporting new values back to the main computer and performing default behaviour until updated software arrived from the intelligence center of the Swarm.
The human was observing a strange thing: the plants were smoothly flowing around him to keep a safe distance, like water stepping away from your hands in a pond.
"That's different," thought Ivan. He extended his hand in a friendly gesture and said, "Nice to meet you. I'm Ivan."
....
In this story a human sees a forest of plants and has no clue that it is a swarm of intelligence far greater than him. To him it looks like repetitive, simple action that doesn't look random -> let's test how intelligent the outside entity is -> if the entity wants to show its intelligence, it answers communication -> if the entity wants to hide its intelligence, it pretends not to be intelligent.
If the Swarm decides to show you that it is intelligent, it can show you it is intelligent up to your level. It won't be able to explain everything it knows or understands to you, because you will be limited by your hardware. The only limit for the Swarm is the computation power it can get.
People want autonomy, self-learning, consistent memory and perhaps individuality (in the discernability/quirkiness sense), but still morally unencumbered slaves.
As long as you cannot define it clearly or even show that you yourself have this quality, I think the burden of proof is on you to show why this has any real world implications rather than just being word play. We can build thinking, reasoning machines just fine without waiting for philosophers to finally answer what consciousness is.
I do not have even a remotely practical definition for this, but it has to somehow involve the system being in a closed loop. It has to "run", in the sense that an operating system runs. Even if nothing is coming in on certain inputs, it still has to run. And probably hallucinate (hehe) like humans do in the absence of a signal, or infer patterns where there are none, yet be able to self-reflect and recognize that it is in fact a hallucination.
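As a toy illustration of that closed-loop idea (plain Python, all names invented; not any real system's API): the loop keeps ticking whether or not input arrives, fills silence with internally generated signals, and tags them as internal so it can later recognize them as its own confabulation.

    import queue
    import random
    import time

    def closed_loop(inputs, ticks=10):
        """Run continuously, like an OS scheduler, with or without input."""
        for _ in range(ticks):
            try:
                signal, source = inputs.get_nowait(), "external"
            except queue.Empty:
                # No input: the system still runs, generating its own signal
                # and remembering that it did so (the self-reflection hook).
                signal = random.choice(["pattern?", "memory fragment", "noise"])
                source = "internal"
            print(f"[{source}] processing: {signal}")
            time.sleep(0.01)  # the loop never blocks waiting for the world

    q = queue.Queue()
    q.put("hello")
    closed_loop(q)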
This seems like an unsupported assertion. LLMs already exhibit good functional understanding of and ability in many domains, and so it's not at all clear that they require any more "awareness" (are you referring to consciousness?) than they already have.
> the spark of awareness required to be intelligent.
Again, this seems like an assumption - that there's some quality of awareness (again, consciousness?), that LLMs don't have, that they need in order to be "intelligent". But why do you believe that?
> We’ve all had sudden insights without deliberation or thought.
Highly doubtful. What you mean is, "without conscious thought". Your conscious awareness of your cognition is not the entirety of your cognition. It's worth reading a bit of Dennett's work about this - he's good at pointing out the biases we tend to have about these kinds of issues.
> We might very well be able to fake it to an extent that it fools us
This leads to claiming that there are unobservable, undetectable differences. Which there may be - we might succeed in building LLMs that meet whatever the prevailing arbitrary definition of intelligence is, but that don't possess consciousness. At that point, though, how meaningful is it to say they're not intelligent because they're not conscious? They would be functionally intelligent. Arguably, they already are, in many significant ways.
https://en.wikipedia.org/wiki/Theory_of_mind
Not to pile on, but isn't this actually a distinct example of _lack_ of awareness? As in, our brains have sparks of creativity without understanding the inception of those sparks?
Perhaps I'm conflating some definition of "aware" with another definition of "awareness"?
I hear there's work being done on getting the world models out, distilling the 'cortex-core' (aka the thinking without data), to perhaps see if they're capable of more, but so far we're looking at holograms of wishful thinking that increase in resolution, but still lack any essence.
This begs a question - can true intelligence even be artificial?
we only need to fake it to the point it's indistinguishable from the carbon-based one.
faking is all you need.
1. "Too" broad, which raises uncomfortable questions about non-human intelligence and how we as humans treat it (see: whales, elephants, octopuses/cephalopods)
2. Too narrow, which again raises very uncomfortable issues about who does and does not qualify as human, and what we do with them.
Put in other words, it's more an issue of ethics and morals than it is definitional.
I guess by 'AGI' most people mean human level or above, so you'd want human-level awareness, which Teslas and the like don't have yet.
> This makes it to my mind impossible to train a model to be aware
feels wrong. If you're arguing that humans are aware, then it is apparent that it is possible to train something to be aware. Nature doesn't have any formal definition of intelligence or awareness, yet here we are.
From a practical perspective, it might be implausibly difficult to recreate that on computers, but theoretically, no reason why not.
My definition of a complete AGI: an AI that can read JIRA tickets, talk with non-programmers, do my entire job, get me and all/most software engineers fired, and prove sustainable.
But in general, it's an AI that can do any remote work just as well as humans.
This paper promises to fix "the lack of a concrete definition for Artificial General Intelligence", yet it still relies on the vague notion of a "well-educated adult". That’s especially peculiar, since in many fields AI is already beyond the level of an adult.
You might say this is about "jaggedness", because AI clearly lacks quite a few skills:
> Application of this framework reveals a highly “jagged” cognitive profile in contemporary models.
But all intelligence, of any sort, is "jagged" when measured against a different set of problems or environments.
So, if that’s the case, this isn’t really a framework for AGI; it’s a framework for measuring AI along a particular set of dimensions. A more honest title might be: "A Framework for Measuring the Jaggedness of AI Against the Cattell–Horn–Carroll Theory". It wouldn't be nearly as sexy, though.
I think people get really uncomfortable even trying to tackle that, and realistically, for a huge set of AI tasks we need AI that is more intelligent than a huge subset of humans for it to be useful. But there are also a lot of tasks where that is not needed, and we "just" need "more human failure modes".
If you compare the performance of the average human to a state-of-the-art AI model trained by top experts with a big budget, you can't make any conclusions about intelligence. For the comparison to make sense, the human should also be trained as well as reasonably possible.
I do agree that it’s a weird standard though. Many of our AI implementations exceed the level of knowledge of a well-educated adult (and still underperform with that advantage in many contexts).
Personally, I don’t think defining AGI is particularly useful. It is just a marketing term. Rather, it’s more useful to just speak about features/capabilities. Shorthand for a specific set of capabilities will arise naturally.
On the other hand, research on "common intelligence" AFAIK shows that most measures of different types of intelligence have a very high correlation and some (apologies, I don't know the literature) have posited that we should think about some "general common intelligence" to understand this.
The surprising thing about AI so far is how much more jagged it is with respect to human intelligence.
If you go beyond human species (and well, computers are not even living organisms), it gets tricky. Adaptability (which is arguably a broader concept than intelligence) is very different for, say octopodes, corvids and slime molds.
It is certainly not a single line of proficiency or progress. Things look like lines only if we zoom a lot.
Current AI is in its infancy and we're just throwing data at it in the same way evolution throws random change at our DNA and sees what sticks.
This paper, for example, uses the 'dual N-back test' as part of its evaluation. In humans this relates to variation in our ability to use working memory, which in humans relates to 'g'; but it seems pretty meaningless when applied to transformers -- because the task itself has nothing intrinsically to do with intelligence, and of course 'dual N-back' should be easy for transformers -- they should have complete recall over their large context window.
Human intelligence tests are designed to measure variation in human intelligence -- it's silly to take those same isolated benchmarks and pretend they mean the same thing when applied to machines. Obviously a machine doing well on an IQ test doesn't mean that it will be able to do what a high IQ person could do in the messy real world; it's a benchmark, and it's only a meaningful benchmark because in humans IQ measures are designed to correlate with long-term outcomes and abilities.
That is, in humans, performance on these isolated benchmarks is correlated with our ability to exist in the messy real-world, but for AI, that correlation doesn't exist -- because the tests weren't designed to measure 'intelligence' per se, but human intelligence in the context of human lives.
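A small sketch (plain Python; the stimulus encoding is made up) of why a dual N-back item is trivial for anything with perfect recall of its input: with the whole stream in view, the "working memory" the test probes in humans reduces to an index lookup.

    import random

    def dual_n_back_stream(length=20, n_positions=9, letters="ABCDEFGH"):
        """Generate (position, letter) stimulus pairs, as in the dual N-back task."""
        return [(random.randrange(n_positions), random.choice(letters))
                for _ in range(length)]

    def perfect_recall_answers(stream, n=2):
        """Answer 'same position / same letter as n steps ago?' by direct lookup."""
        answers = []
        for i, (pos, letter) in enumerate(stream):
            if i < n:
                answers.append((False, False))
            else:
                prev_pos, prev_letter = stream[i - n]
                answers.append((pos == prev_pos, letter == prev_letter))
        return answers

    print(perfect_recall_answers(dual_n_back_stream())[:5])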
It makes me think of every single public discussion that's ever been had about quantum, where you can't start the conversation unless you go through a quick 101 on what a qubit is.
As with any technology, there's not really a destination. There is only the process of improvement. The only real definitive point is when a technology becomes obsolete, though it is still kept alive through a celebration of its nostalgia.
AI will continue to improve. More workflows will become automated. And from our perception, no matter what the rapidness of advancement is, we're still frogs in water.
It's a very emotional topic because people feel their self image threatened. It's a topic related to what is the meaning of being human. Yeah sure it should be a separate question, but emotionally it is connected to it in a deep level. The prospect of job replacement and social transformation is quite a threatening one.
So I'm somewhat understanding of this. It's not merely an academic topic, because these things will be adopted in the real world among real people. So you can't simply make everyone shut up who is an outsider or just heard about this stuff incidentally in the news and has superficial points to make.
Surely you can appreciate that if the next stop on the journey of technology can take over the process of improvement itself that would make it an awfully notable stop? Maybe not "destination", but maybe worth the "endless conversation"?
Also, weird to see Gary Marcus and Yoshua Bengio on the same paper. Who really wrote this? Author lists are so performative now.
Just like an airplane doesn't work exactly like a bird, but both can fly.
[1] https://andreinfante.substack.com/p/when-will-ai-transform-t...
Assume the Riemann hypothesis is false. Then, consider the proposition "{a|a∉a}∈{a|a∉a}". By the law of the excluded middle, it suffices to consider each case separately. Assuming {a|a∉a}∈{a|a∉a}, we find {a|a∉a}∉{a|a∉a}, for a contradiction. Instead, assuming {a|a∉a}∉{a|a∉a}, we find {a|a∉a}∈{a|a∉a}, for a contradiction. Therefore, "the Riemann hypothesis is false" is false. By the law of the excluded middle, we have shown the Riemann hypothesis is true.
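(For anyone puzzling over the parody above: it leans on the fact that naive set comprehension is inconsistent, and an inconsistent theory proves every proposition, the Riemann hypothesis included. A rough LaTeX sketch of the schema, with generic symbols:)

    % Russell's set under naive comprehension:
    %   R := {a | a \notin a}   gives   R \in R \iff R \notin R,
    % i.e. a contradiction \varphi \wedge \neg\varphi.
    % Principle of explosion: from a contradiction, any P follows,
    % so P may as well be the Riemann hypothesis (or its negation).
    \[
      \frac{\varphi \qquad \neg\varphi}{P}\;(\text{ex falso quodlibet})
    \]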
Naïve AGI is an apt analogy in this regard, but I feel these systems are neither simple nor elegant enough to deserve the name naïve.