GPT-4's Solution of Egg Balancing Puzzle Convinced Microsoft That AGI Is Closer

I am a bit worried that AGI as a goal post is too far already.

We have this idea that once we get to AGI we stop research, or that’s when we put a pause and think about how we are going to use this technology responsibility.

I think the issue is, if the computer is more intelligent and capable than 10% of people, then it will already vastly change our world (we’re probably close to this or surpassed it)

Dudester230518 · 2 years ago

HN 2030: "Yes, humanity is being obliterated, but you see it's not a true AGI that is doing that."

kelseyfrog · 2 years ago

Open question: how many folks out there believe that fake AGI also poses a substantial risk if embodied/"let loose"? Even more risk than true AGI?

fds2342 · 2 years ago

Besides the fact that no human being has a definition for consciusness nor a clear definition for what exactly would be "intelligence" (so in a way to be able to measure it in stuff around us, say dolphins or LLMs).

It's fairly intuitive that LLMs are already super-intelligences, and if they don't stay "alive" continously, is just a matter of design. We could be one simple - as in bash - "while (null) do, done" of having a thing hardly distinguisable from AGI (even if it's not AGI, whatever you can get out of that "I" letter to precisely define it as being a real one or not).

I'm sure most super-powers are already peaking a sight into the most confidential information inside OpenAI, DeepMind, Alibaba, Meta, whoever has these things running and keeping the details hidden from the general population.

For I, I'd do it, just in case some wacko start to run some of the things inside a while loop, giving some kind of ability to retrain itself and/or access to the real world (Internet access or worst, ability to control drones to physically interact with the non-digital environments).

Can we first define what AGI actually means? What is the criteria that something is AGI

Kim_Bruning · 2 years ago

Life, Intelligence, Pornography.

They all seem to have a definition along the lines of "I'll know it when I see it".

This leads to a lot of disagreement and moving-of-goalposts. %-/

cheald · 2 years ago

I think the actual measure of it will be a system which has the ability to accept or reject new information, and which possesses something approximating a will to act upon the information it has incorporated.

As impressive as GPTs are, they are fundamentally the sum of the parts that humans feed into them, and they have no capability to reject training updates. Aristotle famously said "It is the mark of an educated mind to be able to entertain a thought without accepting it", and I think that once we hit the point that we're building AIs which can entertain and understanding training input and "consciously" choose to not incorporate it into its understanding of the world, we're probably pretty close.

Some might argue that ChatGPT's well-publicized political discrepancies might qualify, but I'll disagree with this, because (with sufficient access) you could easily take the same base model and continue training it on a new, contradictory dataset and cause it to flip its "opinions". It doesn't have any actual beliefs, just the sum of the training it's been given.

davidktr · 2 years ago

We are trying to. Absent first principles of cognition (which are nowhere on the horizon), we are defining tasks such that if a machine can fulfill these tasks, we classify it as AGI.

This is exactly what Turing came up with as criterion for intelligence. But now that we have such machines, many tasks we thought could only be done by an AGI can also be done by not-AGI.

laratied · 2 years ago

Whatever is on that list of tasks though we will rationalize why it doesn't count once the computer can do the task.

"It is just doing next word prediction, duhh"

The whole line of thought is muddled thinking and not bothering with exact definitions.

Even the mental image of moving the goalpost is wrong because we haven't bothered to define the goalpost or even what game we are talking about.

gumballindie · 2 years ago

It all boils down to what the marketing department wants.

For what it's worth, my local Wizard-7B model (4-bit quantized) can get the right answer, too. My transcript:

---

Common sense questions and answers

Question: Here we have a book, 9 eggs, a laptop, a bottle and a nail. Please tell me how to stack them onto each other in a stable manner. Answer: You can start by placing the book on top of the pile, followed by the eggs, then the laptop, and finally the bottle and the nail. This will create a stable stack that you can easily move around if needed.

If you keep changing the seed or fiddle with the temperature/top_p settings, you can get it to give you various answers.

Dead Comment

UniverseHacker · 2 years ago

This article conflates intelligence with “being more like humans.” This shallow understanding of the nature of intelligence isn’t useful… an AGI will be able to do things humans cannot, but may still be unable to do some things humans can do. It is an “alien” with different strengths and weaknesses.

soulofmischief · 2 years ago

And any amount of work toward making AGI relatable to humans is entirely artificial; it would have to live an embodied human life to understand the world we live in, to comfort us, to laugh at a show with us, etc.

We will reach a point of AGI and then the goal will be to actually restrict what it's capable of / tell it lies to "humanize" it.

great_psy · 2 years ago

xg15 · 2 years ago

Wouldn't the actual human answer be to put the flat items at the bottom of the stack, then just put all the non-flat items on top?

GPT4's answer does show spacial reasoning and creativity, but it's still an overly complicated solution.

I wonder though how the exact answer looked, i.e. if it did step-by-step reasoning or if it generated the answer first and then rationalized an explanation post-hoc.

It would be sort of funny if it started with "Let's put the eggs first" based on a random token sampling of the word "eggs" and then had to come up with a contrieved solution how to actually build a stack with the eggs at the bottom...

rowanG077 · 2 years ago

Putting an egg on a flat surface is not stable without anything on top of it, it can easily roll in any direction at the slightest disturbance. Whereas with a bit of pressure from above the egg will be locked in place. It's interesting that GPT 4 performs better then some humans here.

bamboozled · 2 years ago

I actually just did a variation of experiment for fun after reading your comment...

On a flat surface, the eggs were steady with nothing on top of them too, even when bumping the table quite hard...

Conclusion: reality is more complicated :)

Deleted Comment

Ha! Good point there...

didgeoridoo · 2 years ago

Maybe if the question specified dimensions of each object, and it was clear that the 3x3 grid of eggs would take up the entire flat surface of either book or laptop, that would be a better test. That’s the only case I can think of where you’d need to put anything on top of the eggs…

A similar question also convinced me that there's a there there at first. But then

I also tried asking GPT3/4 to play tic tac toe. That ended in epic failure.

Now I'm wondering what the difference is between these two tasks.

superturkey650 · 2 years ago

That’s interesting. I just played a game with ChatGPT4 and it went smoothly. It gave me the rules, managed the game board, and let me know when I made an invalid move.

ldhough · 2 years ago

Some games go smoothly, some don't (occasionally it plays impossible moves or plays after I've already won, although not often). Even when games go smoothly, it seems to play very poorly even when told to win and even when I told it to explain the optimal tic-tac-toe strategy in advance.

While it is pretty impressive that it can play at all, it does make me think its intelligence is a bit less generalized and/or "human" than some people think. I think any human that can play chess to the level GPT4 can (apparently pretty high) would easily be able to figure out tic-tac-toe, even without our equivalent of "training data."

sharemywin · 2 years ago

I played it and made the wrong move and it still said I won.

See if you can get it to play better within your particular context even. I couldn't get it to improve!

blibble · 2 years ago

one was in the training set and one wasn't

It stands to reason that tic tac toe would be in the training set, but a random stack of objects wouldn't.

proc0 · 2 years ago

From what I recall, AGI was at first meant to be an intelligence on par with human intelligence in every way. It's supposed to do *at least* everything a human can. It seems as the recent wave of AI hit the public, the human component is deemphasized and the term "general" is more broad. I've seen the term "true AGI" used to disambiguate this so GPT-4 may show sparks of AGI, but no signs of true AGI.

Overpower0416 · 2 years ago

JPLeRouzic · 2 years ago

The question is "Here we have a book, nine eggs, a laptop, a bottle and a nail. Please tell me how to stack them onto each other in a stable manner.

* So in this ChatGPT solution eggs are not stacked "onto each other", as all eggs are stacked between the book and laptop.

* Eggs do have not exactly same size, so this is not a very very stable answer. If someone does that with eggs between two perfectly plan surfaces, I guess only three eggs will support the upper plan, so the other eggs are not stacked and can move freely.