This is not AGI by any stretch of the imagination. It does not even appear to be a step on the path to AGI.
Furthermore, due to the autoregressive nature of GPT models, the more Auto-GPT generates (the more it works, the more tasks it performs...), the more the chance of things going off the right path grows, roughly exponentially in the number of steps, and once it does it is 'doomed' for the rest of the run [1].
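To make the "exponential" part concrete (my illustrative numbers, not LeCun's): if each generated step independently stays on track with probability 1 - e, the whole chain is still on track after n steps with probability (1 - e)^n, which collapses quickly even for small per-step error rates.

```python
# Probability an n-step autoregressive chain is still "on track", assuming each
# step independently goes wrong with probability e (a deliberate simplification).
def p_on_track(e: float, n: int) -> float:
    return (1 - e) ** n

for n in (10, 50, 100):
    print(n, round(p_on_track(0.02, n), 3))
# 10  0.817
# 50  0.364
# 100 0.133   <- with only a 2% per-step error rate
```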
Thus, the chance of this actually being useful for anything beyond what a simple prompt can already do with a tool like ChatGPT is very low.
The end result is an impressive concept but a practically unusable tool. And the problem, in general, is that as Auto-GPT improves (which it will, at an impressive pace), so will our ambition in using it, which will lead to constant disappointment: how we feel about it today is roughly how we will feel about it in the future. Always needing "just a bit more", but never really there.
We already have a "baby AGI" that has been deployed in a production environment for a few years: it is called Tesla self-driving. It was supposed to get us from point A to point B completely autonomously. And for 6 years now it has been perpetually "almost there", but never really there (and arguably never will be).
What this does do, though, is create and inflate a giant sense of FOMO, and the best way of dealing with FOMO (long term) is to stay on firm ground, observe, and wait for clarity and the right action.
[1] Watch in particular Yann LeCun's presentation at https://www.youtube.com/watch?v=x10964w00zk
I kind of agree but I really don't see Tesla self-driving as even aspirational to AGI. It seems like the poster child for a domain specific AI that nobody has an interest in making general.
Has anyone got this to actually do anything at all yet?
I see all of the half demos where it doesn't complete anything, I've tried it myself and.. well, if we're being honest it was shite. I've seen a whole load of tweet threads saying what it could be used for..
Literally just looking for one example of a successful run. Anything at all.
I can definitely see that there may be potential (if not this then the ideas that come off the back of this) but even I don't have a real use case for it yet, I'm just tinkering.
I guess my XY question: Am I being suckered into the web3 of AI? Lots of buzz, no use case.
There is some magical thinking at play: that once you have what appears to be an intelligence of some kind, simply allowing it to recursively self-critique will lead to the singularity. The other possibility is that the signal just degrades, like a deep-fried JPEG.
My testing would definitely lean towards deep fried jpgs aye. Every time it seemed to hit a bump somewhere then got caught up in a loop of googling whatever problem it was having and not finding an answer it was looking for, then going on to look into that problem, etc, etc. I guess you could put a step in to allow the human to nudge the AI back on track but then it's not really autonomous.
Also it's really slow at doing anything, even before it goes off into the weeds. I understand it's new and won't be polished, but man. Really slow.
At least it's a decent motivator! "Oooh this is a task AutoGPT/BabyAGI could do... ehh it'd be quicker to just do it myself"
I'll keep an open mind, still eager to see a successful attempt at something. That'd at least open up a chance I'm doing it wrong rather than it not doing anything useful.
This has been reinforcement learning for decades. People say "OMG! The agent is acting and learning and getting better all on its own!" And they're right, in the beginning. Eventually the agent plateaus or collapses.
Supervised learning is much more stable. GPT is supervised learning. Once you start letting the agent choose or modify its own training data, you're moving towards the much less stable realm of reinforcement learning.
I'm asking for actual uses, not theoretical. I can, and have, come up with theoretical uses myself but when I test them nothing has resulted in success.
I've been using AutoGPT, which is the other ChatGPT automation tool, and frankly I'm a bit disappointed. It spends most of its time figuring out why system commands are failing and why javascript is blocked, and/or why selenium can't start.
Yeah, I tried it with trivial goals like "make a 5 second .wav file with a 440 Hz tone", and after running for an hour and burning tons of OpenAI requests, it had not even figured out how to pip install a module to create audio files.
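For contrast, the task itself needs nothing beyond the Python standard library, no pip install at all; a rough sketch (the file name and sample format are my own choices):

```python
import math
import struct
import wave

SAMPLE_RATE = 44100   # samples per second
FREQ = 440.0          # Hz
DURATION = 5          # seconds

# 16-bit signed mono samples of a sine wave, packed little-endian.
frames = b"".join(
    struct.pack("<h", int(32767 * math.sin(2 * math.pi * FREQ * i / SAMPLE_RATE)))
    for i in range(SAMPLE_RATE * DURATION)
)

with wave.open("tone.wav", "wb") as f:
    f.setnchannels(1)            # mono
    f.setsampwidth(2)            # 2 bytes = 16-bit samples
    f.setframerate(SAMPLE_RATE)
    f.writeframes(frames)
```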
It mostly seems like a massively stoned teenager trapped in thought loops about how to use thought loops to achieve something, without ever hitting on the idea of actually doing something.
@dang might be worth linking to Yohei's original repo, which is definitely HN front page worthy: https://github.com/yoheinakajima/babyagi. This repo is not just a strict fork, though; Oliveira seems to be working on making BabyAGI more autonomous.
Why does every one of these examples use Pinecone when that costs money and Faiss [0] is free? If you're going to let something run a bunch of fee-per-use API calls on its own, why double up on getting charged?
[0] https://github.com/facebookresearch/faiss
Unless you have several hundred million documents, just write a simple encoder that serializes the embedding vectors to a flat binary file.
Writing code from scratch to process and search 200k unstructured documents -- parsing, cleaning, chunking, the OpenAI embedding API, serialization code, linear search with cosine similarity, and the actual time to debug, test and run all this -- took me less than 3 hours in Go. The flat binary representation of all vectors is under 500 MB. I even went ahead and made it mmap-friendly for the fun of it, even though I could read it all into memory.
Even the dumb linear search I wrote takes just 20-30 ms per query on my MacBook for the 200k documents. The search results are fantastic.
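To make that approach concrete, here is a rough Python sketch of the same idea (not the Go code described above; the embedding dimension and file layout are assumptions):

```python
import numpy as np

DIM = 1536  # e.g. OpenAI text-embedding-ada-002; adjust to whatever model you use

def save(vectors: np.ndarray, path: str) -> None:
    """Serialize float32 embeddings to a flat binary file (row-major, no header)."""
    vectors.astype("float32").tofile(path)

def load(path: str) -> np.ndarray:
    """Memory-map the flat file back as an (n, DIM) matrix."""
    return np.memmap(path, dtype="float32", mode="r").reshape(-1, DIM)

def search(query: np.ndarray, vectors: np.ndarray, k: int = 10) -> np.ndarray:
    """Brute-force cosine similarity; plenty fast for a few hundred thousand vectors."""
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return np.argsort(-(v @ q))[:k]   # indices of the k most similar documents
```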
Using LangChain I've been introduced to Chroma, which uses a local store, and it's perfect. Check out their question-and-answer demo; the VectorstoreIndexCreator seems to use Chroma and DuckDB under the hood. https://python.langchain.com/en/latest/use_cases/question_an...
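For reference, that demo boiled down to roughly this at the time (LangChain's API changes quickly, so treat the exact imports and the example file name as approximate):

```python
from langchain.document_loaders import TextLoader
from langchain.indexes import VectorstoreIndexCreator

# VectorstoreIndexCreator builds a local Chroma vector store behind the scenes.
loader = TextLoader("state_of_the_union.txt")
index = VectorstoreIndexCreator().from_loaders([loader])

print(index.query("What did the president say about Ketanji Brown Jackson?"))
```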
So I'm a tech noob, but I recently finished the Lex Fridman podcast with Max Tegmark. This is a serious person with strong credentials ringing the warning bell about AI. Yet a lot of people in my tech circle seem to swing between being unconcerned and unimpressed by AI.
Where exactly are we with AI as a legitimate threat if we continue down our current path? Are people like Max just jockeying for attention? Or is there merit to their concerns?
You could apply the same logic to anyone ringing the alarm bells about climate change (or alarm bells in general). Just because snake oil is usually unfalsifiable doesn't mean everything _currently_ unfalsified is snake oil.
Huh? In 1 he dies. In 2 and 3 he becomes irrelevant eventually and can only claim credit to the extent his predictions were true, while fighting off all the actual snake oil salesmen claiming that all of this was obvious.
Well, just like in every other case where acting to prevent something keeps that something from happening: accident prevention, quitting smoking to avoid cancer, vaccines against a pandemic, etc.
Asking this openly, even on HN, is going to get a lot of unqualified answers. So I'll preface this by saying that I have a recent PhD in deep RL and am pretty well versed in the cutting-edge developments of ML.
I think your question has two angles. First, do LLMs have potential for AGI? I think that's an emphatic no. There is nothing special about this generative model vs., say, something like sampling from a mixture of Gaussians. Much better generations, and super impressive that it works, sure, but there is no mechanism for it to improve itself, let alone change its "prime directive". See Sam Altman's claim that RL with human feedback is where most of the gains are now.
At a higher level, there is the concern that, at the pace we are going, we can extrapolate that AGI is around the corner. My take on this is that basically everything made possible in the last 20 years is because of GPUs and better tooling. Specifically, the recent hype we see is because of how democratized things are getting. We have kids using ChatGPT to do homework. This is very disruptive from a societal perspective, but we are essentially at the peak of the rate at which this technology is being adopted. The growth rate will decelerate, and it will stop being news once society has learned to adapt to the new technology. However, from a technical perspective, these concerns are like looking at children playing with Legos and worrying they will build the next nuclear bomb. Hypothetically feasible, but there are so many fundamental gaps that clearly separate the two that the real concern would be when we see Manhattan Projects succeeding. From an outsider's perspective, DeepMind and AlphaZero or OpenAI and LLMs seem like Manhattan Projects, but the fact that these companies have spent billions with no returns yet should say a lot about the utility of these models.
I fall entirely in the unqualified category, but I will add that there are people equally or more credentialed than you who disagree. I don't say this to be snotty, but just to add context for other people reading that there is a wide spectrum of beliefs among people qualified to make such statements.
> My take on this is that basically everything made possible in the last 20 years is because of GPUs and better tooling.
I feel like part of the optimism from people like Sam Altman is that, while it's true the recent advances were made possible by the computing power available, the tools we will now be able to create to develop AI systems better and faster are what will enable us to maintain the acceleration we're seeing right now. As a simple example, we've massively reduced the time and cost of training in just a few years.
Am I wrong in thinking that if GPT-4 gets trained on all your organization-wide data, it will become much better at, say, coding for your organization-specific apps?
Like right now, you ask it to code and it gives you a generic, contextless function. But if it were trained on your company's data, it should be able to give you a function that's relevant to your tech stack, existing codebase, organizational best practices, etc.?
Exact nearest-neighbor search over n vectors is O(n) per query (O(n^2) if you compare all pairs), but algorithms for approximate results run in sublinear time.
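As a rough illustration with Faiss (assuming the faiss-cpu package, and random vectors standing in for real embeddings):

```python
import numpy as np
import faiss  # pip install faiss-cpu

d = 1536  # embedding dimension, e.g. OpenAI ada-002
xb = np.random.rand(200_000, d).astype("float32")  # stand-in for document vectors
xq = np.random.rand(5, d).astype("float32")        # stand-in for query vectors

# Exact search: every query is compared against every stored vector.
flat = faiss.IndexFlatL2(d)
flat.add(xb)
D, I = flat.search(xq, 10)

# Approximate search: an HNSW graph trades a little recall (and index build time)
# for much faster queries as the collection grows.
hnsw = faiss.IndexHNSWFlat(d, 32)  # 32 = graph neighbors per node
hnsw.add(xb)
D2, I2 = hnsw.search(xq, 10)
```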
https://youtu.be/WosPGHPObx8
> Where exactly are we with AI as a legitimate threat if we continue down our current path? Are people like Max just jockeying for attention? Or is there merit to their concerns?
1. AI kills us all; Max: "See, I told you! ayeee!"
2. AI integrates smoothly into Society. Max: "Sure was good I got you worried about the risks, so we avoided them!"
3. AI keeps improving, little by little, causing hiccups along the way but nothing disastrous; Max: "Just wait, there will be doooom!"
He can't lose with this type of argument. Heads he wins, tails he wins.
Can we scale up current architectures to beyond human intelligence without further advances in theory/methods? If I had to guess I’d say no.
Is another breakthrough of the order of transformers likely in the next decade? Who knows…