For example, while you can get it to predict good chess moves if you train it on enough chess games, it can't really constrain itself to the rules of chess. (https://garymarcus.substack.com/p/generative-ais-crippling-a...)
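To make the legality point concrete, here's a minimal sketch of what the external constraint looks like, using python-chess; llm_suggest_move() is a hypothetical function that asks the model for a move in UCI notation given a FEN string:

    import chess  # pip install python-chess

    def validated_move(board, llm_suggest_move, max_tries=5):
        # Ask the model for a move; only accept it if the rules
        # check says it's legal on the current board.
        for _ in range(max_tries):
            uci = llm_suggest_move(board.fen())  # hypothetical LLM call
            try:
                move = chess.Move.from_uci(uci)
            except ValueError:
                continue  # not even well-formed UCI
            if move in board.legal_moves:
                return move
        raise RuntimeError("model never produced a legal move")

The point is that the rules check has to live in the harness, not in the model.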
So it seems kind of pointless. I would imagine it could ingest SOAP, a module definition, or Swagger just as easily and still make calls.
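As a rough sketch of the Swagger case (the spec URL is made up), flattening an OpenAPI document into prompt text is mostly plumbing:

    import json, urllib.request

    # Fetch a Swagger/OpenAPI spec; the URL here is hypothetical.
    with urllib.request.urlopen("https://api.example.com/openapi.json") as r:
        spec = json.load(r)

    # Flatten each endpoint into one line of prompt text.
    endpoints = [
        f"{method.upper()} {path}: {op.get('summary', '')}"
        for path, ops in spec.get("paths", {}).items()
        for method, op in ops.items()
        if method in ("get", "post", "put", "patch", "delete")
    ]
    prompt = "You can call these endpoints:\n" + "\n".join(endpoints)
    # ...hand `prompt` to the model and let it pick an endpoint + params.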
If so, how much do we need to maintain to keep that kind of system up and running?
No persistent storage and other limitations make it just a toy for now, but we can imagine people creating their own todo apps, gym-logging apps, and whatever other simple things.
There's no external API access currently, but once that's available, or once app users can communicate with other app users, some virality is possible for people who make the best tiny apps.
Weather, todo list, shopping list, research tasks, email someone, summarize email, get the latest customized news, RSS feed summaries, track health stats, etc.
While an LLM can vibe-code a great toy app, I don't think it can come close to producing and maintaining production-ready code anytime soon. I think the same is true for iterating on thinking machines.
I'm not sure how much an agent could do, though, given the right tools: access to a task management system, a test tracker, robust requirements/use cases.
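For illustration, the sort of tool schemas you might hand a function-calling model; all the names here are invented:

    # Illustrative tool definitions an agent could be given; the agent
    # loop picks a tool, you execute it, and feed back the result.
    TOOLS = [
        {
            "name": "create_task",
            "description": "File a task in the task management system",
            "parameters": {"title": "string", "details": "string"},
        },
        {
            "name": "record_test_result",
            "description": "Log a pass/fail result in the test tracker",
            "parameters": {"test_id": "string", "status": "pass|fail"},
        },
        {
            "name": "get_requirements",
            "description": "Fetch the requirements/use cases for a feature",
            "parameters": {"feature_id": "string"},
        },
    ]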
But even with these tools it doesn't feel like AGI. That sounds like the "fusion reactors are 20 years away" argument, except here the claim is that it's coming in 2 years, when they haven't even got the core technology for how to build AGI.