And herein lies the issue with ChatGPT: it can generate functioning code, but it can also lie through its nonexistent teeth about it. Using ChatGPT (or Copilot) can feel like pair-programming with a very talented developer who loves to bullshit.
In this case I think I'd give ChatGPT the benefit of the doubt. It is possible to invent something that already exists, and it has happened on several occasions throughout history. A great example is the question of who really invented the telephone first. In the end Alexander Graham Bell got the patent, but perhaps Elisha Gray was actually first? Historians remain divided on the topic.
For instance, I once found what I thought was an ingeniously original idea about how TV is really just a kind of reflection of reality akin to Plato's Cave. I immediately got started writing a thesis about it, but I didn't have to search the topic for long before I found an entire book written on this way of thinking about television. I wasn't really disappointed, because in the back of my head I knew it was too good to be true that I'd be first with such a great idea. In any case I kept working on the thesis, and I still got a good grade on it despite the idea not being revolutionary.
The question I now wonder about is: can ChatGPT forget? Or could it be that ChatGPT was never exposed to this game, but could still infer it from other game rules, such as those for Sudoku? Which I guess opens up another rabbit hole on whether and how AI can be creative. Which I guess opens up another rabbit hole on how creativity works in general.
> can also lie through its nonexistent teeth about it
Ironically, it seems to me that you are anthropomorphizing ChatGPT a bit too much here. It has no reason to lie, so I think it's more likely that it just doesn't know such a game exists. It probably came up with it independently or doesn't have a strong memory of it. In some respects, it would be even more impressive if it were actually "lying through its teeth", because that would imply the AI had some kind of hidden agenda.
I am confused as to how this would be "the issue" with ChatGPT. Being wrong and not being aware of it is not a unique concept. At least with ChatGPT it is fair to assume there is no hidden agenda and no need to worry about ill will. If anything that makes it less of an issue, compared to humans.
In my experience, using Copilot for generating code is usually a lot less weird because it has more context; instead of using made-up function names and APIs, it can see what's been defined in other files.
But I primarily find Copilot helpful for instances when I need a bunch of almost identical code with tiny changes (which could mean I'm coding wrong).
"Very talented developer"? Sorry, I don't think googling my prompt and replying with the top stackoverflow answer (or a mashup of the top answers) counts as a talented developer.
Anecdotal, but I've not yet had any success in producing any non-trivial code with ChatGPT. It has, however, produced copious amounts of bullshit with plausible variable names... :)
It is a dilettante; it has not reached the level of "talented" in anything. It knows many things about many things and nothing in depth. Test it on your specialisation and you will see it make absurd mistakes and hallucinate. Try it on a domain you know less about and it looks perfect.
A while ago another poster thought ChatGPT invented good jokes.[1] All of them were ripoffs, which took less effort to verify than it takes to make a new post.
I get people are excited about a chatbot which doesn’t suck, but ideally it wouldn’t turn off critical thinking skills.
Seems to be similar to a game called Kakuro. This [1] repo even contains a similar rule:
> The algorithm exceed the rules that the sum over a row must equal to the value on the left and the sum over a column must be equal to the value on the bottom of the cells with the diagonal and one or two numbers
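That quoted constraint is simple to state in code. A minimal sketch in C (the function and array names are mine, not the repo's):

    #include <stdbool.h>

    /* Do the kept values in one row add up to the clue on the left?
       Columns are checked the same way against the clue at the bottom. */
    bool row_ok(const int row[], const bool kept[], int n, int clue) {
        int sum = 0;
        for (int i = 0; i < n; i++)
            if (kept[i]) sum += row[i];
        return sum == clue;
    }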
But where would GPT have sourced information about how the game works? That page only has screenshots; I suppose maybe there's a subreddit or something for it as well. Even if there's a bunch of info on it, it's still incredibly impressive for it to parse those game rules and turn them into workable code.
It would be nice if GPT could dump the sources of how it came to such a solution: whether it generated the game by random chance, combining various unrelated chunks of text and mixing up the rules, or whether it used some text describing the game you linked.
This is an interesting note at the end: it's clear that the whole "conversation" hasn't been posted, and it's not clear how many prompts were needed to finish it. A proficient developer would be able to develop this same app to the same level (once they have the idea) in a couple of hours without ChatGPT too.
It would be super interesting to see a full screen recording of the process (or a similar one). What was the total word count of all the prompts, and how does that compare to the size of the code?
Don't get me wrong, this is incredible and inspiring (I'm regularly using ChatGPT and Copilot). But I think the post doesn't critically analyse the process well enough.
Yes, I think it's kinda rude (in a way) to cut out 99% of the ChatGPT conversation, because it gives people the impression that ChatGPT is even more magical than it actually is
In my experience, most time is spent in the generate-copy-paste-compile-error cycle. Especially for strict, strongly-typed languages, checking for programming errors can be done by a machine, so this process can be automated (not for logical errors, of course).
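A minimal sketch of what automating that loop could look like, with a hypothetical ask_model_to_fix() standing in for a real LLM API call:

    #include <stdio.h>
    #include <stdlib.h>

    /* Hypothetical placeholder: a real harness would send the error log
       back to the LLM and overwrite the source file with its fix. */
    static void ask_model_to_fix(const char *errlog, const char *src) {
        printf("re-prompting model with %s to repair %s\n", errlog, src);
    }

    int main(void) {
        for (int attempt = 0; attempt < 5; attempt++) {
            /* 2>errors.txt captures the compiler's diagnostics */
            if (system("cc -Wall -Werror -o game generated.c 2>errors.txt") == 0) {
                puts("compiled cleanly");
                return 0;
            }
            ask_model_to_fix("errors.txt", "generated.c");
        }
        fprintf(stderr, "still broken after 5 attempts\n");
        return 1;
    }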
I don't understand why people don't think this is noteworthy.
Would it be noteworthy if this game (whether or not it existed before) were designed and implemented in a few hours by someone with very limited software development experience? Well, the fact that ChatGPT did this means the former is now a very real possibility.
That you can go from conceptual inception of a framework to a fairly complete product in a few hours with very little experience is a big deal.
> That it also coded it up itself is pretty amazing
I haven't tried to replicate the author's journey, but in my experience it requires a considerable amount of hand-holding. e.g. Functions will be stubbed out, but contain no logic.
I was able to guide it through a "playable" Flappy Bird, but it took several revisions where I pointed out what was wrong or needed to be done before it truly returned a functional, error-free prototype.
It felt like pair-programming with a promising and apologetic junior dev.
Because the game has existed for centuries. ChatGPT claiming ownership is nothing new, either. The current top post even links an android app that does the same.
I would have much preferred if the author spent more time giving an objective assessment of what ChatGPT had actually accomplished at each step.
The "Labyrinth Sudoku" description feels like a classic language model speciousness. It doesn't actually work: you can't fit the digits 1-9 if you can't use all the cells, and it hasn't modified sudoku rules in a way that makes paths relevant. Maybe you can come up with a way to salvage it, but ChatGPT didn't.
The initial rules for Sum Delete should be read with this in mind: it sounds reasonable, but there's no reason to trust that it can make a puzzle at all, let alone a good one. Also, unsurprisingly, the provided puzzle isn't solvable (the 25 column requires use of the 9 in the first row).
Similarly, I'd love a critical analysis of the initial code. Did it guarantee solvability? An awful lot can be swept under "improving the design".
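Checking solvability is mechanical, which makes the omission notable. A hedged brute-force sketch of what such a check could look like for a small grid (the names and layout are my own, not the article's code):

    #include <stdbool.h>

    #define N 3  /* 2^(N*N) masks is tractable up to roughly 5x5 */

    /* Try every subset of cells to keep; the puzzle is solvable iff some
       subset makes every row and every column hit its target sum. */
    bool solvable(const int g[N][N], const int row_t[N], const int col_t[N]) {
        for (unsigned mask = 0; mask < (1u << (N * N)); mask++) {
            bool ok = true;
            for (int r = 0; r < N && ok; r++) {
                int sum = 0;
                for (int c = 0; c < N; c++)
                    if (mask & (1u << (r * N + c)))  /* cell kept */
                        sum += g[r][c];
                ok = (sum == row_t[r]);
            }
            for (int c = 0; c < N && ok; c++) {
                int sum = 0;
                for (int r = 0; r < N; r++)
                    if (mask & (1u << (r * N + c)))
                        sum += g[r][c];
                ok = (sum == col_t[c]);
            }
            if (ok) return true;
        }
        return false;
    }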
You guys are funny. An AI generates a game, comes up with the rules, writes the code and designs the web page for it. Your reactions:
- Bah, it's not very fun.
- It's been done before.
- It took too long to make.
Seriously. Let me repeat that. An AI generates a game. It comes up with the rules for the game. It even writes the code and designs the web page for it!
Come on! This is amazing!
The "it's been done before" one is pretty relevant. It means the model didn't actually generate the game, but likely pulled it more or less straight out of its training data. It's still very cool that you can ask it for something and it can basically mine the entire (2021) internet for it, but it's not the same as being able to create something really new.
I've noticed the same thing testing it on various coding questions. It's extremely good at problems that have solutions online. And given stackoverflow, that's a lot of problems. If you manage to hit it with something that it hasn't seen before though, even if it's conceptually very straightforward, it tends to just generate a mix of boilerplate and nonsense.
Exactly. When the first news came out about its ability to "understand" code, find bugs, and improve upon it, I tested it with some snippets of mine. It just gave the boilerplate best practices you find on hundreds of blogs, but it was not able to make a meaningful contribution. It claimed to have introduced a feature when it had only found another way to write the same snippet. On other things it straight-up invented variables and functions that didn't exist.
As long as the task is in its training set, it can give you a decent answer, but it can't code; it just mimics doing so...
>The "it's been done before" one is pretty relevant.
But is it? 99.999% of software development has been done before. Even if you do something that is legitimately new (like creating a chatbot that can generate code on demand), your solution will still consist of more than 99% code that is just a repeat of things that have already been done.
That's not my experience at all. Copilot consistently creates implementations that are very specific to my app and manages to understand the context and problem surface spanning many files. It's not just taking a standard problem and pulling an answer from Stack Overflow.
Even if the rules were inspired by some text that's on the internet rather than a genuine invention (we'll never actually know; we're all just speculating), it hasn't just "pulled it out of its training data".
To be asked in plain, simple (ish) English to invent a game, produce code for it and then style it etc and the few other bits the author asked for _is_ impressive.
Why are we asking for so much? Remember the chatbots of the mid-2000s? Eliza etc? They were impressive for the time but GPT represents a _huge_ improvement in this stuff. Of course it's not perfect, but it's an exhilarating jump in capabilities.
But it hasn't! This is just another step in the BS storm coming out of the latest AI hype. The language model has reproduced something that has existed before and was likely part of its training data. That's cool, but it's far from what's being claimed here.
We really need to get better at fact checking this stuff. And with "this stuff" I mean the output of LLMs and other AI frameworks as well as the claims about it. And with "we" I mean society as a whole and our industry in particular. Let's keep the hype in the drawer. The general population can be hyped up about something, but we should know better, so instead of joining the hype, let's keep a cool head and educate people about what this is and what it isn't.
The second point in your list of reactions, "It's been done before", is very crucial.
That defeats the point of your argument that “AI generates a game. It comes up with rules for the game”.
No, it doesn't. It plagiarised the game and pretended to come up with it. It just used a random puzzle game that it had in its training set.
It’s like asking it to write a poem and getting the same exact poem from a random google search. It didn’t come up with it. It just copied it. It’s not as amazing as you say it is.
Also, if you look in the comments you can see that it's not even just one game; there are several games that are exactly like that, which means a higher probability of it having been in the training set.
> It’s like asking it to write a poem and getting the same exact poem from a random google search.
No, it's not. A better comparison would be a poem that feels the same as an existing one, with the same style but its own words. Or any musical plagiarism dispute where the song is clearly different but similar enough that it needs to be decided by a court. ChatGPT is not just copy-pasting a puzzle game here.
I worry that a lot of otherwise brilliant developers are going to get blindsided by this stuff.
The current models are impressive in strong, quantifiable ways. They are only going to become more powerful from this point.
Consider the current state of affairs: ChatGPT supports a 4K context size. Leaked foundry pricing indicates models that can handle 32K context size. 32K tokens is enough for your entire brand manual or several days worth of call center transcripts. Many products could have the most important parts of their codebase completely loaded into just the prompt.
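For rough scale, assuming the common heuristic of about 0.75 English words per token (it varies a lot by content): 32,000 tokens × 0.75 ≈ 24,000 words, on the order of 50 dense pages, versus roughly 3,000 words for a 4K window.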
I would say you should at least try the OpenAI playground (or equivalent technology) to understand what is possible right now. I had no clue where we were at until ~3 weeks ago. I wouldn't wait until 2024 on this one anymore.
Agreed. LLMs are on par with the invention of the web or the smartphone in terms of how much impact they'll have (possibly more). It's weird to see so many HNers being so dismissive of them. I've been using ChatGPT daily (mostly to ask programming-related questions) and it's like having a new superpower.
I know it's getting popular, but I really doubt 90% of software development is generated from trained models...
I don't understand what's at stake here that makes you feel people are afraid. It's fun and amazing that it can spit out stuff like this, and if you are a good developer experimenting with this stuff, you already know it's inarguably a novel and useful utility, if still limited in some ways.
But where is the fire? Why does everything have to devolve into one vague culture war or another? Shouldn't you welcome good-faith critique? If only for the fact that these things can still be improved, and how can you hope to improve them if you smother and dismiss every suggestion that these models might be less than perfect?
It is fairly typical of HN to err on the side of cynicism.
Correct me if I'm wrong, but ChatGPT is a very fancy auto-complete function. It has no ability to create from scratch, just the ability to recombine and recontextualise any of the many existing pieces it has in its library.
It's unlikely that this game or its rules are truly original; ChatGPT will have just plucked it from the library, perhaps given it a new name.
I don't see how your comment contributes to the discussion. It seems aimed at shutting it down only allowing praise.
The scope of the creation and whether it actually produced something novel is quite important to the discussion and part of the claim (although the author is very open to be proven wrong, in the article).
Your claim in the second-to-last paragraph is false. That's relevant. This is HN.
What we're effectively seeing is that when someone demonstrates a talking horse, some people will complain that the horse speaks with a horrible accent and uses impolite language. (Loose quote from I don't remember who, early 90s.)
How do you reconcile "An AI generates a game" with "It's been done before"? Obviously it didn't generate a game; it copied one, which is not "amazing".
Very cool! Although I would love to be proven wrong, I am still suspicious that this is actually a code sample from an existing game that it picked up somewhere during training. Seeing how ChatGPT struggles with basic logic, I would be surprised if it could actually generate a new and workable game of logic.
The way I understand these models (and please correct me if I'm wrong) is that they predict every single word one at a time out of billions and billions of possible paths. So for the model to actually reproduce code you would either need to bait it really hard (reducing the possible paths by pushing it into a corner) or be impossibly "lucky". It can't just copy-paste something accidentally, since it has no "awareness" of the source material it was trained on.
It is trivial to encode text in neural networks; these models try hard to avoid that by punishing it in training, but they still encode a lot of text word for word. The most famous example is "fast inverse square root": it gives you the exact same thing, same comments and all.
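For reference, this is the snippet in question, the widely circulated Quake III C function, which models have been shown to reproduce comment-for-comment:

    float Q_rsqrt( float number )
    {
        long i;
        float x2, y;
        const float threehalfs = 1.5F;

        x2 = number * 0.5F;
        y  = number;
        i  = * ( long * ) &y;                       // evil floating point bit level hacking
        i  = 0x5f3759df - ( i >> 1 );               // what the fuck?
        y  = * ( float * ) &i;
        y  = y * ( threehalfs - ( x2 * y * y ) );   // 1st iteration
    //  y  = y * ( threehalfs - ( x2 * y * y ) );   // 2nd iteration, this can be removed

        return y;
    }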
That game is NOT the same game. It's similar but the games are different.
[1]: https://news.ycombinator.com/item?id=34744921
[1]: https://github.com/MarioBonse/KakuroSolverCSP
[2]: https://github.com/topics/kakuro
Your game involves addition. ChatGPT is using subtraction.
I mean... the whole point is that you don't need to be that proficient to use ChatGPT.
If ChatGPT were much better than a proficient developer, you and I wouldn't be surfing HN; we'd be on the street protesting for universal basic income.
It was quite amazing how much actual manual work was going into some (not all) of those images in the end.
A lot of processing in Photoshop, etc. I actually started to think that, really, he was doing art, just not using a brush.
That it also coded it up itself is pretty amazing, but it's overshadowed by the false, more amazing claim that it invented it.
i) This exact puzzle predates the app
ii) A description of the rules appears in the training set
Lichess and chess.com both exist. Does the existence of one make the other worthless? Unimpressive?
The "Labyrinth Sudoku" description feels like a classic language model speciousness. It doesn't actually work: you can't fit the digits 1-9 if you can't use all the cells, and it hasn't modified sudoku rules in a way that makes paths relevant. Maybe you can come up with a way to salvage it, but ChatGPT didn't.
The initial rules for Sum Delete should be read with this in mind: it sounds reasonable, but there's no reason to trust that it can make a puzzle at all, let alone a good one. Also, unsurprisingly, the provided puzzle isn't solvable (the 25 column requires use of the 9 in the first row).
Similarly, I'd love a critical analysis of the initial code. Did it guarantee solvability? An awful lot can be swept under "improving the design".
- Bah, it's not very fun.
- It's been done before.
- It took too long to make.
Seriously. Let me repeat that. An AI generates a game. It comes up with the rules for the game. It even writes the code and designs the web page for it!
Come on! This is amazing!
Nevermind that this perfectly describes 90% of software development.
I'm actually wondering to what extent these responses are fueled by fear of being replaced by AI.
"Great artists steal".
Art is defined by remixing the life experience of the author.
ChatGPT's ability to create art is only limited by its input (a text corpus), while humans have images, sound, smell, touch, etc.
For such an obscure game I would guess it came up with the rules. It’s difficult to prove though.
On a flip note: the game is fun enough.
Even if it was, what would we search for?
I was hoping someone would find a reference on Wikipedia or something, which would clearly be part of the training set.
https://imgur.com/a/0k9aXGZ