This is such a great demo. The original used Box2D, Lua scripting, and of course you had to make enemies and levels.
There's obviously no expectation that you'd make a hit game from the tech in its current state. You're bound to be limited by the tech, rather than your own skills.
But for rapid ideas, for prototypes, for game jams, this is a game changer. I can also see it as a great alternative to Scratch for kids to play around with ideas. Hope to see more platforms try to turn this into an offering!
Would derail the thread pretty hard and I'm not sure even which one to pick. But my favorite memory was walking the streets of Seoul and getting in a little street market. There was this kid who was the son of the shop owner, playing on a cheap Android device. Super into the last update we shipped. You could tell he was gonna sit there on the floor in the corner and this would be his day. I was suddenly so self conscious about how we made levels and updates. Until then it was just the next update at the office, we had to make it good and polished and respectful of the player. But now it was this kid's world, much like Super Mario Bros had been to me. It was important. It was a really humbling moment.
I did a similar exercise recently when I needed to make a fairly basic rest API and CRUD frontend using 2 frameworks I wasn't particularly familiar with. I used GPT4 to generate ALL the code for it. I'll write a blog post about it soon, but a quick overview was:
I suspect it was slower than just writing the code/referencing the docs, and would be much slower than what someone experienced with the two frameworks could do. I had to be very specific and write a few long and detailed prompts for the more complex parts of the application. It took around 5 hours to make the application, with a lot of that time spent sitting waiting for the (sometimes painfully slow) ChatGPT output. In a framework I'm more familiar with, I think I could have easily got it done in under 2 hours.
It was definitely useful for making sure I was doing it the correct way, kind of like having an expert on call for any questions. It was also very useful for generating perfectly formatted boilerplate code (some frameworks have CRUD generation built in, but this one did not).
It was a fun experiment, and I found it useful as a learning/guiding/generation tool, but I won't be using it for general day to day development any more than I currently do. For most instances it's quicker to just learn the framework well and write the code yourself.
> It was definitely useful for making sure I was doing it the correct way, kind of like having an expert on call for any questions.
I've found it to be shockingly good at this. I end up asking a lot of questions like, "what is the best directory structure for a project on {foo} platform?" or "What is the idiomatic way to do {x} in {language y}?" It has the advantage of having seen lots of projects in every language, and for some questions that automatically leads to a good answer.
Be careful and don't trust everything it says. Sometimes it invents API functions that don't exist, or fails to see ones that do. And it's always very confident until you point out the mistake.
I always get the classic
“It really depends on your use case and neither pattern is exactly better than the other” when asking GPT about programming patterns.
Conversely, doing so has helped me flesh out my thoughts on many occasions. As I ran into obstacles with errors or imprecise prompting, I realized my design had issues or edge cases I hadn’t taken into account. Perhaps it would be better if I wrote out several paragraphs describing my intentions before taking up most coding tasks, but I hardly think my boss would be in support of this!
Yes exactly. I had to be very specific and tackle the project in the same way I would if I was fully writing the code. First data schema, then models, then controllers with CRUD and views, then routes, then authentication, then test cases, then TS transpiling, etc...
It's definitely not something someone with zero coding experience could easily do, and I feel even a junior developer would struggle with anything even as complex as this. You have to have the concept and structure in your head, it's just writing the code for it.
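That build order (schema first, then models, then CRUD on top) translates to even the smallest stack. A minimal sketch in Python with the stdlib `sqlite3` module — the original project used different frameworks, so the `posts` table and every function name here are purely illustrative:

```python
import sqlite3

# Step 1: the data schema, defined before any application code.
SCHEMA = """
CREATE TABLE IF NOT EXISTS posts (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    title TEXT NOT NULL,
    body TEXT NOT NULL
)
"""

def connect(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn

# Step 2: model-level CRUD functions built on top of the schema.
def create_post(conn, title, body):
    cur = conn.execute("INSERT INTO posts (title, body) VALUES (?, ?)", (title, body))
    conn.commit()
    return cur.lastrowid

def read_post(conn, post_id):
    return conn.execute(
        "SELECT id, title, body FROM posts WHERE id = ?", (post_id,)
    ).fetchone()

def update_post(conn, post_id, title, body):
    conn.execute("UPDATE posts SET title = ?, body = ? WHERE id = ?",
                 (title, body, post_id))
    conn.commit()

def delete_post(conn, post_id):
    conn.execute("DELETE FROM posts WHERE id = ?", (post_id,))
    conn.commit()
```

Controllers, routes, and auth would then layer on in the same order described above.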
That's a great fit for scenarios where you do know programming but don't know the particular language and framework on which you suddenly have to do some maintenance or improvement.
For many things the available documentation is poor, and asking a bot is much more helpful.
What I think you're overlooking is that most people can only do a few hours of hardcore coding at peak productivity a day (3 hours for me, maybe).
So you could spend 3 hours babysitting GPT4 to write some code for you, but then you'd still have 3 hours of peak code productivity that day that you haven't "used up" yet.
I’m the opposite, personally. I can code for 5 or 6 hours just fine if I’m “in the zone”, but I can’t deal with LLMs for more than an hour or two max, usually less. I find their sweet spot is when I need to ask one or two questions with one or two follow-ups, 5-10 minutes ideally. They can sometimes be a big win in small doses if you can keep it to that kind of interaction. But they are just draining to interact with for longer periods. For me they’re a lot like being a TA stuck in multi-hour office hours with undergrads once you get past a few questions. Just a really shitty slog.
It’s thinking either way. I would even wager that the trivial code that GPT writes may be easier to read for me, than some convoluted, human language description of the same thing, done with numerous corrections at every point.
The relative uniformity of code is a positive for human understanding as well, e.g. a^2+b^2=c^2 is easier to parse/transmit the idea over any “spoken” version of the same thing.
True. Although I did find writing the prompts quite exhausting due to the tedium, and waiting for the output frustrating, so that would dig into my energy for peak coding. I would say it uses less concentration, but about the same amount of effort, or even more. But it was also a very narrow test; maybe for certain things (especially repetitive or boilerplate code) it could be very beneficial.
This is more interesting than the deluge of posts that say "I created an iOS app in 30 minutes using ChatGPT!" Which doesn't mean much because it could've done nothing more than create a simple hello world.
This one at least shows the finished product, which is indeed pretty impressive.
Some details I'd need to know are (a) how long did it take, (b) how many prompts, (c) how many course-corrections were required, and (d) how competent this individual was with the technologies in question.
I've personally found ChatGPT extremely empowering in lots of scenarios, but code generation was not among them.
> Although the game is just 600 lines of which I haven't written ANY, [coding the game] was the most challenging part
Not quite hello world, but not too much more difficult than a shopping list. The really impressive thing to me is you can make Angry Birds with just 600 loc (and a couple libraries).
My guess is that the main parts of the game are physics (collisions etc) and the scoring system, so that part wasn't too surprising to me.
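Right — the physics library does the heavy lifting (gravity, collisions), leaving the game code to do little more than bookkeeping. A toy Python sketch of the scoring side, with every name invented for illustration; a real engine (Box2D, Matter.js) would supply the overlap test itself:

```python
# Toy axis-aligned bounding-box overlap test, standing in for what a
# physics engine provides out of the box. Each box is (x, y, w, h).
def aabb_overlap(a, b):
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def score_hits(projectile, monsters, points_per_hit=100):
    """Remove every monster the projectile overlaps and return points earned."""
    hits = [m for m in monsters if aabb_overlap(projectile, m)]
    for m in hits:
        monsters.remove(m)
    return len(hits) * points_per_hit
```

With the engine handling motion, the whole scoring system really can be this small, which is part of why 600 lines is plausible.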
I was pleasantly surprised at the visual quality, I knew Midjourney could produce quality graphics assets, but I guess I didn't realize how easy it was to pull into a game.
I’ve been playing with ChatGPT code generation to make entire sites with flask, python, html+js+css, backed with SQLite db and it’s amazing. I’ve had it write like 5k lines that are all live in prod and working (not much traffic lol but still).
A huge huge factor is knowing the limitations and getting better at prompting. And identifying likely hallucinations and asking for risks etc.
I’ve found it best with tech I don’t know well (I’m an android dev using it to make websites, something I haven’t done myself in like 15 years).
Most of the coolest stuff for me is help with sysadmin and running the server. The ability to debug gunicorn errors is great.
I do have to modify the code it outputs as the project grows and it loses context, but honestly the context limits are the biggest hurdle for bigger projects and those will be lifted soon.
Edit: Most recent site I made with like 95% code from ChatGPT is https://cosmictrip.space/ which generates prompts with GPT-4 that are then used to generate space images with DALL-E.
It's a simple site but there is a secret adventure game I'm working on (GPT+Dall-E) that is open-ended image+text AI-driven game. I'm hoping to launch before Nov 6 with DALL-E 3 API (hopefully...!). The adventure game is also written like 95%+ by ChatGPT.
I've had such great success with it coding that I'm using the GPT-4 API with an agent I'm making (everyone is huh). I have function calling hooked up to generate structured subtasks that the agent can then write the code for, and support for files to include context, chat with your code, etc. It's not ready to show but the GPT-4 code generation abilities are really incredible - but you have to be experienced at prompting. Your first prompts aren't likely to be great, which is why I'm hoping my agent can have success. The idea of the agent I'm writing is a Jira/kanban style board where you have AI coders assigned to tasks that you can approve and modify etc. The tickets should automatically move across the columns as the AI checks the work etc.
+1 for its suitability in helping with systems administration.
One responsibility at my current job is administering a Windows server and trying to get it to do things that are easy on a Unix -- that should be easy anywhere -- but, on Windows, seem to inevitably degrade into nightmares. ChatGPT has given me huge amounts of blessed ammo to shoot at the nightmares, and there's no way I could do that portion of the job in a feasible time frame without it.
> adventure game I'm working on (GPT+Dall-E) that is open-ended image+text AI-driven game. I'm hoping to launch before Nov 6 with DALL-E 3 API.
Some people have hooked AI dungeon / koboldAI up to stable diffusion to generate these kinds of procedural Ender's game style interactive graphical text adventures with varying degrees of success.
If your game is going to be similar, you'd better get in the habit of aggressively caching the generated imagery for it on S3 because no way the DALL-E 3 API is going to be cheap.
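A hedged sketch of that caching idea: hash the prompt to a stable key, and only hit the expensive image API on a miss. The `cache` and `generate` arguments below are stand-ins — in practice `cache` would be an S3 bucket (e.g. `head_object`/`put_object` via boto3) rather than a dict:

```python
import hashlib

def cache_key(prompt, model="dall-e-3"):
    # A stable key, so the same prompt always maps to the same object name.
    digest = hashlib.sha256(f"{model}:{prompt}".encode("utf-8")).hexdigest()
    return f"images/{digest}.png"

def get_or_generate(prompt, cache, generate):
    """cache: dict-like store (stand-in for S3); generate: the paid API call."""
    key = cache_key(prompt)
    if key not in cache:
        cache[key] = generate(prompt)  # only pay for the API on a cache miss
    return cache[key]
```

Repeat prompts then cost nothing after the first generation, which matters a lot at per-image API pricing.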
You are right about the context window limitation.
I exclusively use the Azure OpenAI GPT-4 32k version, and it's been a game changer when coding on complex projects.
I feel that trapping your AI agents in a kanban board isn't going to do your survival chances a lot of good when the robot apocalypse inevitably comes for us meatbags.
Not only are there tons of Angry Birds clones (Angry Birds itself is kind of a clone of earlier games), there are also tons of step-by-step tutorials for making them, which were no doubt included in the training data.
GPT4 is great at this stuff, but iterative refinement doesn’t work in my experience.
As the conversation increases, the previous context is lost and the generated code deviates from its previous behaviour.
For example, “fix this bug…” can easily result in a solution that breaks some other thing. You can also see code generated in thread (1) that does exist in the final result (2), suggesting that (since this is the very top of the code) they were getting ChatGPT to iteratively generate 600+ line segments.
I severely doubt this.
Creating a new Slingshot on line 20 after it is defined on line 500? That is extraordinarily unlikely unless you specifically prompted it to do that.
“loadImage('stone2.png');”, it just happened to pick the right file names? The right sprite sizes? You provided all that in a prompt and it wrote the code? Come onnnn… show us the actual prompt you used.
It seems much more likely they generated a set of class objects relatively independently, then manually assembled them into a larger file, copied the entire thing as input and then crafted a “code prompt” like “write a function that does such and such”.
It’s not impossible they used prompts like they claim (3), but I feel they are (for likes and cred) vastly overstating the “it did all the coding” part of this project.
I feel they probably hand wrote some of the code (or assembled it) and used it as input + a “now do this also” style prompt, so the output was “100% generated”, but not in the way people are assuming.
This approach tends to make GPT4 rewrite the existing code, but unless you specifically ask for (or add) comments describing the intent throughout the code (missing in most of the generated code), it will drift from the previous functionality. With no test suite to verify, you won’t notice this subtle drift and things just break. There’s no mention of either of these things being done by the author.
Furthermore, this user has a vested interest (4) in selling training materials for AI, so it’s in their interest to appear to be an expert at this, and has provided (even when asked on X) no additional details, no “step by step” git repo with history, no actual prompts they’ve used.
Given the lack of details and the frankly unbelievable results, I think it's fair to be sceptical in this case.
You could generate this kind of thing from models such as codellama 34B, or GPT 3.5; but not using the method as described.
I’m… not convinced you could do it with GPT4. The prompts seem too stupid to be real (5)… but I'm happy to be proved wrong with more details. GPT4 is good.
Context is 8k and it's quadratic. It "sees" everything in that window. If you want to have a long conversation try Claude or some of the 32k models. Claude uses a strange kind of attention that isn't always as precise but it's very good at finding key information in huge documents.
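Whether a conversation still fits in the window is easy to sanity-check before sending. A crude heuristic — roughly 4 characters per token for English text; an accurate count requires the model's actual tokenizer (e.g. tiktoken), so treat this as an estimate only:

```python
def rough_token_count(text):
    # ~4 chars/token is a common rule of thumb for English; only an estimate.
    return max(1, len(text) // 4)

def fits_context(messages, limit=8192, reserve_for_reply=1024):
    """Check whether a list of message strings leaves room for the reply."""
    used = sum(rough_token_count(m) for m in messages)
    return used + reserve_for_reply <= limit
```

When `fits_context` starts returning False, that's the point to summarize or trim older turns rather than let the model silently lose the start of the conversation.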
Think it could be an interesting UX pattern. Having interactive loading (spinner) games that at least give us feedback that our actions (even in between things) have impact.
It is an interesting approach to loading screens, and personally I would have expected way more games to use such a feature. Not AAAs, of course, but indie games.
That AI is transformative for development is not in doubt any more. Just this past week, I've been able to build two medium sized services (a couple of thousand lines of code in python, a language I hadn't used for more than a decade!). What's truly impressive is that for the large part, it's better than the code I'd have written anyway. Want a nice README.md? Just provide the source code that contains routes/cli args/whatever, and it'll generate it for you. Want tests? Sure. Developers have never had it so easy.
One thing to note is that for code generation, GPT4 runs circles around GPT3.5. GPT3.5 is alright at copying if you provide very tight examples, but GPT4 kinda "thinks".
Another piece of information from experience: GPT4 32k contexts fail quite often. So if you're generating, say, 10k tokens or more (around 30k characters), you'd have to give it a few tries. Also, ChatGPT is not the ideal interface for non-trivial work. You should use the API directly, or use something like the Azure OpenAI Chat Playground, which lets you use 32k contexts.
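Given those intermittent failures on long generations, a simple retry wrapper around the call helps. The `call` argument below stands in for whatever client function you use — the actual OpenAI/Azure client details are omitted, so this is only a sketch of the retry pattern:

```python
import time

def generate_with_retries(call, prompt, attempts=3, backoff=2.0):
    """Retry a flaky generation call a few times, backing off between tries."""
    last_error = None
    for attempt in range(attempts):
        try:
            return call(prompt)
        except Exception as exc:  # real code should catch the client's own error types
            last_error = exc
            time.sleep(backoff * attempt)  # 0s, then backoff, then 2*backoff, ...
    raise last_error
```

Three attempts with a short backoff is usually enough to get past the transient failures without masking a genuinely broken request.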
I find it interesting that over the past decade so much investment has gone into making no code tools, and now ChatGPT is so good at writing code that it’s probably faster, more flexible and approaching the same level of usability for technically minded but non coding type folks.
I recently had to create a demo app to consume and publish a REST service using Mendix and it took a couple of days to figure out all the details, but doing the same thing in any language (bash for example) using ChatGPT would have taken minutes.
Deployment and version control can be solved without much technical prowess using PaaS/IaaS, especially if you’re comparing your costs with enterprise no code platforms.
It may be my personal bias talking (I’ve always disliked no code platforms because they feel more cumbersome when you have to do anything serious, I dislike ActiveRecord ORMs for similar reasons) but it kind of seems like No Code will be obsolete pretty soon.
Who wants to drag and drop when you can just ask, copy and paste?
I think this solution is about the perfect thing I can come up with, conceptually. Nocode is easy but rigid. Coding is flexible but tedious and error prone.
Being able to talk out what you want to quickly get the code, as long as it's clean, gives you the flexibility to then tune as needed. And in some cases, like apparently this one, that wasn't even necessary. Exciting times ahead...
I guess the thing that's still lacking from an AI code solution is security guard rails. My demo below for example is open to injection attacks but I think that could be solved with fine tuning or custom instructions.
You can't just have amateurs copying and pasting stuff into production environments, but even the task of writing tests can be given to an AI. Like you get one AI to write some code, then you get another AI to write code to test it. The chances of perfectly complementary errors are pretty low, but even if that happens you then get an AI to write a frontend script to run automated integration tests, and then have some human quality control at the end.
Really, I think a code-based AI pipeline has much more long term potential than no code does. The interfaces are just so laborious.
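The injection risk mentioned above is concrete: generated code that builds SQL by string formatting is exploitable, while parameterized queries are not. A small stdlib illustration (the `users` table is invented for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

def find_user_unsafe(name):
    # BAD: user input is spliced directly into the SQL string.
    return conn.execute(f"SELECT name FROM users WHERE name = '{name}'").fetchall()

def find_user_safe(name):
    # GOOD: the driver treats the input purely as data, never as SQL.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()
```

Feed both functions the classic payload `' OR '1'='1` and the unsafe one returns every row while the safe one matches nothing — exactly the kind of check a test-writing AI (or a human reviewer) should apply to generated code.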
This statistical plagiarism laundering is pretty neat.
IMHO, stopping the laundering gold rush is a more urgent priority for law, than creating market moats for the current big pickaxe vendors and pretending it's about preventing HAL.
There have got to be some freelancers/remote workers who have 100x'ed their productivity using GPT-4 and AI tools correctly. I can't imagine all these cool hacks exist in a vacuum. Imagine what we'll have in 2 years. The genie is out of the bottle.
This is my experience. You still have to understand programming: you're just typing it out in natural English.
Do you use the ChatGPT Plus version or the API? If the API, what do you usually use to access it?
Reading good prompts is probably one of the better ways of learning how to do it.
Programming a new game without dozens of existing templates would be a better litmus test.
[1] https://nitter.net/pic/orig/media%2FF9xoI8mXgAAn7v9.jpg
[2] https://bestaiprompts.art/angry-pumpkins/sketch.js
[3] https://nitter.net/javilopen/status/1719363669685916095#m
[4] https://javilopen.substack.com/
[5] “Now, make the monsters circular, and be very careful: apply the same technique that already exists for the rectangular ones regarding scaling and collision area, and don't mess it up like before.”
https://spinner.franzai.com/
I clearly recalled having read news that the patent on this had expired a while ago, and from a quick search, "a while ago" turns out to be 8 years ago: https://www.eff.org/deeplinks/2015/12/loading-screen-game-pa...
Shameless plug: I have this open source app which automates grunt work in prompt generation - https://github.com/codespin-ai/codespin-cli
https://chat.openai.com/share/f98d04b9-6d93-46b7-9fa9-7c9ec1...