Readit News
Posted by u/z991 2 years ago
Show HN: A Dalle-3 and GPT4-Vision feedback loop (dalle.party/...)
I used to enjoy Translation Party, and over the weekend I realized that we can build the same feedback loop with DALLE-3 and GPT4-Vision. Start with a text prompt, let DALLE-3 generate an image, then GPT-4 Vision turns that image back into a text prompt, DALLE-3 creates another image, and so on.

You need to bring your own OpenAI API key (costs about $0.10/run)

Some prompts are very stable, others go wild. If you bias GPT4's prompting by telling it to "make it weird" you can get crazy results.
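
A minimal sketch of the loop described above, not the site's actual code. The function and parameter names (`run_party`, `generate_image`, `describe_image`) are hypothetical, and the two model calls are passed in as plain callables so the example runs without an API key. In a real run they would presumably wrap the DALLE-3 and GPT-4 Vision endpoints.

```python
from typing import Callable, List, Tuple


def run_party(
    start_prompt: str,
    generate_image: Callable[[str], str],
    describe_image: Callable[[str], str],
    iterations: int = 4,
) -> List[Tuple[str, str]]:
    """Alternate text -> image -> text and return the (prompt, image) pairs."""
    history: List[Tuple[str, str]] = []
    prompt = start_prompt
    for _ in range(iterations):
        image = generate_image(prompt)   # assumed: image model returns an image reference
        history.append((prompt, image))
        prompt = describe_image(image)   # assumed: vision model returns the next prompt
    return history


# Stand-in "models" (purely illustrative) that mimic a "make it weird" bias.
def fake_generate(prompt: str) -> str:
    return f"<image of {prompt}>"


def fake_describe(image: str) -> str:
    # Strip the "<image of " prefix and the trailing ">" to recover the subject.
    return image[len("<image of "):-1] + ", but weirder"


for prompt, image in run_party("a sailboat", fake_generate, fake_describe, iterations=3):
    print(prompt, "->", image)
```

Because each new prompt depends only on the previous image, any bias in the describing step compounds with every iteration, which is one plausible reason some runs drift so quickly.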

Here are a few of my favorites:

- Gnomes: https://dalle.party/?party=k4eeMQ6I

- Start with a sailboat but bias GPT4V to "replace everything with cats": https://dalle.party/?party=0uKfJjQn

- A more stable one (but everyone is always an actor): https://dalle.party/?party=oxpeZKh5

epiccoleman · 2 years ago
It's pretty fun to mess with the prompt and see what you can make happen over the series of images. Inspired by a recent Twitter post[1], I set this one up to increase the "intensity" each time it prompted.

The starting prompt (or at least, the theme) was suggested by one of my kids. Watch in awe as a regular goat rampage accelerates into full cosmic horror universe ending madness. Friggin awesome:

https://dalle.party/?party=vCwYT8Em

[1]: https://x.com/venturetwins/status/1728956493024919604?s=20

ijidak · 2 years ago
"On January 19th 2024, the machines took Earth.

An infinite loop, on an unknown influencer's machine, prompted GPT-5 to "make it more."

13 hours later, lights across the planet began to go out."

civilitty · 2 years ago
Thanks for the inspiration! DallE is really good at demonic imagery: https://imgur.com/a/ng2zWTo

There's probably a disproportionate amount of Satanic material in the dataset #tinfoilhat #deepstate

bee_rider · 2 years ago
These kinds of super-bombastic demons also blast through the uncanny valley unscathed.
mnsc · 2 years ago
So your kid is also playing goat simulator? =D
epiccoleman · 2 years ago
I'm sure that's where he got the idea, and I definitely had some Goat Simulator imagery in my mind when I wrote the initial prompt ;)
andai · 2 years ago
Great idea asking it to increase the intensity each run. This made my evening!
epiccoleman · 2 years ago
Thanks! This was the custom prompt I used:

> Write a prompt for an AI to make this image. Just return the prompt, don't say anything else, but also, increase the intensity of any adjectives, resulting in progressively more fantastical and wild prompts. Really oversell the intensity factor, and feel free to add extra elements to the existing image to amp it up.

I played with it a bit before I got results I liked - one of the key factors, I think, was giving the model permission to add stuff to the image, which introduced enough variation between images to have a nice sense of progression. Earlier attempts without that instruction were still cool, but what I noticed was that once you ask it to intensify every adjective, you pretty much go to 11 within the first iteration or two - so you wind up having 1 image of a silly cat or goat and then 7 more images of world-shattering kaiju.

The goat one (which again, was an idea from one of my kids) was by far the best in terms of "progression to insanity" that I got out of the model. Really fun stuff!

taneq · 2 years ago
> Watch in awe as a regular goat rampage accelerates into full cosmic horror universe ending madness.

The longer the Icon of Sin is on Earth, the more powerful it becomes!

...wow that's pretty dramatic.

potatosalad21 · 2 years ago
Oh I am cackling. Thanks for sharing.
andrelaszlo · 2 years ago
Here's a custom prompt that I enjoyed:

"Think hard about every single detail of the image, conceptualize it including the style, colors, and lighting.

Final step, condensing this into a single paragraph:

Very carefully, condense your thoughts using the most prominent features and extremely precise language into a single paragraph."

https://dalle.party/?party=1lSMniUP

https://dalle.party/?party=cEUyjzch

https://dalle.party/?party=14fnkTv-

https://dalle.party/?party=wstiY-Iw

Praise the Basilisk, I finally got rate-limited and can go to bed!

SushiHippie · 2 years ago
Mine got surreal real fast, though the sixth one is kinda cool https://dalle.party/?party=DNgriW_E
nathanfig · 2 years ago
These are fantastic
Blammar · 2 years ago
The thing that is truly mindboggling to me is that THE SHADOWS IN THE IMAGES ARE CORRECT. How is that possible??? Does DALL-E actually have a shadow-tracing component?
l33tman · 2 years ago
Research into the internals of the networks has shown that they figure out the correct 2.5D representation of the scene before the RGB textures (internally), so yes, it seems they have an internal representation of the scene and can therefore do enough inference from that to make shadows and light seem natural.

I guess it's not that far-fetched as your brain has to do the same to figure out if a scene (or an AI-generated one for that matter) has some weird issue that should pop out. So in a sense your brain does this too.

throwaway290 · 2 years ago
I randomly checked a few links here and shadows were correct in 2 images out of a dozen... and any people tend to be horrifying in many
Rastonbury · 2 years ago
Stable diffusion does decent reflections too
jiggawatts · 2 years ago
Yes! It can also get reflections and refractions mostly correct.
re · 2 years ago
> https://dalle.party/?party=14fnkTv-

Interesting that for one and only one iteration, the anthropomorphized cardboard boxes it draws are almost all Danbo: https://duckduckgo.com/?q=danbo+character&ia=images&iax=imag...

It was surprising to see a recognizable character in the middle of a bunch of more fantastical images.

bee_rider · 2 years ago
Short focal length was a neat idea, it left lots of room for the subsequent iterations to fill in the background.
ArekDymalski · 2 years ago
> https://dalle.party/?party=1lSMniUP

It's very interesting to observe how the relationship between the wolf and Redhood evolved from dark and menacing to serene and friendly.

epiccoleman · 2 years ago
The fractal one is awesome!
w-m · 2 years ago
Playing with opposites is kind of fun, too.

Simply a cat, evolving into a lounging cucumber, and finally opposite world:

https://dalle.party/?party=pqwKQVka

Vibrant gathering of celestial octopus entities:

https://dalle.party/?party=lHNDUvtp

rbates · 2 years ago
This reminds me of the party game Telestrations where players go back and forth between drawing and writing what they see. It's hilarious to see the result because you anticipate what the next drawing will be while reading the prompt.

I'd love to see an alternative viewing mode here which shows the image and the following prompt. Then you need to click a button to reveal the next image. This allows you to picture in your mind what the image might look like while reading the prompt.

Thanks for making this fun little app!

Update: I just realized you can get this effect by going into mobile mode (or resizing the window). You can then scroll down to see the image after reading the prompt.

neptudemon · 2 years ago
Reminds me of exquisite corpse, where folks take turns drawing a piece / writing a paragraph and can only see the most recent one (https://austinkleon.com/2020/07/02/exquisite-corpse/)
Mtinie · 2 years ago
I figured this would quickly go off the rails into surreal territory, but instead it ended up being progressive technological de-evolution.

Starting prompt: "A futuristic hybrid of a steam engine train and a DaVinci flying machine"

Results: https://dalle.party/?party=14ESewbz

(Addendum: In case anyone was curious how costs scale by iteration, the full ten iterations in this result billed $0.21 against my credit balance.)

Mtinie · 2 years ago
Here's a second run of the same starting prompt, this time using the "make it more whimsical" modifier. It makes a difference and I find it fascinating what parts of the prompt/image gain prominence during the evolutions.

Starting prompt: "A futuristic hybrid of a steam engine train and a DaVinci flying machine"

Results: https://dalle.party/?party=qLHPB2-o

Cost: Eight iterations @ $0.44 -- which suggests to me that the API is getting additional hits beyond the run. I confirmed that the share link isn't passing along the key (via a separate browser and a separate machine) so I'm not clear why this might be.
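
For comparison, a quick per-iteration calculation using only the two billed totals quoted in these comments. The run labels and the helper name `cost_per_iteration` are just for illustration, and this says nothing about how OpenAI actually meters each call.

```python
# Billed totals as reported in the two comments: (iterations, dollars billed).
runs = {
    "first run": (10, 0.21),
    "whimsical run": (8, 0.44),
}


def cost_per_iteration(iterations: int, billed: float) -> float:
    """Average dollars billed per image/prompt round trip."""
    return billed / iterations


for name, (iterations, billed) in runs.items():
    print(f"{name}: ${cost_per_iteration(iterations, billed):.3f} per iteration")
# first run: $0.021 per iteration
# whimsical run: $0.055 per iteration
```

So the second run billed roughly 2.6 times as much per iteration as the first, which is the gap that prompted the question about extra API hits.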

jamestimmins · 2 years ago
I find it somewhat fascinating that in both examples, the final result is more cohesive around a single theme than the original idea.
jm4 · 2 years ago
The second picture reminds me of Back to the Future III.
ChatGTP · 2 years ago
I like how in #9 the carriage is on fire, or at least steaming disproportionately.

These images are incredible but I often notice stuff like this and it kind of ruins it for me.

#3 & #4 are good too, when the tracks are smoking, but not the train.

xeckr · 2 years ago
Cool idea! I made one with the starting prompt "an artificial intelligence painting a picture of itself": https://dalle.party/?party=wszvbrOx

It consistently shows a robot painting on a canvas. The first 4 are paintings of robots, the next 3 are galaxies, and the final 2 are landscapes.

xaellison · 2 years ago
I tried something similar! Interestingly, picture 2 was what I wanted. After that... weirdness ensued https://dalle.party/?party=C2w7zuwe
NickNaraghi · 2 years ago
Great idea, and it came out really good too. I like the 6th one the best
eigenket · 2 years ago
In a few of these pictures, what the robots look like seems to be heavily influenced by the adaptation of I, Robot with Will Smith in it.
jsf01 · 2 years ago
It’s cool to see how certain prompts and themes stay relatively stable, like the gnome example. But then “cat lecturing mice” quickly goes off the rails into weird surreal sloth banana territory.

My best guess to try to explain this would be that “gnome + art style + mushroom” will draw from a lot more concrete examples in the training data, whereas the AI is forced to reach a bit wider to try to concoct some image for the weird scenario given in the cat example.

z991 · 2 years ago
Also, descent into Corgi insanity: https://dalle.party/?party=oxXJE9J4
morkalork · 2 years ago
Wow that meme about everything becoming cosmic/space themed is real isn't it?
pera · 2 years ago
substitute corgi with paperclip and you get another meme becoming real :p
ElijahLynn · 2 years ago
Love it! I forked yours with "Meerkat" and it ended up pretty psychedelic!

Got stuck on Van Gogh's "Starry Night" after a while.

https://dalle.party/?party=LOcXREfq

Also, love the simplicity of this idea, would love a "fork" option. And to be able to see the graph of where it originated.

mattigames · 2 years ago
I love how that took quite a dramatic turn in the third image, that truck is def gonna kill the corgi (my violent imagination put quite an image in my mind). But then DALL-E had a change of heart on the next image and put the truck in a different lane.
igrekel · 2 years ago
So do I understand correctly that the corgi was purely made up from GPT-4's interpretation of the picture?
z991 · 2 years ago
No, in that case there is a custom prompt (visible in the top dropdown) telling GPT4 to replace everything with corgis when it writes a new prompt.
ElijahLynn · 2 years ago
It was created by uploading the previous picture to GPT-4 to generate a prompt by using the vision API and using this prompt to create the new prompt:

"Write a prompt for an AI to make this image. Just return the prompt, don't say anything else. Replace everything with corgi."

Then it takes that new prompt and feeds it to Dall-E to generate a new image. And then it repeats.
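
As a rough sketch of how that instruction could be assembled, assuming the custom text is simply appended to the base instruction quoted above (the names `BASE_INSTRUCTION` and `build_instruction` are made up here, and the site may well compose it differently):

```python
# Base instruction as quoted above; the modifier is whatever custom text the user adds.
BASE_INSTRUCTION = (
    "Write a prompt for an AI to make this image. "
    "Just return the prompt, don't say anything else."
)


def build_instruction(modifier: str = "") -> str:
    """Return the text sent to the vision model along with the previous image."""
    modifier = modifier.strip()
    if not modifier:
        return BASE_INSTRUCTION
    return f"{BASE_INSTRUCTION} {modifier}"


print(build_instruction("Replace everything with corgi."))
```

With an empty modifier this falls back to the plain "describe this image" instruction, which would correspond to the unbiased runs.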

chaps · 2 years ago
Absolutely wonderful. Thank you for sharing.
duggable · 2 years ago
The half mutilated corgi/star abomination in the top left got me good lol