DALL-E + GPT-3 = ♥ - Readit News

Wow, OpenAI should be truly taught in business schools around the world as a case study in product marketing. Despite the dozens or hundreds of similar tools, models, colab notebooks, etc. created by a wide flourishing community, these two products are top of mind for everyone. Both terribly disappointing and envy inspiring for us in the field!

centra_minded · 4 years ago

GPT-3 and DALL-E (in my subjective opinion) feel much better than other models at their respective tasks.

DALL-E in particular kind of blows the VQGAN+CLIP messing around I've done out of the water. GPT-3 feels markedly better than other text generation or chatbots I've tried.

Definitely these are well marketed, and not the only models around, but they also feel ahead of other things I've tried. Can you point out some of these other tools/models?

andybak · 4 years ago

It's not as simple as "x is better than y". They all have their own flavours. I've had results from JAX CLIP Guided Diffusion that I can't get from anything else, and some of my early experiments with Disco Diffusion have a quality that is unique. I think people will always mix and match models due to their unique qualities.

Having said that, I'm on the beta for Stable Diffusion ( https://stability.ai/ ) and it's remarkably capable across a broad range of styles. Dall-E probably still has the edge for more complex semantic prompts and photographic coherence, but it's very good and it's got a very open strategy.

mach1ne · 4 years ago

OpenAI was the first on GPT-level text generation, as it was (I think) on DALL-E level image generation. It's winner takes all rather than marketing.

minimaxir · 4 years ago

Google has similar models and could compete with OpenAI with both a GPT-3 and DALL-E equivalent if they wanted to.

tbh I'm surprised they don't.

M4v3R · 4 years ago

They already have: they’ve developed two projects for AI image generation, Imagen and Parti. Sadly, neither is open to the public yet.

I’ll dare make a prediction: Transformers will be declared a dead end in AI in a few years and GPT-4 might be the last in line. The method is able to produce highly convincing gimmicks (which is what it comes to once initial excitement splashes) but that is not how art or intelligence works.

lbotos · 4 years ago

> The method is able to produce highly convincing gimmicks (which is what it comes to once initial excitement splashes) but that is not how art or intelligence works.

Uh, art is often about reproducing highly convincing gimmicks.

See https://en.wikipedia.org/wiki/Realism_(arts)

I think you mean that Transformers will become kitsch, which probably.

etrautmann · 4 years ago

We've already seen this to some extent with GANs - which now look somewhat outdated in comparison with Dall-e and others

Jack000 · 4 years ago

Language model != Transformers. The scaling power of the transformer architecture allows LLMs to exist, but LMs are just one application of transformers. The architecture itself is extremely generic and is applicable to nearly every problem.

I'm pretty certain whatever comes after GPT-4 will still be using layered self-attention.

ebiester · 4 years ago

Everything is eventually called a dead end in AI, no? Once it has moved from the bleeding edge to something the public uses, the limitations become known and everyone says, "of course a computer can do it, but that's not intelligence."

feet · 4 years ago

That's because we aren't building systems capable of self-analysis and awareness. As far as I'm aware, there are only AI/ML systems capable of analyzing some data and giving some output with little to no nuance

Rerarom · 4 years ago

no, the OP says it in a different sense, namely that even if you are duly impressed by these advancements, they will stop in a few years and a new line of research will be needed

Tistron · 4 years ago

I made a quick comparison with midjourney, without going as much into picking pictures.

https://hackmd.io/@einarmagnus/midjourney-hallucinations

It's quite fascinating to play around with this tool. Like user `stared` mentions below, there are words like epic, realistic etc that you can add to shape the pictures quite a lot. Auto-generated text like this is not so great at producing interesting results.

stared · 4 years ago

For these prompts, I like the ones generated by Midjourney much more. While they are less diverse, they are more consistent for a digital art style. Thank you for making this comparison!

dymk · 4 years ago

Firstly, it's cool. But...

The generated images are underwhelming. Other than the snake king one, they all kind of have the same album-art look. The generated text also seems like it's trying to have some cohesive theme to it, but doesn't bring it together like one might expect in a short story. And the writing style is filled with cliches.

I can see it being good for inspiration, but it's pretty rough for final consumption.

coffee_beqn · 4 years ago

The city was especially off. It was literally just a city. I realize this would be an absurd complaint just a few years ago

mrtksn · 4 years ago

Midjourney seems to have a better hunch for artistic drama. Though I'm not very satisfied with the AI outcomes, probably because they don't watch movies (yet!).

Here is the output for the first prompt[0]: https://imgur.com/a/18NRrx4

[0]You see a figure in the distance, waving at you. As you get closer, you realize that the figure is your doppelganger. Your doppelganger is wearing your clothes and has your face, but their eyes are black voids. You hear a voice in your head that says, “We are one. You are me and I am you.”

jimhi · 4 years ago

I have noticed this too. Midjourney seems to have poorer quality and can't really do faces, but is vastly more "creative".

I wonder if there is a trade-off between how accurate a model is (from a larger data set) and how creative it can be.

aabhay · 4 years ago

freediver · 4 years ago

Prompts matter - here, it is interesting to see the cascade of prompts, i.e. from ones you pass to GPT3 to those generated by it.

I also like the topic choice - getting DALLE2 esoteric, dreamy, surreal. I find generating things that do not exist more interesting - as it stretches the limits of AI creativity and imagination. (Plug: I generated quite a few religious, symbolic, and esoteric images, both pleasant and gloomy, https://pmigdal.medium.com/dall-e-2-and-transcendence-3a3a40...)

lee101 · 4 years ago

Check out https://text-generator.io for an API-compatible switch from GPT-3 for generating text or code at a reasonable price. There are also some upcoming alternatives to DALL-E, like Disco Diffusion or stability.ai, so one day it would be an easy switch from DALL-E to a cheaper or better-quality (or both) service. Keep an eye out for competition. There’s monopoly-level pricing at OpenAI right now, so be careful y’all.

bitwize · 4 years ago

In addition to their M-x psychoanalyze-zippy energy, these kind of remind me of the creepypasta-fodder PlayStation game LSD: https://www.youtube.com/watch?v=ol4OSIGGukA

Is the fire god going to tell me to write a book report on Call of the Wild? ( https://www.youtube.com/watch?v=KXVdTT1Sis0&t=6m42s )

totalview · 4 years ago

I totally thought that this was a long wind up to the author explaining how perfect GPT-3 and DALLE were for creating Dungeons and Dragons narrative

ahstilde · 4 years ago

That exists!

https://play.aidungeon.io/