I was supposed to be making a video game, but got a bit sidetracked when DALL·E came out and made this website on the side: http://dailywrong.com/ (yes I should get SSL).
It's like The Onion, but all the articles are made with GPT-3 and DALL·E. I start with an interesting DALL·E image, then describe it to GPT-3 and ask it for an Onion-like article on the topic. The results are surprisingly good.
Somehow these articles are more readable than typical AI-generated search engine fodder... Is it because I'm entering the site with an expectation of nonsense?
Feels like the headlines could be generated in the style of "They Fight Crime!"
"He's a hate-fuelled neurotic farmboy searching for his wife's true killer. She's a tortured insomniac snake charmer from a family of eight older brothers. They fight crime!"
>He's an unconventional gay paranormal investigator moving from town to town, helping folk in trouble. She's a violent motormouth wrestler from the wrong side of the tracks. They fight crime!
>He's a Nobel prize-winning sweet-toothed rock star who believes he can never love again. She's a strong-willed communist widow with a knack for trouble. They fight crime!
>He's an obese white trash barbarian with a secret. She's a virginal thirtysomething traffic cop with the power to bend men's minds. They fight crime!
https://theyfightcrime.org/
Here's an implementation in Perl:
http://paulm.com/toys/fight_crime.pl.txt
The results with artworks or more general concepts are fascinating, but there is for sure something creepy going on with "photorealistic" human eyes and faces...
If you want to see some really creepy AI-generated human "photo" faces, take a look at Bots of New York:
https://www.facebook.com/botsofnewyork
Unfortunately the content of that project is now a hostage of Facebook - like ransomware gangsters, they force you to do something to get the data; in this case you need to create an account and take part in that global surveillance network. I do not understand why people do that.
We joke about it, but an early and very cheap robotic floor cleaner I had was one of those weasel balls constrained in a flat ring harness with a dusting cloth underneath. It was entertaining and not completely useless.
Put a guinea pig in there and you'd get the same effect.
Actually got a chuckle out of the duck one (http://dailywrong.com/man-finally-comfortable-just-holding-a...). Thanks! I hope you keep generating them. Kind of wish there weren't a newsletter nag, but on the other hand it adds to the realism. Could be worthwhile to generate the text of the nag with GPT too; call it a kind of lampshading.
Haha, I was in a very similar boat when I built https://novelgens.com -- I was also supposed to be making a video game, but got a bit sidetracked with VQGAN+CLIP and other text/image generation models.
Now I'm using that content in the video game. I wonder if you could use these articles as some fake news in your game, too. :)
At first I came up with them myself, but found that GPT-3 often comes up with better ones, so I ask it for variations.
I think I got it to even fill in the title given a picture, something like “Article picture caption: Man holding an apple. Article title: ...”. Might experiment more with that in the future.
This is a fucking fantastic site, it’s absolutely hilarious, and I’ve bookmarked it - I kinda unironically want to set it as my home page - but just a heads up that the CSS is broken for me on my iPhone SE2.
The images don’t scale properly with the rest of the site, they’re massive compared to the content.
I’m curious: if they’re only making DALL-E accessible now, and GPT-3 was never really accessible (as far as I know), how do you have access to these things to generate text and images?
How do you generate the original image? And what about the subsequent images, do they come automatically from the text? I'd love to know more about the process.
I have been having a blast with DALL-E, spending about an hour a day trying out wild combinations and cracking my friends up. I cannot imagine getting bored of it; it's like getting bored with visual stimuli, or art in general.
In fact, I've been glad to have a 50/day limit, because it helps me contain my hyperfocus instincts.
The information about new pricing is, to me as someone just enjoying making crazy images, a huge drag. It means that to do the same 50/day I'd be spending $300/month.
OpenAI: introduce a $20/month non-commercial plan for 50/day, and I'll be at the front of the line.
I think people don't realize how huge these models really are.
When they're free, it's pretty cool. But charge an amount where there's actual profit in the product? Suddenly seems very expensive and not economically viable for a lot of use cases.
We are still in the "you need a supercomputer" phase of these models for now. Something like DALLE mini is much more accessible but the results aren't good enough. Early early days.
> I think people don't realize how huge these models really are.
They really aren't that large by the contemporary scaling race standards.
DALL-E 2 has 3.5B parameters, which should fit on an old GPU like an Nvidia RTX 2080, especially if you optimize your model for inference [1][2], which is commonly done by ML engineers to minimize costs. With an optimized model, your memory footprint is ~1 byte per parameter, plus a ratio of less than 1 (commonly ~0.2) of the parameter count again to store intermediate activations.
You should be able to run it on Apple M1/M2 with 16GB RAM via CoreML pretty fine, if an order of magnitude slower than on an A100.
Training isn't unreasonably costly either: you can train such a model for O($100k), which is less than the yearly salary of a mid-tier developer in Silicon Valley.
There is no reason these models shouldn't be trained cooperatively and run locally on our own machines. If someone is interested in cooperating with me on such a project, my email is in the profile.
1. https://arxiv.org/abs/2206.01861
2. https://pytorch.org/blog/introduction-to-quantization-on-pyt...
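For a rough sanity check of the numbers above, here's a back-of-envelope memory estimate plus a minimal dynamic int8 quantization sketch in the spirit of [2]. The 3.5B parameter count and ~0.2 activation ratio are the comment's figures, and the toy model just stands in for a real inference graph:

```python
import torch
import torch.nn as nn

# Back-of-envelope memory estimate using the figures cited above.
params = 3.5e9          # DALL-E 2 parameter count, per the comment
bytes_per_param = 1     # int8 weights after quantization
activation_ratio = 0.2  # extra memory for intermediate activations

total_gb = params * bytes_per_param * (1 + activation_ratio) / 1e9
print(f"~{total_gb:.1f} GB")  # ~4.2 GB, within an RTX 2080's 8 GB

# Minimal dynamic quantization of a toy model's Linear layers to int8;
# a real deployment would quantize the full inference graph.
model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 1024))
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
print(quantized)
```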
How hard would it be to spin off a variant of this with more focused data models that cater to specific styles or art-types? Like say, a data model only for drawing animals. Or one only for creating new logos?
I’ve been creating generative art since 2016 and I’ve been anxiously waiting for my invite. I won’t be able to afford to generate the volume of images it takes to get good ones at this price point.
I can afford $20/mo for something like this, but I just can’t swing the $200 to $300 it realistically takes to get interesting art out of these CLIP-centric models.
Heck, the initial 50 images isn’t even enough to get the hang of how the model behaves.
MidJourney is a good alternative. Maybe not quite as good as DALL-E, but close enough, without a waitlist and with hobby-friendly prices ($10/month for 200 images/month, or $30 for unlimited)
If you’re technically inclined, I urge you to explore some newer Colabs being shared in this space. They offer vastly more configurable tools, work great for free on Google Colab, and are straightforward to run on a local machine.
Meanwhile we should prepare ourselves for a future where the best generative models cost a lot more as these companies slice and dice the (huge) burgeoning market here.
Yeah, I've been having fun with it recreating bad Heavy Metal album art (https://twitter.com/P_Galbraith/status/1548597455138463744). It's good, but surprisingly difficult to direct when you have a composition in mind. For a few of these I burned through 20-30 prompts, and I can't see myself forking over hundreds of dollars to roll the dice.
My brother is a digital artist and while excited at first he found it to be not all that useful. Mainly because it falls apart with complex prompts, especially when you have a few people or objects in a scene, or specific details you need represented, or a specific composition. You can do a lot with in-painting but it requires burning a lot of credits.
I'm sure the novelty wears off. But I'm already coming up with several applications for it.
On the personal side, I've been getting into game development, but the biggest roadblock is creating concept art. I'm an artist but it takes a huge amount of time to get the ideas on paper. Using DALLE will be a massive benefit and will let me expedite that process.
It's important to note that this is not replacing my entire creative process. But it solves the issue I have, where I'm lying in bed imagining a scene in my mind, but don't have the time or energy to sketch it out myself.
I don't know how to say this without sounding like a jerk, even if I bend over backwards to preface that this isn't my intent: this statement says more about your creativity and curiosity than about any ceiling on how entertaining DALL-E can be to someone who could keep multiple instances busy, like grandma playing nine bingo cards at once.
Knowing that it will only get better - animation cannot be far behind - makes me feel genuinely excited to be alive.
Same. I generated several thousand images and found it a chore, outside of the daily theme on the Discord server, to even think of anything to query. It was also discouraging when sometimes you'd hit pure gold for 4-5 of the 6 images, then be lucky to get 1 out of 6 worth saving over several more queries. Now it's down to 4 images and... yeah...
I'm not going to try and profit from the images, and I don't need them for any business uses, so to me it was fun for a while and now just something I'll largely put out of mind.
I was actually forcing myself to go through the whole 50/day because I knew it wouldn't be free forever, and I wanted to get better at it. I'm glad I did, but I wish I'd done more.
MidJourney gives ~unlimited generation for $30/month, and is nearly as good. Unlike DALL-E it doesn't deliberately nerf face generation. I've been having a blast.
> trying out wild combinations and cracking my friends up
Wait until the next edition comes out where it automatically learns the sorts of things that crack you up and starts generating them without any input from you.
Since many people will start generating their first images soon, be sure to check out this amazing DALL-E prompt engineering book [0]. It will help you get the most out of DALL-E.
[0]: https://dallery.gallery/wp-content/uploads/2022/07/The-DALL%... (PDF)
I hope every science teacher who can provides this to every student. This is the future they live in now. They should know these tools as well as they know how to install an app on a device.
Wait until we have a DALL-E-enabled custom emoji stream, whereby every text you send out has its corresponding DALL-E resultant image --
Then we can compare images from different people at different times where the prompt was identical... and see what the resultant library of emoji<-->prompt pairs looks like.
What about using DALL-E as a watermark, an 'NFT'-style signature or 'notary' for an email?
If DALL-E provided a unique PID for every image - and that PID was a key that only the original runner of the image has - it could be used to authenticate an image to a text source. (Assuming that no two prompts ever have the same result, a unique ID that can be used to replay the image, verifying it was generated when the original email/SMS was actually sent, could be a unique way to timestamp the authenticity/provenance of a thing.)
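A minimal sketch of how that PID idea could work, assuming (this is purely hypothetical, not anything DALL-E offers) the generator's operator holds a secret key and signs a record binding prompt, timestamp, and image bytes together; only the key holder can later verify it:

```python
import hashlib
import hmac
import json
import time

# Hypothetical: a key held only by the image generator's operator.
SECRET_KEY = b"held-only-by-the-generator-operator"

def issue_pid(prompt: str, image_bytes: bytes) -> dict:
    """Return a record whose 'pid' binds prompt, time, and image together."""
    record = {
        "prompt": prompt,
        "timestamp": int(time.time()),
        "image_sha256": hashlib.sha256(image_bytes).hexdigest(),
    }
    msg = json.dumps(record, sort_keys=True).encode()
    record["pid"] = hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()
    return record

def verify_pid(record: dict) -> bool:
    """Only someone holding SECRET_KEY can confirm the record is authentic."""
    msg = json.dumps({k: v for k, v in record.items() if k != "pid"},
                     sort_keys=True).encode()
    expected = hmac.new(SECRET_KEY, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(record["pid"], expected)
```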
Thanks for this! A bit of prompt engineering know-how will help me get the most bang for the buck out of this beta. I also just want to say that dallery.gallery is delightfully clever naming.
Surprised by the lack of comments on the ethics of DALL-E being trained on artists’ content, whereas Copilot threads are chock full of devs up in arms over models trained on open source code. Isn’t it the same thing?
I recently talked with a concept artist about DALL-E and first thing they mentioned was "you know that's all stolen art, right?" Immediately made me think of GitHub Copilot.
However, the artists being featured in DALL-E's newsletters can't stop gushing about 'the new instrument they are learning how to play' and other such metaphors that are meant to launder what's going on.
My theory is that the professions most at-risk for automation are acting on their anxieties. Must not be a lot of freelance artists on HN, and a whole lot of programmers.
I think the artists have an even clearer case. I don't think GitHub Copilot is ready to steal anyone's job yet. But DALL-E is poised to replace all formerly commissioned filler art for magazines, marketing sites, and blogs. Now the only point to hiring a human is to say you hired a human. Our filler art is farm-to-table.
Having used Copilot for over a year now, it isn't there to replace programmers. It isn't called GitHub Pilot, and it doesn't do well at generating original ideas. If your job is to create sign-up forms in HTML then sure, it'll do your job in a second, but if you're creating more complex systems, Copilot is just there to help save you time when writing code (which is just implementing ideas).
Think of it like a set of powertools saving you time over manual tools.
I first read the artist's reply as "you know all art is stolen, right", which made more sense to me. If you look at the history of art, you'll see that it's true.
> My theory is that the professions most at-risk for automation are acting on their anxieties
That's not my problem with Copilot. I think tools and methods that can free humans from some amount of work are good in a correctly organized society. They have existed for a long time, too. They free up time for other stuff that can't be automated; this extra free time could theoretically give us more leisure or rest as well. I also trust myself to be able to learn another job if mine is ever automated.
But I don't want my work to be reused under terms I don't approve of. There are some things I don't want to help with my work, and this is reflected in the licenses I choose. I totally sympathize with artists who don't want their work to be reused in ways they don't like; I don't find this hard to understand. Nor do I find it hard to understand that an artist whose work you are supposed to pay to use is not happy with it being reused without payment. They should get paid a tiny bit for each generated piece if theirs is in the training set, and only if they approve this use. That would only be fair; the set would not be possible without those artists.
(Good for me, my personal code is not on GitHub for other, older, reasons)
This entire concept of AI learning using copyrighted works is going to be really tested in courts at some point, perhaps very soon, if not already.
However, if the result is adequately different, I don't see how it is different from someone viewing others' work, being "inspired", and creating something new. If you think about it, the vast majority of things are built on top of existing ideas.
Quite true. Best case, we're seeing DJ Spooky style culture jamming/remixing. But more likely it is as you write.
On the other hand, the market for stock photography was already decimated by the internet. Where previously skilled photographers would create libraries of images to exemplify various terms and sell these as stock, in the last decade or so, an art director with the aid of a search engine could rapidly produce similar results.
Of course. Because the majority of the tech bros on this site are self-centered and think of the arts as a lowly field deserving of no respect, while learning from something slightly resembling some boilerplate Lego code they wrote is a criminal act.
If you really want to learn, visit github.com. There are over 200 million freely available, open source code repositories for you to study and learn from.
Surely being surprised by the lack of comments on the ethics of DALL-E on HN is the same as the lack of comments on the ethics of Copilot on some artists' forum. I highly doubt you're going to find r/artists or whatever up in arms about Copilot, even if they are about DALL-E.
Well, I can't go ask Caravaggio or Gentileschi to paint my query, since they've been dead for hundreds of years. But being able to feed in a query containing much more modern concepts and get a baroque painting in that specific style is wonderful.
Plus what has already been said about a lot of art being an imitation/derivation of previous works.
It's because the furor over AI replicating human artists already played out over earlier AI iterations. Remember when thisfursonadoesnotexist.com was flamed for stealing furry art? Turns out that many artists shared an extremely generic style that the AI could easily replicate.
It feels like it would be good for this to not be a legal grey area. Whether it's considered a large copyright infringement conspiracy or a form of fair use, it would be good if the law reached a position on that sooner rather than later.
There are a few of those discussions going on in artists' circles these days. I imagine they'll get sued for doing this, but it'll probably take a very famous artist or a hell of a class action suit to make it happen.
> Preventing harmful images: We’ve made our content filters more accurate so that they are more effective at blocking images that violate our content policy — which does not allow users to generate violent, adult, or political content
What is defined as political content? Can I prompt DALL-E to draw “Fat Putin”?
Something I haven’t seen anyone talking about with these huge models: how do future models get trained when more content online is model-generated to start with? Presumably you don’t wanna train a model on autogenerated images or text, but you can’t necessarily know which is which.
https://en.m.wikipedia.org/wiki/Ouroboros
This precise thing is causing a funny problem in specialty areas. People are using e.g. Google Lens to identify plants, birds and insects, and it sometimes returns wrong answers: say it sees a picture of a Summer Tanager and calls it a Cardinal. If people then post "Saw this Cardinal" and the model picks up that picture/post and incorporates it into its training set, it's just reinforcing the wrong identification.
That's not really a new problem, though. At one point someone got some bad training data about an old Incan town, the misidentification spread, and nowadays we train new human models to call it Machu Picchu.
It's a cybernetic feedback system. DALL-E is used to create new images; the images that people find most interesting and noteworthy get shared online and reincorporated into the training data, but now filtered through human desire.
I wonder if human artists can demand that their work not be used for modelling. Then, while the robots are stuck using older styles for their creations, the humans will keep creating new styles of art.
One interesting comment about this is that some models actually benefit from being fed their own output. AlphaFold, for instance, was fed its own 'high likelihood' outputs (as Demis Hassabis described in his Lex Fridman interview).
Training on auto-generated images collected off the Internet is gonna be fine for a while, since the images surfacing will be curated (i.e. selected as good/interesting/valuable) still mostly by humans.
Getting humans to refine your data is the best solution right now, and many companies and researchers go with this approach.
> Getting humans to refine your data is the best solution right now
Source?
All those big models are trained with data for which the source is not known or vetted. The amount of data needed is not human-refinable.
For example, for language models we train mostly on subsets of CommonCrawl plus other sources. CommonCrawl data is “cleaned” by filtering out known bad sources and with some heuristics such as the ratio of text to other content, sentence length, etc.
The final result is a not-too-dirty but not clean huge pile of data that comes from millions of sources that no human has vetted and that no one on the team using the data knows about.
The same applies to large image datasets, e.g. LAION-400M, which also comes from CommonCrawl and is not curated.
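To make the "heuristics" concrete, here is a minimal sketch of the kind of page-level filtering described above; the thresholds are illustrative, not from any published pipeline:

```python
def keep_page(text: str, html_len: int) -> bool:
    """Crude CommonCrawl-style cleaning: keep pages that look like prose."""
    sentences = [s for s in text.split(".") if s.strip()]
    if not sentences:
        return False
    text_ratio = len(text) / max(html_len, 1)  # text vs. markup and boilerplate
    avg_words = sum(len(s.split()) for s in sentences) / len(sentences)
    return (
        text_ratio > 0.3            # page is mostly text, not markup
        and 5 < avg_words < 60      # sentences are prose-length
        and len(text.split()) > 50  # page isn't trivially short
    )
```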
But how would you know? A random string of text, or an image with the watermark removed, is going to be very hard to classify as generated or human-written.
I think with the terms requiring users to explicitly disclose which images/parts were generated, they could be filtered out to prevent a feedback loop of "generated in/generated out" images. I'm sure there will be some illegal/against-the-terms use cases there, but the majority should represent fair use.
I fully expect stock image sites to be swamped by DALL-E generated images that match popular terms (e.g. "business person shaking hands"). Generate the image for $0.15. Sell it for $1.00.
DALLE images are still only 1024 px wide. Which has its uses, but I don’t think the stock photo industry is in real danger until someone figures out a better AI superresolution system that can produce larger and more detailed images.
You can obtain any size by using the source image with the masking feature. Take the original and shift it, then mask out part of the scene and re-run. Sort of like a patchwork quilt, it will build variations of the masked areas with each generation.
Once the API is released, this will be easier to do in a programmatic fashion.
Note: Depending on how many times you do this... I could see there being a continuity problem with the extremes of the image (eg: the far left has no knowledge of the far right). An alternative could be to scale the image down and mask the borders then later scale it back up to the desired resolution.
This scale and mask strategy also works well for images where part of the scene has been clipped that you want to include (EG: Part of a character's body outside the original image dimensions). Scale the image down, then mask the border region, and provide that to the generation step.
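For anyone who wants to script the patchwork trick once an API exists, here's a sketch of the image prep with PIL. `generate_edit` and `stitch` are hypothetical stand-ins for an inpainting endpoint and a compositing step; only the shift-and-mask part is real code:

```python
from PIL import Image

TILE = 1024     # DALL-E 2 output size
OVERLAP = 256   # how much existing content the model gets to see

def shift_right(canvas: Image.Image):
    """Build the next tile: keep the right edge of the canvas as context,
    leave the rest transparent for the model to fill in."""
    tile = Image.new("RGBA", (TILE, TILE), (0, 0, 0, 0))
    right_edge = canvas.crop((canvas.width - OVERLAP, 0, canvas.width, TILE))
    tile.paste(right_edge, (0, 0))
    # Mask: opaque where content is kept, blank where it is to be generated.
    mask = Image.new("L", (TILE, TILE), 0)
    mask.paste(255, (0, 0, OVERLAP, TILE))
    return tile, mask

# canvas = Image.open("start.png").convert("RGBA")
# tile, mask = shift_right(canvas)
# new_tile = generate_edit(image=tile, mask=mask, prompt="...")  # hypothetical
# canvas = stitch(canvas, new_tile, OVERLAP)                     # hypothetical
```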
Another commenter mentioned Topaz AI upscaling, and Pixelmator has the "ML Super Resolution" feature; both work remarkably well IMO. There are a number of drop-in and system default resolution enhancement processes that work in a pinch, but the quality is lacking compared to the commercial solutions. There are still some areas where DALL-E 2 is lacking in realism, but anyone handy with photo editing tools could amend those shortcomings fairly quickly.
On-demand stock photo generation probably is the next step, particularly when combined with other free media services (Unsplash immediately comes to mind). Simply choose a "look" or base image, add contextual details, and out pops a 1 of 1 stock photo at a fraction of the cost of standard licensing. It'll be very exciting seeing what new products/services will make use of the DALL-E API, how and where they integrate with other APIs, use cases, value adds like upscaling and formatting, etc.
https://apps.apple.com/us/app/waifu2x/id1286485858
I paid extra to get the higher quality model using the in-app purchase option. It crushes the phone's battery life, but runs in only ~10 seconds on an iPhone 13 Pro for a single 1000x1000 input image.
Makes me imagine stock image sites in the near future. Where your search term ("man looks angrily at a desktop computer") gets a generated image in addition to the usual list of stock photos.
Maybe it would be cheaper. I imagine it would one day. And maybe it would have a more liberal usage license.
At any rate, I look forward to this. And I look forward to the inevitable debates over which is better: AI generation or photographer.
In my experience it doesn’t require that much cherry-picking if you use a carefully crafted prompt. For example: “A professional photography of a software developer talking to a plastic duck on his desk, bright smooth lighting, f2.2, bokeh, Leica, corporate stock picture, highly detailed”
And this is the first picture I got: https://labs.openai.com/s/lSWOnxbHBYQAtli9CYlZGqcZ
It went a bit strong on the depth of field and I don’t like the angle, but I could iterate a few times and get a good one.
If the price is low enough, you can have humans rank generated images (maybe using Mechanical Turk or a similar service), and from that ranking choose only the highest quality DALL-E generated images.
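A minimal sketch of that selection step, assuming the ratings have already been collected (e.g. from Mechanical Turk) as (image_id, score) pairs:

```python
from collections import defaultdict

def top_images(ratings, keep_fraction=0.1):
    """Average each image's human scores and keep the best slice."""
    scores = defaultdict(list)
    for image_id, score in ratings:
        scores[image_id].append(score)
    ranked = sorted(scores, key=lambda i: sum(scores[i]) / len(scores[i]),
                    reverse=True)
    return ranked[: max(1, int(len(ranked) * keep_fraction))]

# top_images([("a", 5), ("a", 4), ("b", 2), ("c", 5)]) -> ["c"]
```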
So what's the loss? It's not like stock photos are the highest art form. Sure, for some people it means they need to change their business model, but for all those just needing pictures to illustrate something, the process will be much smoother.
There has been trouble with generating lifelike eyes, but a second pass with a model tuned for making realistic faces has been very successful at fixing that.
One of the commercial use cases this post mentions is authors who want to add illustrations to children's stories.
I wonder if there is a way for DALL-E to generate a character, then persist that character over subsequent runs. Otherwise, it would be pretty difficult to generate illustrations that depict a coherent story.
Example ...
Image 1 prompt: A character named Boop, a green alien with three arms, climbs out of its spaceship.
Image 2 prompt: Boop meets a group of three children and shakes hands with each one.
You can't do that. I can't see this working well for children's book illustrations unless the story was specifically tailored in a way that makes continuity of style and characters irrelevant.
As an aside, Ursula Vernon did pretty well under the constraint you described. She set a comic in a dreamscape and used AI to generate most of the background imagery: https://twitter.com/UrsulaV/status/1467652391059214337
It's not the "specify the character positions in text" proposed, but still a neat take on using this sort of AI for art.
You mean just generate a single large image with all the stuff you want for the whole story, and then use cropping and inpainting to get only the piece you want for each page?
Then use inpainting to only preserve that pose and generate new content around it. It’s definitely not perfect.
Then put that at the side of a transparent image, and use as the prompt, "Two identical aliens side by side. One is jumping"
Wait until someone trains a model like this, for porn.
There seems to be a post-DALL-E obscenity detector on OpenAI's tool; so far I've found it to be entirely robust against deliberate typos designed to avoid simple 'bad word lists'.
Ask it for a "pruple violon" and you get purple violins... you get the idea.
"Metastable" prompts that may or may not generate obscene results (content with nudity, guns, or violence, as I've found) sometimes show non-obscene generations, and sometimes trigger a warning.
Why do that? Just refusing to run my query is sufficient. Who is harmed if I continue to bang my head against that wall?
I’ve thought about this and in fact porn generation sounds like a good thing?? It ensures that it’s victimless. Of course, there is a problem with generation of illegal (underage) porn but other than this, I think it could be helpful for this world.
If all of the child porn industry switched to generated images they'd still be horrible people but many kids would be saved from having these pictures taken. So a commercial model should certainly ban it, but I don't think it's the biggest thing we have to worry about.
If I had to guess, I'd bet they have a supervised classifier trained to recognize bad content (violence, porn, etc.) that they use to filter the generated images before passing them to the user, on top of the bad-word lists.
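If that guess is right, the serving path would look something like this sketch; `classify` is a stand-in for whatever supervised model they actually use, which isn't public:

```python
def filter_generations(images, classify, threshold=0.5):
    """classify(image) -> estimated probability the image violates policy."""
    safe, blocked = [], []
    for img in images:
        (blocked if classify(img) >= threshold else safe).append(img)
    return safe, blocked
```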
This is mentioned, "content filters" are "blocking images that violate our content policy — which does not allow users to generate violent, adult, or political content, among other categories" and they "limited DALL·E’s exposure to these concepts by removing the most explicit content from its training data."
I suspect it's more a business restriction than a moral one. If OpenAI allows people to make porn with these tools, people will make a ton of it. OpenAI will become known as "the company that makes the porn-generating AIs," not "the company that keeps pushing the boundaries of AI." Being known as the porn-ai company is bad for business, so they restrict it.
This was really funny :)
http://dailywrong.com/man-finally-comfortable-just-holding-a...
"He's a hate-fuelled neurotic farmboy searching for his wife's true killer. She's a tortured insomniac snake charmer from a family of eight older brothers. They fight crime!"
https://theyfightcrime.org/
Here's an implementation in Perl.
http://paulm.com/toys/fight_crime.pl.txt
>He's an unconventional gay paranormal investigator moving from town to town, helping folk in trouble. She's a violent motormouth wrestler from the wrong side of the tracks. They fight crime!
>He's a Nobel prize-winning sweet-toothed rock star who believes he can never love again. She's a strong-willed communist widow with a knack for trouble. They fight crime!
>He's an obese white trash barbarian with a secret. She's a virginal thirtysomething traffic cop with the power to bend men's minds. They fight crime!
Easiest way for free SSL would be to just throw the domain on CloudFlare :)
lol
Hot dang. Some Reddit subs can be auto-generated now.
https://www.reddit.com/r/SubSimulatorGPT2/
But yeah, now it will have generated images too.
http://dailywrong.com/wp-content/uploads/2022/07/DALL%C2%B7E...
http://dailywrong.com/wp-content/uploads/2022/07/DALL%C2%B7E...
Can DALL-E render Bat Boy?
Thank you! Bookmarked!
Why do the images load so slow though?
What are the resources needed to train this model?
If someone just gave you the model for free, what resources would you need to use it to generate new results?
Multimodal.art (https://multimodal.art/) is working on a free version of something like DALLE, though it's not that good as of yet.
This belongs on /r/linkedinlunatics
As long as DALL-E isn't caught painting out a 1-to-1, reverse-searchable copy of an image, it's not really as bad as Copilot, IMO.
The issue isn't just that Copilot is trained on my GPL code; it's that it might decide to copy-paste lines from it, including my comments, etc.
CLIP was trained on 400,000,000 images; GPT is roughly 180B tokens, and at ~1.5 tokens per word, that's about 120,000,000,000 words.
Heck: if the cost of entry is low enough, they might do it at a loss and take over the site.
Unless I'm missing something, these seem pretty darn good.