tbalsam · 4 years ago
Good to see LucidRains get the love he (rightly) deserves. He's a beast!

As a thank-you to him -- he also does work for commission etc.; check his GitHub page for more info. I'm not fiscally or otherwise directly linked to him; I've just hung around a while and think he deserves far more credit than he gets. This is literally the smallest slice of the pie of what the man does across several subdisciplines, so send him a thank-you, please, if possible!

emadm · 4 years ago
Second this, one of the biggest contributors to open source AI and someone everyone should sponsor.
TekMol · 4 years ago
What is the reason Google published their research details about Imagen?

Why don't they just keep their findings to themselves and build products on top of them?

Public companies can't do stuff just for the fun of it, right? So there must be some commercial reasoning behind it?

nmfisher · 4 years ago
Two reasons:

1) Even though it's all technically very impressive, so far there's not a huge amount of commercialization potential here. OpenAI is charging for its GPT-3 model, but its revenue is probably negligible next to the hardware costs (sunk + ongoing) of training it in the first place, let alone the researcher salaries they're paying.

2) Most of the stunning examples are cherry-picked. These things fail much more often than the companies are willing to admit, and they're probably assuming (correctly) that not enough people are willing to pay for something that only sorta-kinda works a third of the time, when you're holding it the right way.
adamhowell · 4 years ago
I'm currently working full-time on the AI-powered design suite Accomplice (https://accomplice.ai), and if you ask me on a good day, I'd tell you there's already huge commercial potential. On a bad day, though ;)

My current approach is a "model marketplace" (https://accomplice.ai/models) where the most popular open source text-to-image models (VQGAN+CLIP, Disco Diffusion, DALL-E Mega coming soon…) sit alongside the most popular open source style transfer models, and finally users can finetune their own models using a simple drag-and-drop tool (https://accomplice.ai/no-code-model-training).

With this approach, a user has enough models to try or train that they can get a higher hit rate. For example, Accomplice currently has finetuned models for photorealistic people (https://accomplice.ai/models/f58bfa91-bb18-406f-a0e1-db00fcf...), watercolor backgrounds (https://accomplice.ai/models/91b8a080-faca-4ff4-8b11-64b0789...), etc.

So theoretically, if there were a searchable marketplace of hundreds of different finetuned models to choose from, people would use it much like iStockPhoto, creating the kind of images they want instead of just downloading them.

But it's of course a constant work in progress. Slowly growing, though, with lots of promising stuff ahead!

urthor · 4 years ago
It's also because these companies hire away staff all the time.

Rule 1 of working in this field: recruit a mid-level member of the other company's research group twice a year to get the latest gossip.

It's impossible to keep a one-page-or-shorter "algorithm" secret when the creators are geniuses who hop jobs every year or so.

Fellas with more IQ points than games in a baseball season just don't forget.

gfodor · 4 years ago
Seems exactly wrong. DALL-E 2 looks set to end much of the illustration industry, and if the endless stream of Twitter posts from early adopters is any indication, it works great.
minimaxir · 4 years ago
Publishing high impact research gives credibility to the ML teams, which helps recruiting and prestige.

It's less cynical, more incentive alignment.

hackernewds · 4 years ago
Also good for society. Less evil, more nice people. Respect Google for these traits.
dotnet00 · 4 years ago
I think there are three main reasons why:

- being open is kind of just how things generally work in ML right now; it's in stark contrast to fields like chemistry or physics, where paywalls are pretty common

- it's a matter of clout. ML is moving ridiculously quickly, with work from just 5 years ago considered outdated in terms of capability; if you don't publish, someone else will, and they'll get the credit. This likely also matters for the researchers, since they get credit too. In a sense this is just academia's publish-or-perish culture.

- it's also somewhat about hiring, which is related to the clout. By putting out this kind of research, they attract talented engineers to consider working for them. This is of course pretty relevant to the rest of their business, especially given how heavily Google leans on AI to handle moderation.

nl · 4 years ago
> Public companies can't do stuff just for the fun of it, right? So there must be some commercial reasoning behind it?

Yes they can actually.

If you are a shareholder you can either sue (unlikely to succeed) or vote against the board. That's pretty much the only recourse.

pmoriarty · 4 years ago
"If you are a shareholder you can either sue (unlikely to succeed) or vote against the board. That's pretty much the only recourse."

Or you could sell or just threaten to sell your shares.

Buying more of the company's shares to take it over is another option.

Good luck doing that with Google.

blueblob · 4 years ago
I think it's because the proprietary part for them is the data, not the particular algorithm. They benefit from other people making advances on their techniques because they have the data to extract more value from those advances than anyone else. If they kept the technique to themselves, they'd get no "free" advancement. So they trade away the secret of the technique in the hope that others will advance it, making their data more valuable.
curiousgal · 4 years ago
What would the product be? My experience is that most ML papers are very brittle: for every cool result/example you see, there's a plethora of nonsense spat out by the models.
pyinstallwoes · 4 years ago
Absolute power corrupts absolutely. Perhaps it's a game-theoretic approach to leveling the playing field.
vintermann · 4 years ago
This is an important question.

You're saying it's Google who did this research. In a way that's true. But really it was Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily Denton, Seyed Kamyar Seyed Ghasemipour, Burcu Karagol Ayan, S. Sara Mahdavi, Rapha Gontijo Lopes, Tim Salimans, Jonathan Ho, David J Fleet, and Mohammad Norouzi who did it, with material support from Google.

It's likely that some or all of these people would have refused to do the work they do if Google kept it all as their secret sauce.

Moreover, there are excellent reasons why they wouldn't want to. It's not just the obvious one, that if it were all secret, they couldn't use it for career advancement outside Google. It's also that research without the freedom to talk is far more difficult and frustrating.

On paper, scientific papers are supposed to document the whole of a discovery or innovation. So you might think that an insider who got to read all the secret Google research papers AND all the public ones would have an advantage. But the problem is that even the best-written papers, with full code and comments, inevitably leave things out, especially of the "why this and not that" type.

If you're a researcher in the free world, you can just ask. Especially if you have a public track record of great papers yourself, they will WANT to talk to you. You can learn so much more from an interactive back-and-forth of questions than from a static piece of information like a scientific paper.

If you work for a secretive, command-driven organization, you need to be careful about what you reveal of your own research when you ask; you can't talk freely. The thought of having to justify your communication to some old-school-IBM-lawyer type is going to chill even the most enthusiastic researcher. It's easier to just stay in your own corporate bubble and focus on the things your corporation does well, since at least you can talk freely to your colleagues (although in really paranoid organizations like the NSA or old IBM, even that may not be true). But then at best you specialize, and at worst you fall behind.

dekhn · 4 years ago
Google started publishing for several reasons, but the primary one was recruitment (showing off was a secondary goal). The MapReduce, GFS, and Bigtable papers played an important role in attracting an early generation of distributed-computing/high-performance-computing people from around the Valley and from CMU/MIT, who helped build the second, really successful versions of the web search engine (retrieval and ranking), ads serving (the auction, the logs-joining pipeline), etc.

The other reason is that the leaders at Google at the time believed that we would achieve the singularity faster if Jeff Dean periodically sent ideas back 10 years in time to Doug Cutting.

frozenport · 4 years ago
1. This finding is hard to monetize, especially at the ROI Google typically expects (for example, an app that makes $500 a month isn't worth it)

2. Deploying models in a cost-effective way is hard

3. Lessons learned from building this model can indeed be monetized, and many of them may be kept secret.

toxik · 4 years ago
It’s simply flag planting. If they don’t do it, some other player will.
yaroslavvb · 4 years ago
Researchers like to talk about and show off their work outside the company. If you don't let them, they get unhappy and leave.
quickthrower2 · 4 years ago
It could be classed as a way to attract talent. It's basically a very expensive foosball table, then.
29athrowaway · 4 years ago
Because the competitive advantage is also in the training datasets and ML infrastructure.
mzs · 4 years ago
You realize this doesn't include the training data, right?
DeathArrow · 4 years ago
They did it out of goodness of their hearts. :)
dang · 4 years ago
Recent and related:

DALL-E 2 open source implementation - https://news.ycombinator.com/item?id=31228710 - May 2022 (152 comments)

Also:

X-Transformers: A fully-featured transformer with experimental features - https://news.ycombinator.com/item?id=27089208 - May 2021 (37 comments)

Text to Image Generation - https://news.ycombinator.com/item?id=26615791 - March 2021 (88 comments)

bufferoverflow · 4 years ago
How much would it cost to train something like this? Is there even a good dataset for it?
Yenrabbit · 4 years ago
There is a dataset of 5 billion image-text pairs (LAION-5B) scraped by various parties. This can be filtered and used to train these models. Cost is a bit of an issue, but there are orgs that have provided compute for open model training. And Imagen is nice because the text-encoder part (a frozen, pretrained language model) is already available and doesn't need more training, so only the diffusion-model components would be trained. I'd guess we'll see a biggish training run starting in a few weeks.
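
For a concrete sense of what that training loop could look like, here's a minimal sketch against lucidrains' imagen-pytorch (the exact constructor arguments and the ImagenTrainer API are assumptions based on that repo's README and may have drifted):

    import torch
    from imagen_pytorch import Unet, Imagen, ImagenTrainer

    # base 64x64 U-Net; text conditioning comes from a frozen,
    # pretrained T5 encoder, so only the U-Net weights get trained
    unet = Unet(
        dim = 32,
        cond_dim = 512,
        dim_mults = (1, 2, 4, 8),
        num_resnet_blocks = 3,
        layer_attns = (False, True, True, True),
    )

    imagen = Imagen(
        unets = (unet,),
        text_encoder_name = 't5-large',  # frozen; never updated
        image_sizes = (64,),
        timesteps = 1000,
        cond_drop_prob = 0.1,  # enables classifier-free guidance
    )

    trainer = ImagenTrainer(imagen)

    # stand-in batch; a real run would stream (image, caption)
    # pairs from a filtered LAION shard
    images = torch.randn(4, 3, 64, 64)
    texts = ['a photo of a corgi'] * 4

    loss = trainer(images, texts = texts, unet_number = 1)
    trainer.update(unet_number = 1)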
visarga · 4 years ago
I hope so. It's a bit cruel to show off and then lock it away.
dharma1 · 4 years ago
When you say a bit of an issue, how much are we talking?
webmaven · 4 years ago
Are there prefiltered derivatives of Laion-5B available? I can imagine various contraindicated categories you might want to avoid entirely, as well as biases you might want to adjust for by balancing classes in the data (5 billion images gives you a lot of room to balance the dataset).
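
For concreteness, here's the kind of prefiltering I mean. LAION ships its metadata as parquet files, so a DIY pass is approachable; a rough pandas sketch, where the filename and column names such as punsafe and pwatermark are assumptions based on the LAION release notes (check the actual schema first):

    import pandas as pd

    # one metadata shard of LAION-5B (hypothetical filename)
    df = pd.read_parquet('laion5b-meta-part-00000.parquet')

    # drop rows flagged as likely NSFW or watermarked;
    # punsafe / pwatermark are model-predicted probabilities
    df = df[(df['punsafe'] < 0.1) & (df['pwatermark'] < 0.5)]

    # keep only reasonably large, roughly square images
    ratio = df['WIDTH'] / df['HEIGHT']
    df = df[(df['WIDTH'] >= 256) & (df['HEIGHT'] >= 256) & ratio.between(0.75, 1.33)]

    df.to_parquet('laion5b-filtered.parquet')

Class balancing would be a second pass on top of this, sampling per category from whatever labels you can attach.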
srcreigh · 4 years ago
Who will do the training? Will the results be made publicly available? Why won't it start until a few weeks from now?
dirtyid · 4 years ago
I feel like the fact that big porn hasn't poached talent and jumped all over this suggests costs of at least tens of millions. That said, some for-profit, no-rules deepfake service for disinformation and illegal content has to be in the works.
tomatowurst · 4 years ago
There's a company in Montreal that makes that in a month, and it also has access to copious amounts of said data on its servers. It may or may not be that they already have engineers on it. We have no way of knowing, since it's a private company.
jonathankoren · 4 years ago
Of course the implementation isn't the issue. It's the training data and the compute.

Open source is pretty meaningless here.

jfoster · 4 years ago
In fact, this kind of reverses things, doesn't it?

Open source is built on the assumption that you can do more with source code than with binaries. In the case of AI models, the trained weights are what's valuable, and the source code used to produce them is less useful.

aero-glide2 · 4 years ago
How much would it cost in training to match DALL-E 2?
emadm · 4 years ago
Imagen actually used our open LAION-400M dataset plus 400M images of their own.

On compute, there is more than enough available to open source now via LAION and EleutherAI to train these models; it will just take a bit of time.

29athrowaway · 4 years ago
I guess the stock photo industry will be disrupted by this.
quickthrower2 · 4 years ago
r/wtfstockphotos will be full of this.

And Medium posts about React hooks and Go generics are gonna be full of "Animated Drake meme but with Rick Astley" kind of fun.

karmasimida · 4 years ago
How big is Getty? I think this is a billion-dollar business.
benstrumental · 4 years ago
Definitely. With the right training set, video game assets too. Imagine being able to generate N variations of any asset in any video game style...
seydor · 4 years ago
They can make more profit if they don't need to pay artists
srvmshr · 4 years ago
Jesus! Most of us have GitHub contribution graphs that look like Midwest cornfields. LucidRains' looks like downtown Manhattan.

https://skyline.github.com/lucidrains/2021

throwaway12245 · 4 years ago
I did the pip installs and installed CUDA. I changed the prompt and ran the sample code. It ran to completion. How do I save the image from trainer.sample?
visarga · 4 years ago
You'd have to train it first.
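
Once you do have trained weights, though, the output should just be a batch of image tensors you can write out with torchvision. A sketch, assuming trainer.sample returns a (batch, channels, height, width) float tensor in [0, 1], as in lucidrains' imagen-pytorch:

    from torchvision.utils import save_image

    images = trainer.sample(texts = ['a whale breaching at sunset'], cond_scale = 3.)

    # save_image expects floats in [0, 1] and writes a PNG per tensor
    for i, img in enumerate(images):
        save_image(img, f'sample_{i}.png')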