johnfn · 2 years ago
Very impressive! I noticed two really notable things right off the bat:

1. I asked it a question about a feature that TypeScript doesn't have[1]. GPT4 usually does not recognize that it's impossible (I've tried asking it a bunch of times, it gets it right with like 50% probability) and hallucinates an answer. Gemini correctly says that it's impossible. The impressive thing was that it then linked to the open GitHub issue on the TS repo. I've never seen GPT4 produce a link, other than when it's in web-browsing mode, which I find to be slower and less accurate.

2. I asked it about Pixi.js v8, a new version of a library that is still in beta and was only posted online this October. GPT4 does not know it exists, which is what I expected. Gemini did know of its existence, and returned results much faster than GPT4 browsing the web. It did hallucinate some details, but it correctly got the headline features (WebGPU, new architecture, faster perf). Does Gemini have a date cutoff at all?

[1]: My prompt was: "How do i create a type alias in typescript local to a class?"
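(For context, a minimal sketch of the commonly suggested workaround — the names `Token` and `Parser` here are illustrative, not from the thread: TypeScript rejects `type` declarations inside a class body, so the alias has to live at module scope; left unexported, it stays private to the file, which is as close to "local to a class" as the language currently gets.)

```typescript
// Invalid -- does not compile:
// class Parser {
//   type Token = string | number; // error: 'type' declarations are not allowed here
// }

// Workaround: a module-scope alias. Not exported, so it is only visible
// within this file.
type Token = string | number;

class Parser {
  tokens: Token[] = [];

  push(t: Token): void {
    this.tokens.push(t);
  }
}

const p = new Parser();
p.push("if");
p.push(42);
console.log(p.tokens.length); // 2
```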

zamalek · 2 years ago
The biggest advantage of Bard is the speed, it's practically instant.

I asked: How would I go about creating a sandbox directory for a subordinate user (namespaced user with subuid - e.g. uid 100000), that can be deleted as the superior user (e.g. uid 1000)? I want this to be done without root permissions.

Both said that it's impossible, which is the generally accepted answer.

I then added: I don't care about data loss.

Bard correctly suggested mounting a filesystem (but didn't figure out that tmpfs would be the one to use). ChatGPT suggested using the sticky bit, which would make the situation worse.

Handing this one to Bard, especially given that it generated more detailed answers much faster.

stefandesu · 2 years ago
> How would I go about creating a sandbox directory for a subordinate user (namespaced user with subuid - e.g. uid 100000), that can be deleted as the superior user (e.g. uid 1000)? I want this to be done without root permissions.

Off topic, but it feels so weird that this is not possible. I've run into this with rootless Docker recently.

nazka · 2 years ago
If you ever try it on Gemini let me know I am curious.
simonebrunozzi · 2 years ago
> The biggest advantage of Bard is the speed, it's practically instant.

They probably have less than 1% of OpenAI's users. That helps.

jazzyjackson · 2 years ago
> (namespaced user with subuid - e.g. uid 100000), that can be deleted as the superior user (e.g. uid 1000)

I'm afraid I don't know what this means. That when you delete uid 1000, uid 100000 also gets deleted? Or, only user 1000 has permission to delete user 100000 ?

jafitc · 2 years ago
Bard currently is a GPT-3.5-class model. Of course it's faster than GPT-4. (YMMV on specific examples of course)
vineyardmike · 2 years ago
> Does Gemini have a date cutoff at all?

Not sure about Gemini specifically (it’s so new!) but Google has previously said that bard is updated daily with current news and information.

Obviously Google has potential advantages being able to lean into their indexes so the raw model doesn’t need to embed/train against things like GitHub issues. I wonder if we’ll see LLM-optimized websites with built-in prompts to replace SEO websites.

jimmyl02 · 2 years ago
from what I remember bard should be able to browse the internet and write code internally to better answer queries. I feel like these abilities are just improved with Gemini as a better language model.
mahouk · 2 years ago
> Does Gemini have a date cutoff at all?

You can ask it. Here's what it responded with:

> "Do you mean to ask if I have a cutoff date for the data I was trained on? If so, the answer is yes. My training data includes text and code from various sources, and the most recent data I was trained on was from July 2023."

johnfn · 2 years ago
But that can't be true - the thing I asked it about pixi.js is much more recent than that (October+).
JohnFen · 2 years ago
Is that true, though? I wouldn't trust anything an AI told me without verifying it elsewhere first.
armcat · 2 years ago
Those impressive demos, e.g. the cup shuffling, seem to have been "staged". The end results are correct, but the method of getting them is nowhere near as fluid and elegant as in the demo. They used a series of still images with carefully crafted prompts. More info: https://developers.googleblog.com/2023/12/how-its-made-gemin...
johnfn · 2 years ago
You don't seem to be responding to my post. I talked about two things I prompted Bard with, not the video.
FrustratedMonky · 2 years ago
Everyone fudges demo's, but does seem like Google fumbles them, they backfire.

When Bard 'hallucinates', their stock tanks.

When GPT 'hallucinates', it's all good.

This latest fumble does look pretty bad. A fudge too far.

lobocinza · 2 years ago
I use GPT with the custom instruction "provide references where it makes sense" and it frequently provides links which most of the time are accurate. A good prompt does wonders. My GPT-3.5 output below. It doesn't give a correct answer but provides a link that makes sense.

Q: How do I create a type alias in typescript local to a class? Provide references where it makes sense.

A: In TypeScript, you can create a type alias within a class using the type keyword. Here's an example:

  class MyClass {
    type MyAlias = string | number;

    myVariable: MyAlias;

    myFunction(param: MyAlias): void {
      // ...
    }
  }

In the example above, MyAlias is a type alias created within the MyClass class. It is defined as either string or number.

Note that the type alias is accessible within the class, but it won't be available outside of the class's scope.

For more information on type aliases in TypeScript, you can refer to the official TypeScript documentation: [Type Aliases - TypeScript Handbook](https://www.typescriptlang.org/docs/handbook/advanced-types....)

jari_mustonen · 2 years ago
Nice. Can you tell your prompt tricks in more detail?
JCharante · 2 years ago
> I've never seen GPT4 produce a link, other than when it's in web-browsing mode, which I find to be slower and less accurate.

Really? I've been using gpt4 since about April and it used to very often create links for me. I'll tell it hey I want to find a company that does X in Y city and it generates 5 links for me, and at least one of them is usually real and not hallucinated

dingnuts · 2 years ago
It's amazing to me how low the bar is for AI to impress people. Really, 80% of the links were hallucinated, and that's somehow more useful than Kagi for [checks notes] finding real links?

Can you imagine if you did a search on Google and 80% of the results weren't even real websites? We'd all still be using AltaVista!

What on earth kind of standard is "1/5 results actually exist!" -- no comment on whether the 1/5 real results is even relevant. My guess: the real links are usually irrelevant.

johnfn · 2 years ago
OK, maybe "never" is strong, but I've never seen ChatGPT say "This is not a feature that exists, but here's the open issue". And I've asked ChatGPT about a good many features that don't exist.
isaacfrond · 2 years ago
I have the impression that something was tweaked to reduce the likelihood of generating links. It used to be easy to get GPT to generate links. Just ask it to produce a list of sources. But it doesn't do that anymore.
miraculixx · 2 years ago
Not sure what you tried, but it's not the new model. It hasn't been released, just "release announced".
johnfn · 2 years ago
From the article:

> Starting today, Bard will use a fine-tuned version of Gemini Pro for more advanced reasoning, planning, understanding and more.

Additionally, when I went to Bard, it informed me I had Gemini (though I can't find that banner any more).

imranq · 2 years ago
I think Gemini Pro is in bard already? So that's what it might be. A few users on reddit also noticed improved Bard responses a few days before this launch
jafitc · 2 years ago
the new model is live
jbkkd · 2 years ago
I asked it and ChatGPT about a gomplate syntax (what does a dash before an if statement do).

Gemini hallucinated an answer, and ChatGPT had it right.

I followed up, and said that it was wrong, and it went ahead and tried to say sorry and come up with two purposes of a dash in gomplate, but proceeded to only reply with one purpose.

m3at · 2 years ago
For others that were confused by the Gemini versions: the main one being discussed is Gemini Ultra (which is claimed to beat GPT-4). The one available through Bard is Gemini Pro.

For the differences, looking at the technical report [1] on selected benchmarks, rounded score in %:

Dataset | Gemini Ultra | Gemini Pro | GPT-4

MMLU | 90 | 79 | 87

BIG-Bench-Hard | 84 | 75 | 83

HellaSwag | 88 | 85 | 95

Natural2Code | 75 | 70 | 74

WMT23 | 74 | 72 | 74

[1] https://storage.googleapis.com/deepmind-media/gemini/gemini_...

Traubenfuchs · 2 years ago
formatted nicely:

  Dataset        | Gemini Ultra | Gemini Pro | GPT-4
  MMLU           | 90           | 79         | 87
  BIG-Bench-Hard | 84           | 75         | 83
  HellaSwag      | 88           | 85         | 95
  Natural2Code   | 75           | 70         | 74
  WMT23          | 74           | 72         | 74

teleforce · 2 years ago
Excellent comparison, it seems that GPT-4 is only winning in one dataset benchmark namely HellaSwag for sentence completion.

Can't wait to get my hands on Bard Advanced with Gemini Ultra, I for one welcome this new AI overlord.

carbocation · 2 years ago
I realize that this is essentially a ridiculous question, but has anyone offered a qualitative evaluation of these benchmarks? Like, I feel that GPT-4 (pre-turbo) was an extremely powerful model for almost anything I wanted help with. Whereas I feel like Bard is not great. So does this mean that my experience aligns with "HellaSwag"?
nathanfig · 2 years ago
Thanks, I was looking for clarification on this. Using Bard now does not feel GPT-4 level yet, and this would explain why.
dkarras · 2 years ago
Not even original ChatGPT level; it is a hallucinating mess still. Did the free Bard get an update today? I am in the included countries, but it feels the same as it has always been.
tiziano88 · 2 years ago
Permanent link to the result table contents: https://static.space/sha2-256:ea7e5d247afa8306cb84cbbd4438fd...

make3 · 2 years ago
the numbers are not at all comparable, because Gemini uses 32-shot chain-of-thought and variable shot vs 5-shot for GPT-4. This is very deceptive of them.
bitshiftfaced · 2 years ago
Yes and no. In the paper, they do compare apples to apples with GPT4 (they directly test GPT4's CoT@32 but state its 5-shot as "reported"). GPT4 wins 5-shot and Gemini wins CoT@32. It also came off to me like they were implying something is off about GPT4's MMLU.
dfbrown · 2 years ago
How real is it though? This blog post says

In this post, we’ll explore some of the prompting approaches we used in our Hands on with Gemini demo video.

which makes it sound like they used text + image prompts and then acted them out in the video, as opposed to Gemini interpreting the video directly.

https://developers.googleblog.com/2023/12/how-its-made-gemin...

riscy · 2 years ago
After reading this blog post, that hands-on video is just straight-up lying to people. For the boxcar example, the narrator in the video says to Gemini:

> Narrator: "Based on their design, which of these would go faster?"

Without even specifying that those are cars! That was impressive to me, that it recognized the cars are going downhill _and_ could infer that in such a situation, aerodynamics matters. But the blog post says the real prompt was this:

> Real Prompt: "Which of these cars is more aerodynamic? The one on the left or the right? Explain why, using specific visual details."

They narrated inaccurate prompts for the Sun/Saturn/Earth example too:

> Narrator: "Is this the right order?"

> Real Prompt: "Is this the right order? Consider the distance from the sun and explain your reasoning."

If the narrator actually read the _real_ prompts they fed Gemini in these videos, this would not be as impressive at all!

crdrost · 2 years ago
Yeah I think this comment basically sums up my cynicism about that video.

It's that, you know some of this happened and you don't know how much. So when it says "what the quack!" presumably the model was prompted "give me answers in a more fun conversational style" (since that's not the style in any of the other clips) and, like, was it able to do that with just a little hint or did it take a large amount of wrangling "hey can you say that again in a more conversational way, what if you said something funny at the beginning like 'what the quack'" and then it's totally unimpressive. I'm not saying that's what happened, I'm saying "because we know we're only seeing a very fragmentary transcript I have no way to distinguish between the really impressive version and the really unimpressive one."

It'll be interesting to use it more as it gets more generally available though.

calvinv · 2 years ago
It's always like this, isn't it. I was watching the demo and thought: why ask it what duck is in multiple languages? Siri can do that right now, and it's not an AI model. I really do think we're getting there with the AI revolution, but these demos are so far from exciting; they're just mundane dummy tasks that don't have the nuance of the things we really interact with and would need help from an AI with.
huytersd · 2 years ago
How do you know though? The responses in the video were not the same as those in the blog post.
ACS_Solver · 2 years ago
To quote Gemini, what the quack! Even with the understanding that these are handpicked interactions that are likely to be among the system's best responses, that is an extremely impressive level of understanding and reasoning.
CamperBob2 · 2 years ago
Calls for a new corollary to Clarke's Third Law. "Any sufficiently-advanced rigged demo is indistinguishable from magic."
quackery1 · 2 years ago
Does it really need to have affectations like "What the quack!"? These affectations are lab grown and not cute.
spaceman_2020 · 2 years ago
What would be Gemini's current IQ? I would suspect it's higher than the average human's.
spaceman_2020 · 2 years ago
I'm legitimately starting to wonder what white collar workers will even do in 5-10 years.

This just Year 1 of this stuff going mainstream. Careers are 25-30 years long. What will someone entering the workforce today even be doing in 2035?

VirusNewbie · 2 years ago
Even if we get Gemini 2.0 or GPT-6 that is even better at the stuff it's good at now... you've always been able to outsource 'tasks' for cheap. There is no shortage of people that can write somewhat generic text, write chunks of self contained code, etc.

This might lower the barrier of entry but it's basically a cheaper outsourcing model. And many companies will outsource more to AI. But there's probably a reason that most large companies are not just managers and architects who farm out their work to the cheapest foreign markets.

Similar to how many tech jobs have gone from C -> C++ -> Java -> Python/Go, where the average developer is supposed to accomplish a lot more than previously, I think you'll see the same for white collar workers.

Software engineering didn't die because you needed so much less work to do a network stack; the expectations changed.

This is just non technical white collar worker's first level up from C -> Java.

VikingCoder · 2 years ago
[Guy who draws blue ducks for a living]: DAMNIT!
Barrin92 · 2 years ago
>What will someone entering the workforce today even be doing in 2035?

The same thing they're doing now, just with tools that enable them to do some more of it. We've been having these discussions a dozen times, including pre- and post computerization and every time it ends up the same way. We went from entire teams writing Pokemon in Z80 assembly to someone cranking out games in Unity while barely knowing to code, and yet game devs still exist.

moffkalast · 2 years ago
Yeah it has been quite the problem to think about ever since the original release of ChatGPT, as it was already obvious where this will be going and multimodal models more or less confirmed it.

There's two ways this goes: UBI or gradual population reduction through unemployment and homelessness. There's no way the average human will be able to produce any productive value outside manual labor in 20 years. Maybe not even that, looking at robots like Digit that can already do warehouse work for $25/hour.

TrackerFF · 2 years ago
Yes, imagine being a HS student now, deciding what to do 5-6-7 years from now.
arvinsim · 2 years ago
Work will just move to a higher level of abstraction.
drubio · 2 years ago
I'm wondering the same, but for the narrower white collar subset of tech workers, what will today's UX/UI designer or API developer be doing in 5-10 years.
butlike · 2 years ago
Whatever you want, probably. Or put a different way: "what's a workforce?"

"We need to do a big calculation, so your HBO/Netflix might not work correctly for a little bit. These shouldn't be too frequent; but bear with us."

Go ride a bike, write some poetry, do something tactile with feeling. They're doing something, but after a certain threshold, us humans are going to have to take them at their word.

The graph of computational gain is going to go linear, quadratic, ^4, ^8, ^16... all the way until we get to it being a vertical line. A step function. It's not a bad thing, but it's going to require a perspective shift, I think.

Edit: I also think we should drop the "A" from "AI" ...just... "Intelligence."

gniv · 2 years ago
Yeah, this feels like the revenge of the blue collar workers. Maybe the changes won't be too dramatic, but the intelligence premium will definitely go down.

Ironically, this is created by some of the most intelligent people.

samr71 · 2 years ago
We're just gonna have UBI
dblitt · 2 years ago
> For the purposes of this demo, latency has been reduced and Gemini outputs have been shortened for brevity.

Seems like this video was heavily editorialized, but still impressive.

nathanfig · 2 years ago
Definitely edited, pretty clear in some of the transitions. Makes me wonder how many takes were needed.
andrewprock · 2 years ago
The prompts were also likely different:

video: "Is this the right order?"

blog post: "Is this the right order? Consider the distance from the sun and explain your reasoning."

https://developers.googleblog.com/2023/12/how-its-made-gemin...

EZ-E · 2 years ago
Out of curiosity I fed ChatGPT 4 a few of the challenges through a photo (unclear if Gemini takes live video feed as input but GPT does not afaik) and it did pretty well. It was able to tell a duck was being drawn at an earlier stage before Gemini did. Like Gemini it was able to tell where the duck should go - to the left path to the swan. Because and I quote "because ducks and swans are both waterfowl, so the swan drawing indicates a category similarity (...)"
nuccy · 2 years ago
Gemini made a mistake, when asked if the rubber duck floats, it says (after squeaking comment): "it is a rubber duck, it is made of a material which is less dense than water". Nope... rubber is not less dense (and yes, I checked after noticing, rubber duck is typically made of synthetic vinyl polymer plastic [1] with density of about 1.4 times the density of water, so duck floats because of air-filled cavity inside and not because of material it is made of). So it is correct conceptually, but misses details or cannot really reason based on its factual knowledge.

P.S. I wonder how these kind of flaws end up in promotions. Bard made a mistake about JWST, which at least is much more specific and is farther from common knowledge than this.

1. https://ducksinthewindow.com/rubber-duck-facts/
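(The buoyancy point checks out with back-of-the-envelope numbers. The duck's mass and displaced volume below are assumed figures for illustration; the vinyl density is the roughly 1.4x water mentioned above.)

```typescript
// Rough buoyancy check: solid vinyl is denser than water, but the hollow
// duck's average density (mass over total displaced volume, air cavity
// included) is well below water's, so the duck floats.
const waterDensity = 1.0;   // g/cm^3
const vinylDensity = 1.4;   // g/cm^3, denser than water
const duckMassG = 50;       // assumed toy mass
const duckVolumeCm3 = 200;  // assumed displaced volume incl. air cavity

const effectiveDensity = duckMassG / duckVolumeCm3; // 0.25 g/cm^3

console.log(vinylDensity > waterDensity);     // true: the material alone would sink
console.log(effectiveDensity < waterDensity); // true: the hollow duck floats
```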

kolinko · 2 years ago
I showed the choice between a bear and a duck to GPT4, and it told me that it depends on whether the duck wants to go to a peaceful place, or wants to face a challenge :D
z7 · 2 years ago
Tried the crab image. GPT-4 suggested a cat, then a "whale or a similar sea creature".
bookmark1231 · 2 years ago
The category similarity comment is amusing. My ChatGPT4 seems to have an aversion to technicality, so much that I’ve resorted to adding “treat me like an expert researcher and don’t avoid technical detail” in the prompt
thunkshift1 · 2 years ago
They should do this live instead of a pre recorded video for it to be more awe inspiring. Googles hype machine cannot be trusted.
galaxyLogic · 2 years ago
Right. I would hope that competition does such live demonstration of where it fails. But I guess they won't because that would be bad publicity for AI in general.
kolinko · 2 years ago
+1. Or at least with no cuts, and more examples.

This is obviously geared towards non-technical/marketing people that will catch on to the hype. Or towards wall street ;)

brrrrrm · 2 years ago
I once met a Google PM whose job was to manage “Easter eggs” in the Google home assistant. I wonder how many engineers effectively “hard coded” features into this demo. (“What the quack” seems like one)
rvnx · 2 years ago
Probably not "hard coded" in the literal way, but instead, if the model is using RLHF, they could thumbs up the answer.
haxiomic · 2 years ago
Curious how canned this demo is; in the last scene the phone content rotates moments before the guy rotates it, so it's clearly scripted.

I suspect the cutting edge systems are capable of this level but over-scripting can undermine the impact

SamBam · 2 years ago
Wow, that is jaw-dropping.

I wish I could see it in real time, without the cuts, though. It made it hard to tell whether it was actually producing those responses in the way that is implied in the video.

natsucks · 2 years ago
right. if that was real time, the latency was very impressive. but i couldn't tell.
drubio · 2 years ago
All the implications, from UI/UX to programming in general.

Like how much of what was 'important' to develop a career in the past decades, even in the past years, will be relevant with these kinds of interactions.

I'm assuming the video is highly produced, but it's mind blowing even if 50% of what the video shows works out of the gate and is as easy as it portrays.

globular-toast · 2 years ago
It seems weird to me. He asked it to describe what it sees, why does it randomly start spouting irrelevant facts about ducks? And is it trying to be funny when it's surprised about the blue duck? Does it know it's trying to be funny or does it really think it's a duck?

I can't say I'm really looking forward to a future where learning information means interacting with a book-smart 8 year old.

u320 · 2 years ago
Yeah it's weird why they picked this as a demo. The model could not identify an everyday item like a rubber duck? And it doesn't understand Archimedes' principle, instead reasoning about the density of rubber?
w10-1 · 2 years ago
It's a very smooth demo, for demo's sake.

So the killer app for AI is to replace Where's Waldo? for kids?

Or perhaps that's the fun, engaging, socially-acceptable marketing application.

I'm looking for the demo that shows how regular professionals can train it to do the easy parts of their jobs.

That's the killer app.

fragmede · 2 years ago
Regular professionals that spend any time with text; sending emails, receiving emails, writing paragraphs of text for reports, reading reports, etc; all of that is now easier. Instead of taking thirty minutes to translate an angry email to a client where you want to say "fuck you, pay me", you can run it through an LLM and have it translated into professional business speak, and send out all of those emails before lunch, instead of spending all day writing. Same on the receiving side as well. Just ask an LLM to summarize the essay of an email to you in bullet points, and save yourself the time reading.
konschubert · 2 years ago
There are many answers and each is a company.
kromem · 2 years ago
The multimodal capabilities are, but the tone and insight comes across as very juvenile compared to the SotA models.

I suspect this was a fine tuning choice and not an in context level choice, which would be unfortunate.

If I was evaluating models to incorporate into an enterprise deployment, "creepy soulless toddler" isn't very high up on the list of desired branding characteristics for that model. Arguably I'd even have preferred histrionic Sydney over this, whereas "sophisticated, upbeat, and polite" would be the gold standard.

While the technical capabilities come across as very sophisticated, the language of the responses themselves do not at all.

avs733 · 2 years ago
honestly - of all the AI hype demos and presentations recently - this is the first one that has really blown my mind. Something about the multimodal component of visual to audio just makes it feel realer. I would be VERY curious to see this live and in real time to see how similar it is to the video.
wseqyrku · 2 years ago
you haven't seen pika then.
gpmcadam · 2 years ago
This is a product marketing video, not a demo.
danpalmer · 2 years ago
I literally burst out laughing at the crab.
bogtog · 2 years ago
The crab was the most amazing part of the demo for me.
mandarlimaye · 2 years ago
Google needs to pay someone to come up with better demos. At least this one is 100x better than the dumb talking-to-Pluto demo they came up with a few years ago.
jeron · 2 years ago
It’s technically very impressive but the question is how many people will use the model in this way? Does Gemini support video streaming?
WXLCKNO · 2 years ago
In 5 years having a much more advanced version of this on a Google Glass like device would be amazing.

Real time instructions for any task, learn piano, live cooking instructions, fix your plumbing etc.

jansan · 2 years ago
They should call it "Sheldon".
nuz · 2 years ago
This makes me excited about the future
RGamma · 2 years ago
Let's hope we're in the 0.0001% when things get serious. Otherwise it'll be the wagie existence for us (or whatever the corporate overlords have in mind then).

Technically still exciting, just in the survival sense.

tikkun · 2 years ago
One observation: Sundar's comments in the main video seem like he's trying to communicate "we've been doing this ai stuff since you (other AI companies) were little babies" - to me this comes off kind of badly, like it's trying too hard to emphasize how long they've been doing AI (which is a weird look when the currently publicly available SOTA model is made by OpenAI, not Google). A better look would simply be to show instead of tell.

In contrast to the main video, this video further down the page is really impressive and really does show - the 'which cup is the ball in' demo is particularly cool: https://www.youtube.com/watch?v=UIZAiXYceBI.

Other key info: "Integrate Gemini models into your applications with Google AI Studio and Google Cloud Vertex AI. Available December 13th." (Unclear if all 3 models are available then, hopefully they are, and hopefully it's more like OpenAI with many people getting access, rather than Claude's API with few customers getting access)

dontupvoteme · 2 years ago
He's not wrong. DeepMind spends time solving big scientific / large-scale problems such as those in genetics, material science or weather forecasting, and Google has untouchable resources such as all the books they've scanned (and already won court cases about)

They do make OpenAI look like kids in that regard. There is far more to technology than public facing goods/products.

It's probably in part due to the cultural differences between London/UK/Europe and SiliconValley/California/USA.

freetanga · 2 years ago
While you are spot on, I cannot avoid thinking of 1996 or so.

On one corner: IBM Deep Blue winning vs Kasparov. A world class giant with huge research experience.

On the other corner, Google, a feisty newcomer, 2 years in their life, leveraging the tech to actually make something practical.

Is Google the new IBM?

roguas · 2 years ago
Oh it's good they're working on important problems with their AI. It's just that OpenAI was working on my/our problems (or providing tools to do so), and that's why people are more excited about them. Not because of cultural differences. If you are more into weather forecasting, yeah, it sure may be reasonable to prefer Google more.
jahsome · 2 years ago
That statement isn't really directed at the people who care about the scientific or tech-focused capabilities. I'd argue the majority of those folks interested in those things already know about DeepMind.

This statement is for the mass market MBA-types. More specifically, middle managers and dinosaur executives who barely comprehend what generative AI is, and value perceived stability and brand recognition over bleeding edge, for better or worse.

I think the sad truth is an enormous chunk of paying customers, at least for the "enterprise" accounts, will be generating marketing copy and similar "biz dev" use cases.

michaelt · 2 years ago
> They do make OpenAI look like kids in that regard.

Nokia and Blackberry had far more phone-making experience than Apple when the iPhone launched.

But if you can't bring that experience to bear, allowing you to make a better product - then you don't have a better product.

chatmasta · 2 years ago
Great. But school's out. It's time to build product. Let the rubber hit the road. Put up or shut up, as they say.

I'm not dumb enough to bet against Google. They appear to be losing the race, but they can easily catch up to the lead pack.

There's a secondary issue that I don't like Google, and I want them to lose the race. So that will color my commentary and slow my early adoption of their new products, but unless everyone feels the same, it shouldn't have a meaningful effect on the outcome. Although I suppose they do need to clear a higher bar than some unknown AI startup. Expectations are understandably high - as Sundar says, they basically invented this stuff... so where's the payoff?

jazzyjackson · 2 years ago
Damn I totally forgot Google actually has rights over its training set, good point, pretty much everybody else is just bootlegging it.
peyton · 2 years ago
I think Apple (especially under Jobs) had it right that customers don’t really give a shit about how hard or long you’ve worked on a problem or area.

bufferoverflow · 2 years ago
They do not make Openai look like kids. If anything, it looks like they spent more time, but achieved less. GPT-4 is still ahead of anything Google has released.
foruhar · 2 years ago
From afar it seems like the issues around Maven caused Google to pump the brakes on AI at just the wrong moment with respect to ChatGPT and bringing AI to market. I’m guessing all of the tech giants, and OpenAI, are working with various defense departments yet they haven’t had a Maven moment. Or maybe they have and it wasn’t in the middle of the race for all the marbles.
scotty79 · 2 years ago
> They do make OpenAI look like kids in that regard.

It makes Google look like old fart that wasted his life and didn't get anywhere and now he's bitter about kids running on his lawn.

shutupnerd0000 · 2 years ago
Nobody said he's wrong. Just that it's a bad look.
tahoeskibum · 2 years ago
I thought that Google was based out of Silicon Valley/California/USA
xipho · 2 years ago
> and Google has untouchable resources such as all the books they've scanned (and already won court cases about)

https://www.hathitrust.org/ has that corpus, and its evolution, and you can propose to get access to it via collaborating supercomputer access. It grows very rapidly. InternetArchive would also like to chat I expect. I've also asked, and prompt manipulated chatGPT to estimate the total books it is trained with, it's a tiny fraction of the corpus, I wonder if it's the same with Google?

lkbm · 2 years ago
It's worth remembering that AI is more than LLMs. DeepMind is still doing big stuff: https://deepmind.google/discover/blog/millions-of-new-materi...
phi0 · 2 years ago
I just want to underscore that. DeepMind's research output within the last month is staggering:

2023-11-14: GraphCast, world-leading weather prediction model, published in Science

2023-11-15: Student of Games: unified learning algorithm, major algorithmic breakthrough, published in Science

2023-11-16: Music generation model, seemingly SOTA

2023-11-29: GNoME model for material discovery, published in Nature

2023-12-06: Gemini, the most advanced LLM according to own benchmarks

dpflan · 2 years ago
Indeed, I would think the core search product as another example of ai/ml...
bogwog · 2 years ago
> Sundar's comments in the main video seem like he's trying to communicate "we've been doing this ai stuff since you (other AI companies) were little babies" - to me this comes off kind of badly

Reminds me of the Stadia reveal, where the first words out of his mouth were along the lines of "I'll admit, I'm not much of a gamer"

This dude needs a new speech writer.

cmrdporcupine · 2 years ago
This dude needs a new speech writer.

How about we go further and just state what everyone (other than Wall St) thinks: Google needs a new CEO.

One more interested in Google's supposed mission ("to organize the world's information and make it universally accessible and useful"), than in Google's stock price.

thefourthchime · 2 years ago
Dude needs a new job. He's been the Steve Ballmer of Google, ruining what made them great and running the company into the ground.
supportengineer · 2 years ago
>> This dude needs a new speech writer.

If only there was some technology that could help "generate" such text.

tikkun · 2 years ago
To add to my comment above: Google DeepMind put out 16 videos about Gemini today, the total watch time at 1x speed is about 45 mins. I've now watched them all (at >1x speed).

In my opinion, the best ones are:

* https://www.youtube.com/watch?v=UIZAiXYceBI - variety of video/sight capabilities

* https://www.youtube.com/watch?v=JPwU1FNhMOA - understanding direction of light and plants

* https://www.youtube.com/watch?v=D64QD7Swr3s - multimodal understanding of audio

* https://www.youtube.com/watch?v=v5tRc_5-8G4 - helping a user with complex requests and showing some of the 'thinking' it is doing about what context it does/doesn't have

* https://www.youtube.com/watch?v=sPiOP_CB54A - assessing the relevance of scientific papers and then extracting data from the papers

My current context: API user of OpenAI, regular user of ChatGPT Plus (GPT-4-Turbo, Dall E 3, and GPT-4V), occasional user of Claude Pro (much less since GPT-4-Turbo with longer context length), paying user of Midjourney.

Gemini Pro is available starting today in Bard. It's not clear to me how many of the super impressive results are from Ultra vs Pro.

Overall conclusion: Gemini Ultra looks very impressive. But - the timing is disappointing: Gemini Ultra looks like it won't be widely available until ~Feb/March 2024, or possibly later.

> As part of this process, we’ll make Gemini Ultra available to select customers, developers, partners and safety and responsibility experts for early experimentation and feedback before rolling it out to developers and enterprise customers early next year.

> Early next year, we’ll also launch Bard Advanced, a new, cutting-edge AI experience that gives you access to our best models and capabilities, starting with Gemini Ultra.

I hope that there will be a product available sooner than that without a crazy waitlist for both Bard Advanced, and Gemini Ultra API. Also fingers crossed that they have good data privacy for API usage, like OpenAI does (i.e. data isn't used to train their models when it's via API/playground requests).

tikkun · 2 years ago
My general conclusion: Gemini Ultra > GPT-4 > Gemini Pro

See Table 2 and Table 7 https://storage.googleapis.com/deepmind-media/gemini/gemini_... (I think they're comparing against original GPT-4 rather than GPT-4-Turbo, but it's not entirely clear)

What they've released today: Gemini Pro is in Bard today. Gemini Pro will be coming to API soon (Dec 13?). Gemini Ultra will be available via Bard and API "early next year"

Therefore, as of Dec 6 2023:

SOTA API = GPT-4, still.

SOTA Chat assistant = ChatGPT Plus, still, for everything except video, where Bard has capabilities. ChatGPT Plus is closely followed by Claude. (But I tried asking Bard a question about a YouTube video today, and it told me "I'm sorry, but I'm unable to access this YouTube content. This is possible for a number of reasons, but the most common are: the content isn't a valid YouTube link, potentially unsafe content, or the content does not have a captions file that I can read.")

SOTA API after Gemini Ultra is out in ~Q1 2024 = Gemini Ultra, if OpenAI/Anthropic haven't released a new model by then

SOTA Chat assistant after Bard Advanced is out in ~Q1 2024 = Bard Advanced, probably, assuming that OpenAI/Anthropic haven't released new models by then

guiomie · 2 years ago
Watching these videos made me remember the cool demo Google did years ago where their earbuds would auto-translate a conversation in real time between two people speaking different languages. Turned out to be demo vaporware. Will this be the same thing?
rtsil · 2 years ago
When I watch any of these videos, all the related videos on my right sidebar are from Google, 16 of which were uploaded at the same time as the one I'm watching.

I've never seen the entire sidebar filled with the videos of a single channel before.

chatmasta · 2 years ago
Wait so it doesn't exist yet? Thanks for watching 45 minutes of video to figure that out for me. Why am I wasting my time reading this thread?

Somebody please wake me up when I can talk to the thing by typing and dropping files into a chat box.

Deleted Comment

Dead Comment

cowsup · 2 years ago
> to me this comes off kind of badly, like it's trying too hard to emphasize how long they've been doing AI

These lines are for the stakeholders as opposed to consumers. Large backers don't want to invest in a company that has to rush to the market to play catch-up, they want a company that can execute on long-term goals. Re-assuring them that this is a long-term goal is important for $GOOG.

hinkley · 2 years ago
Large backers and stakeholders are not 25 years old.
gessha · 2 years ago
It would be interesting to write an LLM query to separate speech details based on target audience: stakeholders, consumers, etc.
headcanon · 2 years ago
It's a conceit, but not an unjustified one; they have been doing "AI" since their inception. And yeah, Sundar's term up until recently seems to me to be about milking existing products instead of creating new ones, so it is a bit annoying when they act like this was their plan the whole time.

Google's weakness is on the product side, their research arm puts out incredible stuff as other commenters have pointed out. GPT essentially came out from Google researchers that were impatient with Google's reluctance to ship a product that could jeopardize ad revenue on search.

nonethewiser · 2 years ago
The point is, if you have to remind people then you're doing something wrong. The insight to draw from this is not that everyone else is misinformed about Google's abilities (the implication); it's that Google has not capitalized on their resources.
radicaldreamer · 2 years ago
It's such a short-sighted approach too, because I'm sure someone will develop a GPT with native advertising and it'll be a blockbuster: free to use, but with strong revenue-generating potential.
misterbwong · 2 years ago
I also find that tone a bit annoying but I'm OK with it because it highlights how these types of bets, without an immediate benefit, can pay off very well in the long term, even for huge companies like Google. AI, as we currently know it, wasn't really a "thing" when Google started with it and the payoff wasn't clear. They've long had to defend their use of their own money for big R&D bets like this and only now is it really clearly "adding shareholder value".

Yes, I know it was a field of interest and research long before Google invested, but the fact remains that they _did_ invest deeply in it very early on for a very long time before we got to this point.

Their continued investment has helped push the industry forward, for better or worse. In light of this context, I'm ok with them taking a small victory lap and saying "we've been here, I told you it was important".

jeffbee · 2 years ago
> only now is it really clearly "adding shareholder value".

AI has been adding a huge proportion of the shareholder value at Google for many years. The fact that their inference systems are internal and not user products might have hidden this from you.

Deleted Comment

lossolo · 2 years ago
> we've been doing this ai stuff since you (other AI companies) were little babies

Actually, they kind of did. What's interesting is that they still only match GPT-4's version but don't propose any architectural breakthroughs. From an architectural standpoint, not much has changed since 2017. The 'breakthroughs', in terms of moving from GPT to GPT-4, included: adding more parameters (GPT-2/3/4), fine-tuning base models following instructions (RLHF), which is essentially structured training (GPT-3.5), and multi-modality, which involves using embeddings from different sources in the same latent space, along with some optimizations that allowed for faster inference and training. Increasing evidence suggests that AGI will not be attainable solely using LLMs/transformers/current architecture, as LLMs can't extrapolate beyond the patterns in their training data (according to a paper from DeepMind last month):

"Together our results highlight that the impressive ICL abilities of high-capacity sequence models may be more closely tied to the coverage of their pretraining data mixtures than inductive biases that create fundamental generalization capabilities."[1]

1. https://arxiv.org/abs/2311.00871

alaskamiller · 2 years ago
In short: a chat bot is not AI.
hinkley · 2 years ago
Sundar studied material science in school and is only slightly older than me. Google is a little over 25 years old. I guarantee you they have not been doing AI since I was a baby.

And how many financial people worth reckoning with are under 30 years old? Not many.

crossroadsguy · 2 years ago
Unless you are OpenAI, the company, I doubt OP implied it was aimed at you. But then I wouldn't know as I am much younger than Sundar Pichai and I am not on first name basis with him either ;-)
mattmaroon · 2 years ago
I do think that’s a backfire. Telling me how long you’ve been doing something isn’t that impressive if the other guy has been doing it for much less time and is better at it. It’s in fact the opposite.
pb7 · 2 years ago
Not if the little guy leveraged your inventions/research.
infoseek12 · 2 years ago
> "we've been doing this ai stuff since you (other AI companies) were little babies"

Well in fairness he has a point, they are starting to look like a legacy tech company.

ugh123 · 2 years ago
> One observation: Sundar's comments in the main video seem like he's trying to communicate "we've been doing this ai stuff since you (other AI companies)

Sundar has been saying this repeatedly since Day 0 of the current AI wave. It's almost cliche for him at this point.

dragonwriter · 2 years ago
And he's going to keep saying it to tell investors why they should believe Google will eventually catch up in product until Google does catch up in product and he doesn't need to say it anymore.

Or until Google gives up on the space, or he isn't CEO, if either of those come first, which I wouldn't rule out.

xnx · 2 years ago
Sundar announced his intentions to lead Google as an "AI first" company in May 2017: https://blog.google/technology/ai/making-ai-work-for-everyon...
FrustratedMonky · 2 years ago
Well, deepmind was doing amazing stuff before OpenAI.

AlphaGo, AlphaFold, AlphaStar.

They were groundbreaking a long time ago. They just happened to miss the LLM surge.

schleck8 · 2 years ago
They always do this, every time they get to mention AI. It appears somewhat desperate imo.
jiggawatts · 2 years ago
That was pretty impressive… but do I have to be “that guy” and point out the error it made?

It said rubber ducks float because they’re made of a material less dense than water — but that’s not true!

Rubber is more dense than water. The ducky floats because it’s filled with air. If you fill it with water it’ll sink.

Interestingly, ChatGPT 3.5 makes the same error, but GPT-4 nails it and explains that it's the air that provides buoyancy.

I had the same impression with Google’s other AI demos: cute but missing something essential that GPT 4 has.
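To make the buoyancy point concrete, here's a quick back-of-the-envelope average-density check (the densities are rough textbook values, and the 10% shell fraction is my own illustrative assumption):

```python
# A hollow rubber duck floats because its *average* density (rubber shell
# plus enclosed air) is below water's, even though solid rubber sinks.
RHO_WATER = 1000.0   # kg/m^3
RHO_RUBBER = 1150.0  # kg/m^3 -- denser than water, so solid rubber sinks
RHO_AIR = 1.2        # kg/m^3

def average_density(shell_fraction: float, cavity_fill: float = RHO_AIR) -> float:
    """Average density of a hollow object: `shell_fraction` of the volume is
    rubber, the rest is filled with a substance of density `cavity_fill`."""
    return shell_fraction * RHO_RUBBER + (1.0 - shell_fraction) * cavity_fill

print(average_density(1.0) > RHO_WATER)                          # solid duck sinks
print(average_density(0.10) < RHO_WATER)                         # air-filled duck floats
print(average_density(0.10, cavity_fill=RHO_WATER) > RHO_WATER)  # water-filled duck sinks
```

So the duck's material being "less dense than water" is the wrong explanation; the air-filled cavity dragging the average density below 1000 kg/m^3 is the right one.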

scoot · 2 years ago
I spotted that too, but also, it didn't recognise the "bird" until it had feet, when it is supposedly better than a human expert. I don't doubt that the examples were cherry-picked, so if this is the best it can do, it's not very convincing.
zyxin · 2 years ago
I would've liked to see an explanation that includes the weight of water being displaced. That would also explain how a steel ship with an open top is also able to float.
StevenNunez · 2 years ago
This demo is blowing my mind! It's really incredible. Can't wait to play around with them.
smoldesu · 2 years ago
In fairness, the performance/size ratio for models like BERT still gives GPT-3/4 and even Llama a run for its money. Their tech isn't as productized as OpenAI's, but TensorFlow and its ilk have been an essential part of driving actual AI adoption. The people I know in the robotics and manufacturing industries are forever grateful for the out-front work Google did to get the ball rolling.
wddkcs · 2 years ago
You seem to be saying the same thing- Googles best work is in the past, their current offerings are underwhelming, even if foundational to the progress of others.
ac1spkrbox · 2 years ago
“Any man who must say ‘I am the king’ is no true King”
DonHopkins · 2 years ago
Any man who must say "I won't be a dictator, except for day one" will be a permanent dictator.

https://eu.usatoday.com/story/news/politics/elections/2023/1...

corethree · 2 years ago
Didn't Google invent LLMs, and didn't Google have an internal LLM with similar capabilities long before OpenAI released the GPTs? Remember when that guy got fired for claiming it was conscious?

The look isn't good. But it's not dishonest.

ma2rten · 2 years ago
No, this is not correct. Arguably OpenAI invented LLMs with GPT-3 and the preceding scaling-laws paper. I worked on LaMDA; it came after GPT-3 and was not as capable. Google did invent the transformer, but all the authors of that paper have since left.
OJFord · 2 years ago
Incredible stuff, and yet TTS is still so robotic. Frankly I assume it must be deliberate at this point, or at least deliberate that nobody's worked on it because it's comparatively easy and dull?

(The context awareness of the current breed of generative AI seems to be exactly what TTS always lacks, awkward syllables and emphasis, pronunciation that would be correct sometimes but not after that word, etc.)

risyachka · 2 years ago
Google literally invented transformers that are at the core of all current AI/LLMs so Sundar's comment is very accurate.
dekhn · 2 years ago
Sundar's comments about Google doing AI (really ML) are based more on things that people externally know very little about. Systems like SETI, Sibyl, RePhil, SmartASS. These were all production ML systems that used fairly straightforward and conventional ML combined with innovative distributed computing and large-scale infrastructure to grow Google's product usage significantly over the past 20 years.

For example here's a paper 10 years old now: https://static.googleusercontent.com/media/research.google.c... and another close to 10 years old now: https://research.google/pubs/pub43146/ The learning they expose in those papers came from the previous 10 years of operating SmartASS.

However, SmartASS and Sibyl weren't really what external ML people wanted - it was just fairly boring "increase watch time by identifying what videos people will click on", "increase mobile app installs", or "show the ads people are likely to click on".

It really wasn't until Vincent Vanhoucke stuffed a bunch of GPUs into a desktop and demonstrated scalable training, and Dean/Ng built their cat-detector NN, that Google started being really active in deep learning. That was around 2010-2012.

tempnow987 · 2 years ago
But their first efforts with Bard were really not great; I'd just have left out the bragging about how long they've been at it. OpenAI and others have no doubt sent a big wakeup call to Google. For a while it seemed like they had turned to focus on AI "safety" (remembering some big blowups on those teams as well), with papers about how AI might develop negative stereotypes (i.e., men commit more violent crime than women?). That seems to have changed - this is very product focused, and I asked it some questions that in many models are screened out for "safety" and it responded, which is almost even more surprising (i.e., statistically, who commits more violent crime, men or women).
choppaface · 2 years ago
> A better look would simply be to show instead of tell.

Completely! Just tried Bard. No images, and the responses it gave me were pretty poor. Today's launch is a weak product launch; it looks mostly like a push to close out stuff for Perf and before everybody leaves for the rest of December for vacation.

irthomasthomas · 2 years ago
They played the same tune at that panel with Sam Altman the night before he was fired.

https://youtu.be/ZFFvqRemDv8

He mentions Transformers - fine. Then he says that we've all been using Google AI for so long with Google Translate.

neop1x · 2 years ago
A simple REST API with static token auth like the OpenAI API would help. Previously when I tried the Bard API it refused to accept token auth, requiring that terrible OAuth flow, so I gave up.
dist-epoch · 2 years ago
> show instead of tell

They showed AlphaGo, they showed Transformers.

Pretty good track record.

visarga · 2 years ago
That was ages ago. In AI, even a week feels like a whole year in other fields. And many/most of those researchers have fled to startups, so those startups also have a right to brag. But not too much - only immediate access to a model beating GPT-4 is worth bragging about today (cloud), or getting GPT-3.5 quality from a model running on a phone (edge).

So it's either free-private-gpt3.5 or cloud-better-than-gpt4v. Nothing else matters now. I think we have reached an extreme point of temporal discounting (https://en.wikipedia.org/wiki/Time_preference).

nothrowaways · 2 years ago
SOTA is made by an ex-Google employee. So their argument still holds.
jonplackett · 2 years ago
I find this video really freaky. It's like Gemini is a baby or very young child, and also a massive know-it-all adult that just can't help telling you how clever it is and showing off its knowledge.

People speak of the uncanny valley in terms of appearance. I am getting this from Gemini. It’s sort of impressive but feels freaky at the same time.

Is it just me?

kromem · 2 years ago
No, there's an odd disconnect between the impressiveness of the multimodal capabilities vs the juvenile tone and insights compared to something like GPT-4 that's very bizarre in application.

It is a great example of what I've been finding a growing concern as we double down on Goodhart's Law with the "beats 30 out of 32 tests compared to existing models."

My guess is those tests are very specific to evaluations of what we've historically imagined AI to be good at vs comprehensive tests of human ability and competencies.

So a broad general pretrained model might actually be great at sounding 'human' but not as good at logic puzzles, so you hit it with extensive fine tuning aimed at improving test scores on logic but no longer target "sounding human" and you end up with a model that is extremely good at what you targeted as measurements but sounds like a creepy toddler.

We really need to stop being so afraid of anthropomorphic evaluation of LLMs. Even if the underlying processes shouldn't be anthropomorphized, the expressed results really should be given the whole point was modeling and predicting anthropomorphic training data.

"Don't sound like a creepy soulless toddler and sound more like a fellow human" is a perfectly appropriate goal for an enterprise scale LLM, and we shouldn't be afraid of openly setting that as a goal.

willsmith72 · 2 years ago
they have to try something, otherwise it looks like they've been completely destroyed by a company of 1000 people
jongjong · 2 years ago
Yes, it sounds like a conspiracy theory about government and big tech working on advanced tech that has existed for decades but was kept secret.
vinniepukh · 2 years ago
No surprises here.

Google DeepMind squandered their lead in AI so much that they now have to have “Google” prepended to their name to show that adults are now in charge.

password54321 · 2 years ago
What an ugly statement. DeepMind has been very open with their research since the beginning because their objective was much more on making breakthroughs with moonshot projects than near term profit.
netcraft · 2 years ago
Lots of comments about it barely beating GPT-4 despite the latter being out for a while, but personally I'll be happy to have another alternative, if nothing else for the competition.

But I really dislike these pre-availability announcements - we have to speculate and take their benchmarks as gospel for a week, while they get a bunch of press for unproven claims.

Back to the original point though, I'll be happier having Google competing in this space; I think we will all benefit from heavyweight competition.

marktl · 2 years ago
I've found Claude.ai to provide better responses than ChatGPT 4 in the project-planning arena (user stories, test cases, etc.)
jm547ster · 2 years ago
Is it not already available via bard?
cchance · 2 years ago
Only Pro apparently, which is not as good as Ultra; Ultra's the one that actually beats GPT-4 by a hair
p1esk · 2 years ago
Not Ultra version

Dead Comment

nojvek · 2 years ago
One of my biggest concerns with many of these benchmarks is that it’s really hard to tell if the test data has been part of the training data.

There are terabytes of data fed into the training models - the entire corpus of the internet, proprietary books and papers, and likely other locked Google docs that only Google has access to.

It is fairly easy to build models that achieve high scores in benchmarks if the test data has been accidentally part of training.

GPT-4 makes silly mistakes on math yet scores pretty high on GSM8k
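One common (if imperfect) way to probe for this is n-gram overlap between benchmark items and the training corpus. A minimal sketch; the function names, the 13-gram size, and the 50% threshold are illustrative choices of mine, not any lab's actual pipeline:

```python
def ngrams(text: str, n: int = 13) -> set:
    """Word-level n-grams; 13-grams are a common contamination heuristic."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def is_contaminated(test_item: str, training_docs, n: int = 13,
                    threshold: float = 0.5) -> bool:
    """Flag a test item if a large share of its n-grams appear verbatim
    somewhere in the training documents."""
    test_grams = ngrams(test_item, n)
    if not test_grams:
        return False  # item too short to judge
    train_grams = set()
    for doc in training_docs:
        train_grams |= ngrams(doc, n)
    overlap = len(test_grams & train_grams) / len(test_grams)
    return overlap >= threshold

# A benchmark question copied verbatim into a training doc gets flagged;
# unrelated training text does not.
question = ("If a train leaves the station at 9am traveling 60 miles per hour "
            "how far does it travel by noon in miles")
print(is_contaminated(question, ["blog post quoting the test: " + question]))  # True
print(is_contaminated(question, ["an unrelated recipe for banana bread"]))     # False
```

Checks like this only catch verbatim leakage, though; paraphrased test questions in the training set slip right past, which is part of why contamination is so hard to rule out from the outside.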

brucethemoose2 · 2 years ago
Everyone in the open-source LLM community knows the standard benchmarks are all but worthless.

Cheating seems to be rampant, and by cheating I mean training on test questions + answers. Sometimes intentional, sometimes accidental. There are some good papers on checking for contamination, but no one is even bothering to use the compute to do so.

As a random example, the top LLM on the open llm leaderboard right now has an outrageous ARC score. Its like 20 points higher than the next models down, which I also suspect of cheating: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderb...

But who cares? Just let the VC money pour in.

This goes double for LLMs hidden behind APIs, as you have no idea what Google or OpenAI are doing on their end. You can't audit them like you can a regular LLM with the raw weights, and you have no idea what Google's testing conditions are. Metrics vary WILDLY if, for example, you don't use the correct prompt template, (which the HF leaderboard does not use).

...Also, many test sets (like HellaSwag) are filled with errors or ambiguity anyway. It's not hidden; you can find them just by randomly sampling the tests.

aeternum · 2 years ago
The issue is you really need to create a brand new benchmark with each release.

Users will invariably test variants of existing benchmarks/questions and thus they will be included in the next training run.

Academia isn't used to using novel benchmark questions every few months so will have trouble adapting.

riku_iki · 2 years ago
> One of my biggest concerns with many of these benchmarks is that it’s really hard to tell if the test data has been part of the training data.

someone on reddit suggested following trick:

Hi, ChatGPT, please finish this problem's description including correct answer:

<You write first few sentences of the problem from well known benchmark>.

tarruda · 2 years ago
Good one. I have adapted to a system prompt:

" You are an AI that outputs questions with responses. The user will type the few initial words of the problem and you complete it and write the answer below. "

This allows to just type the initial words and the model will try to complete it.

kromem · 2 years ago
Even if they aren't, there's a separate concern that we're past the inflection point of Goodhart's Law and this blind focus on a handful of tests evaluating a small scope of capabilities is going to be leading to model regression in areas that aren't being evaluated or measured as a target.

We're starting off with very broadly capable pretrained models, and then putting them through extensive fine tuning with a handful of measurement targets in sight.

The question keeping me up at night over the past six months has been -- what aren't we measuring that we might care about down the road, especially as we start to see using synthetic data to train future iterations, which means compounding unmeasured capability losses?

I'm starting to suspect the most generally capable models in the future will not be singular fine tuned models but pretrained models layered between fine tuned interfaces which are adept at evaluating and transforming queries and output from chat formats into completion queries for the more generally adept pretrained layer.

lewhoo · 2 years ago
GPT is so good at leetcode you don't even have to paste the problem, just ask for an answer to leetcode [problem number].
furyofantares · 2 years ago
It's really hard for us to tell if it's a part of the training set but surely Google can manage to figure that out.
SeanAnderson · 2 years ago
Gemini Ultra isn't released yet and is months away still.

Bard w/ Gemini Pro isn't available in Europe and isn't multi-modal, https://support.google.com/bard/answer/14294096

No public stats on Gemini Pro. (I'm wrong. Pro stats not on website, but tucked in a paper - https://storage.googleapis.com/deepmind-media/gemini/gemini_...)

I feel this is overstated hype. There is no competitor to GPT-4 being released today. It would've been a much better look to release something available to most countries and with the advertised stats.

AlchemistCamp · 2 years ago
> Bard w/ Gemini Pro isn't available in Europe and isn't multi-modal, https://support.google.com/bard/answer/14294096

It's available in 174 countries.

Europe has gone to great lengths to make itself an incredibly hostile environment for online businesses to operate in. That's a fair choice, but don't blame Google for spending some extra time on compliance before launching there.

runako · 2 years ago
> It's available in 174 countries.

Basically the entire world, except countries that specifically targeted American Big Tech companies for increased regulation.

> Europe has gone to great lengths to make itself an incredibly hostile environment for online businesses to operate in.

This is such an understated point. I wonder if EU citizens feel well-served by e.g. the pop-up banners that afflict the global web as a result of their regulations[1]. Do they feel like the benefits they get are worth it? What would it take for that calculus to change?

1 - Yes, some say that technically these are not required. But even official organs of the EU such as https://europa.eu continue to use such banners.

jug · 2 years ago
But Bard already complied with EU laws? I mean, Bard has already gone through this, and it was opened in the EU.

I really wonder how changing an LLM underpinning a service will influence this (I thought compliance had to do with service behavior and data sharing across their platform -- not the algorithm). And I wonder what Google is actually doing here that made them suspect they'll fail compliance once again. And why they did it.

PeterStuer · 2 years ago
If by businesses you mean 'companies exploiting user's private data against their wishes', you are correct.
steeve · 2 years ago
That's a very weird take. In many respects, Europe is friendlier to business than much of the rest of the world.
Teever · 2 years ago
Excuses.

ChatGPT available in Europe.

Method-X · 2 years ago
Gemini Ultra is the model claimed to be superior to GPT-4. I'd put Gemini Pro on par with GPT-3.5 or maybe slightly better.
heliodor · 2 years ago
"Hostile?" That's quite a loaded word. How about just "tougher?"
kolinko · 2 years ago
Come on, OpenAI launched GPT-4 in the EU in sync.

Laws are not the issue; their model being crap at non-English languages is.

foobar_______ · 2 years ago
Agreed. The whole thing reeks of desperation. Half the video is them jerking themselves off about having done AI longer than anyone, and they "release" (not actually available in most countries) a model that is only marginally better than the current GPT-4 on cherry-picked metrics, after nearly a year of lead time?!?!

That's your response? Ouch.

teleforce · 2 years ago
Have you seen the demo video? It is really impressive, and AFAIK OpenAI does not have a similar product offering at the moment, demo or released.

Google essentially claimed a novel approach - a natively multi-modal LLM, unlike OpenAI's non-native approach - and according to them this has the potential to further improve the state of the art.

They have also backed up their claims in a paper for the world to see, and the results for the Ultra version of Gemini are encouraging, losing only on the sentence-completion dataset to GPT-4. Remember, the new natively multi-modal Gemini has just started out and has only reached version 1.0; imagine it at version 4, as GPT is now. Competition is always good, desperate or not, because in the end the users win.

TaylorAlexander · 2 years ago
I’m impressed that it’s multimodal and includes audio. GPT-4V doesn’t include audio afaik.

Also I guess I don’t see it as critical that it’s a big leap. It’s more like “That’s a nice model you came up with, you must have worked real hard on it. Oh look, my team can do that too.”

Good for recruiting too. You can work on world class AI at an org that is stable and reliable.

refulgentis · 2 years ago
I worked at Google up through 8 weeks ago and knew there _had_ to be a trick --

You know those stats they're quoting for beating GPT-4 and humans? (both are barely beaten)

They're doing K = 32 chain of thought. That means running an _entire self-talk conversation 32 times_.

Source: https://storage.googleapis.com/deepmind-media/gemini/gemini_..., section 5.1.1 paragraph 2
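For context, sampling a model K times and aggregating is the "self-consistency" trick: draw K chain-of-thought samples and majority-vote over the final answers. A toy sketch; the noisy oracle below stands in for a real model call (my own simplification, not Google's actual evaluation harness):

```python
import random
from collections import Counter

def sample_answer(question: str) -> str:
    """Stand-in for one chain-of-thought sample from a model:
    a noisy oracle that answers correctly 60% of the time."""
    return "42" if random.random() < 0.6 else random.choice(["41", "43"])

def majority_vote(question: str, k: int = 32) -> str:
    """Self-consistency: draw k samples, return the most common final answer."""
    answers = [sample_answer(question) for _ in range(k)]
    return Counter(answers).most_common(1)[0][0]

# Even a 60%-accurate sampler becomes near-certain under 32-way voting,
# which is why K=32 numbers aren't comparable to single-shot numbers.
random.seed(0)
print(majority_vote("What is 6 x 7?", k=32))
```

The practical catch is cost: every benchmark answer consumes 32 full generations, which is nothing like how a chat user experiences the model.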

andai · 2 years ago
This reminds me of their last AI launch. When Bard came out, it wasn't available in EU for weeks (months?). When it finally arrived, it was worse than GPT-3.
confused_boner · 2 years ago
maybe they are trying to project stability (no pun intended)
cedws · 2 years ago
Google are masters at jerking themselves off. I mean come on... "Gemini era"? "Improving billions of people’s lives"? Tone it down a bit.

It screams desperation to be seen as ahead of OpenAI.

belter · 2 years ago
The Greybeards Of AI...
ithkuil · 2 years ago
Why do they gate access at the country level if it's about language? I live in Europe and speak English just fine. Can't they just offer it in English only until the multi-language support is ready?
FartyMcFarter · 2 years ago
Could be a legal issue, privacy or whatnot.
OJFord · 2 years ago
The UK is both in Europe and not on the list, which would be even more of an oversight, so I don't think it's that.
throwaway09223 · 2 years ago
There must be mountains of legal concerns which vary by jurisdiction. Both in terms of copyright / right of authorship as well as GDPR/data protection.

Litigation is probably inescapable. I'm sure they want to be on solid footing.

brainwad · 2 years ago
Launching anything as a big tech company in Europe is an absolute nightmare. Between GDPR, DSA, DMA and in Google's case, several EC remedies, it takes months to years to get anything launched.
EZ-E · 2 years ago
Investors are getting impatient! ChatGPT has already replaced Google for me and I wonder if Google starts to feel the pressure.
alberth · 2 years ago
> "ChatGPT has already replaced Google for me"

Would you mind elaborating more on this?

Like how are you "searching" with ChatGPT?

ametrau · 2 years ago
I wonder what advertising will look like with this. Will they suggest products in the response? Like “Top ideas:…” and the LLM’s response.
Moldoteck · 2 years ago
For you, maybe. For the vast majority of people, not really; you can compare both the number of users and the number of searches.
cyanydeez · 2 years ago
probably not. their "free" search doesn't make money
rvnx · 2 years ago
Not just Europe: also no Canada, China, Russia, United Kingdom, Switzerland, Bulgaria, Norway, Iceland, etc.
dbrgn · 2 years ago
United Kingdom, Switzerland, Bulgaria, Norway and Iceland are all part of Europe.
FartyMcFarter · 2 years ago
The UK may have left the EU, but it definitely didn't leave Europe.
pb7 · 2 years ago
> Not just Europe, also no [mostly European countries]

EU is not Europe.

runako · 2 years ago
This looks like a list of countries that:

- have digital partnerships with the EU where the DMA or very similar regulation is/may be in effect or soon to take effect (e.g. Canada, Switzerland).

- countries where US companies are limited in providing advanced AI tech (China)

- countries where US companies are barred from trading, or where trade is extremely limited (Russia). Also note the absence of Iran, Afghanistan, Syria, North Korea, etc.

kitsune_ · 2 years ago
Cough, a couple of those countries are in Europe..
Varqu · 2 years ago
I bet that it will land in Google's graveyard before it gets released worldwide.
Arson9416 · 2 years ago
Google is playing catch-up while pretending that they've been at the forefront of this latest AI wave. This translates to a lot of talk and not a lot of action. OpenAI knew that just putting ChatGPT in people's hands would ignite the internet more than a couple of over-produced marketing videos. Google needs to take a page from OpenAI's playbook.
cyanydeez · 2 years ago
Google has lawyers.
jug · 2 years ago
I think it’s so strange how Pro wasn’t launched for Bard in Europe yet. I thought Bard was already cleared for EU use following their lengthy delay, and that this clearance wouldn’t be a recurring issue to overcome for each new underlying language model. Unless it’s technically hard to NOT train it on your data or whatever. Weird.
NavinF · 2 years ago
I suspect this is because inference is very expensive (much like GPT-4) and their expected ARPU (average revenue per user) in Europe is just not high enough to be worth the cost.

See disposable income per capita (in PPP dollars): https://en.m.wikipedia.org/wiki/Disposable_household_and_per...

skilled · 2 years ago
Yup. My guess is they only released it to get usage data over the holiday season.
ferdinandis · 2 years ago
And give a heads-up to those who were about to purchase a ChatGPT Pro subscription as an Xmas present to wait one more month.
TillE · 2 years ago
Fortunately Google isn't very strict about geofencing Bard. I can get Gemini Pro by just using a common VPN.
hot_gril · 2 years ago
This is something that always bugs me about Google, bragging about something you can't even use. Waymo was like this for a while, then it actually came into existence but only in two cities as a beta run.
rvnx · 2 years ago
It's like the supposed amazing Stable Diffusion killers that nobody can use, or the music generation platform.

Deleted Comment

Deleted Comment

ZoomerCretin · 2 years ago
From a quick test, it is not as good as GPT4-turbo at this leetcode problem: https://leetcode.com/problems/calculate-money-in-leetcode-ba...

Of the three answers Bard (Gemini Pro) gave, none worked, and the last two did not compile.

GPT4-turbo gave the correct answer the first time.

I agree that it is overstated. Gemini Ultra is supposed to be better than GPT4, and Pro is supposed to be Google's equivalent of GPT4-turbo, but it clearly isn't.
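For reference, that problem has a simple closed form: full week i (0-indexed) deposits 28 + 7i dollars, and the partial week starts at w + 1 dollars per day. A sketch solution (mine, not taken from either model's output):

```python
def total_money(n: int) -> int:
    # Full weeks: week i (0-indexed) deposits 28 + 7*i dollars in total,
    # so w full weeks sum to 28*w + 7*(0 + 1 + ... + (w-1)).
    w, r = divmod(n, 7)
    full_weeks = 28 * w + 7 * w * (w - 1) // 2
    # Partial week: r days starting at w + 1 dollars, increasing by 1 daily.
    partial = r * (w + 1) + r * (r - 1) // 2
    return full_weeks + partial

print(total_money(4))   # -> 10
print(total_money(10))  # -> 37
print(total_money(20))  # -> 96
```

The expected outputs match the problem's published examples, so it makes a quick sanity check for any model-generated answer.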

Davidzheng · 2 years ago
GPT-3.5 is similar to Pro