Readit News
_wire_ · 3 months ago
Google's Gemini in search just makes up something that arbitrarily appears to support the query without care for context and accuracy. Pure confabulation. Try it for yourself. Ridiculous. It works as memory support if you know the result you're looking for, but if you don't, you can't trust it as far as you can throw it.

If you look carefully at Google Veo output, it's similarly full of holes.

It's plain there's no reasoning whatsoever informing the output.

Veo output with goofy wrongness

https://arstechnica.com/ai/2025/05/ai-video-just-took-a-star...

Tesla FSD goes crazy

https://electrek.co/2025/05/23/tesla-full-self-driving-veers...

camillomiller · 3 months ago
This baffles me like no other tech has before. Google is betting its core business on a pivot that relies on a massively faulty piece of technology. And as Ben Evans also says, promising that it will get better only gets you so far; it’s an empty promise. Yesterday, AI Overview made up an entire album by a dead Italian musician when I searched for a tribute event happening at a Berlin venue. It just took the name of the venue and claimed it was the artist’s most important work.

Funnily enough (not for Google), I copy-pasted that answer into ChatGPT, and it roasted AI Overview so badly for its mistakes, and with such sarcasm, that it even made me chuckle.

DanHulton · 3 months ago
It's the unfounded promises that this will be solved because the tech will only get better that really upset me. Because sure, it will get better, I'm pretty certain of that. They'll have additional capabilities, they'll have access to more-recent data, etc. But "better" does not necessarily equate to "will fix the lying problem." That's a problem that is BAKED INTO the technology, and requires some kind of different approach to solve -- you can't just keep making a hammer bigger and bigger in the hopes that one day it'll turn into a screwdriver.

Before LLMs really took off, we were in the middle of an "AI winter", where there just weren't any promising technologies, at least none with sufficient funding attached. And it's WORSE now. LLMs have sucked all the air out of the room, and all of the funding out of other avenues of research. Technologies that were "10-20" years away might now be 30-40, because there are fewer people researching them, with less money, and they might even be completely different people trying to restart the research after the old ones were recruited away to work on LLMs!

justmarc · 3 months ago
And suddenly this type of quality is becoming "normal" and acceptable now? Nobody really complains.

That is very worrying. Normally this would never fly, but nowadays it's kind of OK?

Why should false and/or inaccurate results be accepted?

TeMPOraL · 3 months ago
We lost that battle back when we collectively decided that sales and marketing is respectable work.
reaperducer · 3 months ago
And suddenly this type of quality is becoming "normal" and acceptable now?

The notion that "computers are never wrong" has been engrained in society for at least a century now, starting with scifi, and spreading to the rest of culture.

It's an idea that has caused more harm than good.

rchaud · 3 months ago
> Normally this would never fly, but nowadays it's kind of OK?

We started down this path ever since obvious bugs were reframed as "hallucinations".

Nuzzerino · 3 months ago
Complain to it enough times, remain resilient and you’ll eventually figure it out (that’s a wild card though). Or find someone who has and take their word for it (except you can’t because they’re probably indistinguishable from the ‘bot’ now according to the contradictory narrative). Iterate. Spiral. No one should have to go through that though. Be merciful.
meander_water · 3 months ago
I've recently started wondering what the long-term impact of AI slop is going to be. Will people get so sick of the sub-par quality that there will be a widespread backlash, and a renewed focus on handmade, artisanal products? Or will we go the other way, where everyone accepts the status quo and everything just gets shittier, and we have multiple cycles of AI slop trained on AI slop?
chronid · 3 months ago
Suddenly? That's the level of quality that is standard in all software projects I've ever seen since I've started working in IT.

Enshittification is all around us and is unstoppable, because we have deadlines to hit and goals to show the VP we've reached. We broke everything and the software is only half working? Come on, that's an issue for the support and ops teams. On to the next beautiful feature we can put on marketing slides!

krapp · 3 months ago
>Why should false and/or inaccurate results be accepted?

The typical response is "because humans are just as bad, if not worse."

veunes · 3 months ago
And how quickly the bar is being lowered
emrah · 3 months ago
When were search results 100% fact checked and accurate??
XorNot · 3 months ago
I use uBlock to remove Gemini responses from search, because even glancing at them is liable to bias my assumptions about whatever I'm looking for.

Information hygiene is a skill which started out important but is going to become absolutely critical.

MaxikCZ · 3 months ago
Half my browser extensions have sole purpose of removing shit from sites I visit.

HN is like a unicorn that hasn't made me block a single thing yet.

justmarc · 3 months ago
We can't expect the vast majority of regular users to have any of that skill.

What is this going to lead to? Fascinating times.

flomo · 3 months ago
I had a question about my car, so I googled '[year] [make] [model] [feature]'. This seems like the sort of thing that Google had always absolutely nailed. But now, 90% of the page was AI slop about the wrong model, the wrong year, even the wrong make. (There was one YouTube video that was sorta informative, so some credit.)

But way, way down at the very bottom of the page, there was the classic Google search answer on a totally unrelated car forum. Thanks CamaroZ28.com!

camillomiller · 3 months ago
This is a very, very good point. If this were happening with different queries that we'd never used, or with a new type of question, then I would have some patience. But it happens exactly with the formulations that used to give you the best results in the SERP!
gambiting · 3 months ago
I'm a member of a few car groups on Facebook and the misinformation coming from Google is infuriating, because people treat it as gospel and then you have to explain to them that the AI slop they were shown as the top result in Google is not - in fact - correct.

As a simple example - someone googled "how to reset sensus system in Volvo xc60" and Google told them to hold the button under the infotainment screen for 20 seconds and they came to the group confused why it doesn't work. And it doesn't work because that's not the way to do it, but Google told them so, so of course it must be true.

dingnuts · 3 months ago
Ad supported search has been awful for a few years now, just buy a Kagi subscription and you'll be like me: horrified but mildly amused with a dash of "oh that explains a lot" when people complain about Google

TeMPOraL · 3 months ago
That was true before AI too (I know, I did such searches myself). Google results have been drowning in slop for over a decade now - it was just human-generated slop, aka content marketing and SEO stuff.

I'm not defending the AI feature here, just trying to frame the problem: the lies and hallucinations were already there, but nobody cared because apparently people don't mind being constantly lied to by other people.

Nursie · 3 months ago
Yep, I was looking up a hint for the Blue Prince game the other day for the (spoiler alert?) casino room.

Google’s AI results proceeded to tell me all about the games available at the Blue Prince Casino down the road from here, where I know for a fact there’s only a prison, a Costco, a few rural properties and a whole lot of fuck-all.

It’s amazing to watch it fill in absolutely false, fabricated tripe at the top of their search page. It also frequently returns bad information on subjects like employment law and whatever else I look up.

It would be hilarious if people weren’t actually relying on it.

datavirtue · 3 months ago
I have had a lot of luck with Copilot conversations to research stocks and trading strategies. I am always skeptical of the results and verify everything against various sources, but it does help me find things and get on the right track.
veunes · 3 months ago
Yeah, it feels like we've crossed into a weird uncanny valley where AI outputs sound smarter than ever, but the underlying logic (or lack thereof) hasn't caught up
roywiggins · 3 months ago
I think it's just much easier for an LLM to learn how to be convincing than it is to actually be accurate. It just has to convince RLHF trainers that it's right, not actually be right. And the first one is a general skill that can be learned and applied to anything.

https://arxiv.org/html/2409.12822v1

MangoToupe · 3 months ago
I'm honestly so confused how people use LLMs as a replacement for search. All chatbots can ever find is data tangential to the stuff I want (e.g. I ask for a source, it gives me a quote). Maybe I was just holding search wrong?
TeMPOraL · 3 months ago
> eg i ask for a source, it gives me a quote

It should give you both - the quote should be attributed to where it was found. That's, generally, what people mean when they ask or search for "a source" of some claim.

As for general point - using LLMs as "better search" doesn't really look like those Google quick AI answers. It looks like what Perplexity does, or what o3 in ChatGPT does when asked a question or given a problem to solve. I recommend checking out the latter; it's not perfect, but good enough to be my default for nontrivial searches, and more importantly, it shows how "LLMs for search" should work to be useful.

mdp2021 · 3 months ago
> LLMs as a replacement for search

Some people expect LLMs as part of a better "search".

LLMs should be integrated into search; it's a natural application. Search results can depend heavily on lucky phrasing, since search engines work through sparse keywords, while LLMs let you use structured natural language (not "foo bar baz" but "Which foo did a bar baz?"), which should be resistant to variation in terms and exclude unrelated senses of those otherwise sparse keywords.

But it has to be done properly - understand the question, find material, verify the material, produce a draft reply, verify the draft vis-a-vis the material, maybe iterate...
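The loop described above can be sketched end to end. Everything here is a toy stand-in under my own assumptions: a keyword matcher over a two-document in-memory corpus plays the role of search, and the "draft" step just quotes retrieval verbatim rather than calling a real model:

```python
# Toy sketch of the loop above: understand -> find material -> draft -> verify
# the draft against the material. The corpus and matching logic are
# placeholders, not a real search backend or LLM.

CORPUS = {
    "volvo-reset": "To reset the Sensus system, hold the home button until the screen restarts.",
    "ps2-history": "The IBM PS/2 line was introduced in 1987.",
}

def find_material(question: str) -> list[str]:
    """Crude retrieval: return documents sharing any keyword with the question."""
    words = set(question.lower().split())
    return [doc for doc in CORPUS.values() if words & set(doc.lower().split())]

def draft_reply(question: str, material: list[str]) -> str:
    """Stand-in for the LLM: quote the first retrieved document verbatim."""
    return material[0] if material else "No supporting material found."

def verify_draft(draft: str, material: list[str]) -> bool:
    """Only accept drafts that appear verbatim in the retrieved material."""
    return any(draft in doc or doc in draft for doc in material)

def answer(question: str) -> str:
    material = find_material(question)
    draft = draft_reply(question, material)
    # Refuse rather than confabulate when the draft is not grounded.
    return draft if verify_draft(draft, material) else "No supporting material found."
```

The key design choice is the final line: when verification fails, the pipeline degrades to "I don't know" instead of emitting an ungrounded answer, which is exactly the behavior the AI Overview complaints in this thread are about.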

jazzyjackson · 3 months ago
Some chatbots plan a query and summarize what a search returns instead of trying to produce an answer on their own; I use perplexity a lot which always performs a search, I think ChatGPT et al have some kind of classifier to decide if web search is necessary. I especially use it when I want a suggestion without sifting through pages of top ten affiliate listicles (why is there a list of top 10 microwaves? I only need one microwave!)
MaxikCZ · 3 months ago
It's good to be shown a direction. When I only have a vague idea of what I want, AI usually helps me frame it into searchable terms I had no clue existed.
incangold · 3 months ago
I find LLMs are often better for X vs Y questions where search results were already choked by content farm chaff. Or at least LLMs present more concise answers, surrounded by fewer ads and less padding. Still have to double check the claims of course.
Garlef · 3 months ago
Maybe that's because we're conditioned by the UX of search.

But another thing I find even more surprising is that, at least initially, many expected that the LLMs would give them access to some form of higher truth.

christophilus · 3 months ago
I’ve had good results with Brave search, which self reports to use: Meta Llama 3, Mistral / Mixtral, and CodeLLM. It’s not always 100% accurate, but it’s almost always done the trick and saved me digging through more docs than necessary.
ImPostingOnHN · 3 months ago
gemini is the worst LLM I've used, whether directly or through search. As in your experience, it regularly makes stuff up, like language/application features, or command flags (including regarding google products), and provides helpful references to sources which do not say what is cited from them.

in my case, it does so roughly half the time, which is the worst proportion, because that means I can't even slightly rely upon the truth being the opposite of the output.

JimDabell · 3 months ago
Gemini was underwhelming until 2.5 Pro came along, which is very good. But in my experience all of the Google models are far worse than everything else when it comes to hallucination.
Kwpolska · 3 months ago
Google recently started showing me their AI bullshit. This made me pull the trigger and switch to DuckDuckGo as the primary search engine.

That said, some niche stuff has significantly better results on Google. But not in the AI bullshit. I searched for a very niche train-related term, and the bullshit response said condescendingly "this word does not exist, maybe you meant [similarly sounding but completely different word], which in the context of trains means ...". The first real result? Turns out the word does exist.

datavirtue · 3 months ago
I switched to DDG over seven years ago and just realized it had been that long when I read your comment. Google started wasting my time and I had to shift.
dijksterhuis · 3 months ago
fyi, you can remove any and all “ai” assistant bs etc from DDG if you use the noai subdomain (in case you wanna avoid their stuff, although it’s much less prominent anyway) https://noai.duckduckgo.com/
christophilus · 3 months ago
What’s the word?
sspiff · 3 months ago
I find this phenomenon really frustrating. I understand (or am at least aware of) the probabilistic nature of LLMs and their limitations, but when I point this out to my wife or friends as they misuse LLMs for tasks the models are both unsuited for and unreliable at, they wave their hands and dismiss my concerns as AI cynicism.

They continue to use AI for math (asking LLMs to split bills, for example) and treat its responses for factual data lookup as 100% reliable and correct.
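Bill splitting in particular needs no model at all; it is a deterministic two-liner. A minimal sketch (the cent-based representation and function name are my own choices, not anything from the thread):

```python
def split_bill(total_cents: int, people: int) -> list[int]:
    """Split a bill (given in cents) so every share differs by at most one cent."""
    base, remainder = divmod(total_cents, people)
    # The first `remainder` people pay one extra cent so the shares sum exactly.
    return [base + 1 if i < remainder else base for i in range(people)]

shares = split_bill(10000, 3)  # $100.00 among three people
print(shares)                  # [3334, 3333, 3333]
```

Unlike an LLM answer, this is exact every time, including the leftover cents that a naive division would lose.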

osmsucks · 3 months ago
> They continue to use AI for math (asking LLMs to split bills, for example)

Ah, yes, high tech solutions for low tech problems. Let's use the word machine for this number problem!

sspiff · 3 months ago
Many of them try to be mindful of their climate impact.

I've tried to explain it in those terms as well: every medium-sized prompt on these large models consumes roughly one phone battery charge worth of energy. You have a phone with a calculator.

I'd ask them to do the math on how much energy they're wasting asking stupid things of these systems, but I'm too afraid they'd ask ChatGPT to do the math.
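Taking the commenter's per-prompt figure at face value (it is a rough claim; published per-query estimates vary by orders of magnitude), the arithmetic they are asking for looks like this. All numbers are illustrative assumptions, not measurements:

```python
# Back-of-the-envelope arithmetic for the claim above.
# Illustrative assumptions only; real per-prompt energy estimates
# vary by orders of magnitude depending on model and measurement.
battery_wh = 3.0 * 3.85        # ~3 Ah phone cell at 3.85 V, about 11.55 Wh per charge
prompts_per_day = 20           # assumed casual usage
daily_kwh = battery_wh * prompts_per_day / 1000

print(f"{daily_kwh:.3f} kWh/day")
```

Under these assumptions that is roughly a quarter of a kilowatt-hour per day for questions a pocket calculator answers for free.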

thaumasiotes · 3 months ago
> Let's use the word machine for this number problem!

You know, that's a thought process that makes internal sense.

You have someone who's terrible at math. They want something else to do math for them.

Will they prefer to use a calculator, or a natural language interface?

How do you use a calculator without knowing what you're doing?

datavirtue · 3 months ago
I'm so lazy, I have chat bots do all kinds of complex calculations for me. I even use it as a stock screener and the poor thing just suffers, burning fuck tons of electricity.
veunes · 3 months ago
What's tricky is that for casual use, it gets things "close enough" often enough that people start building habits around it
jatora · 3 months ago
Using it for simple math is actually pretty hilarious. Hey, maybe they make sure to have it use Python! ...But I dream.
BlueTemplar · 3 months ago
Using LLMs (or platforms in general) is a bit like smoking (in closed spaces, with others present): a nuisance.
diggan · 3 months ago
That's just plain wrong, and I'm a smoker. LLMs won't affect the people around you unless they engage with them in some way. Sit next to me while I smoke and you'll be affected by passive smoking regardless of how much you engage. Not really an accurate comparison :)
JeremyNT · 3 months ago
> They continue to use AI for math (asking LLMs to split bills, for example) and treat its responses for factual data lookup as 100% reliable and correct.

I don't do this but isn't it basically... fine? I assume all the major chatbots can do this correctly at this point.

The trick here is that chatbots can do a wide range of tasks, so why context switch to a whole different app for something like this? I believe you'll find this happening more frequently for other use cases as well.

Usability trumps all.

JeremyNT · 3 months ago
Wish I could edit, but I was referring to the bill splitting math specifically here. I didn't mean to quote the rest.

When it comes to facts that actually matter, people need to know to verify the output.

minimaxir · 3 months ago
The simple "AI responses may include mistakes" disclaimer or ChatGPT's "ChatGPT can make mistakes. Check important info." CYA text at the bottom of the UI is clearly no longer sufficient. After years of news stories about LLM hallucinations in fact-specific domains, and with people still getting burnt by them, LLM providers should be more aggressive in educating users about their fallibility since hallucinations can't ever be fully fixed, even if it means adding friction.
eddythompson80 · 3 months ago
That doesn't really make sense. You either make the LLM provider liable for the model's output, or you have the current model. The friction already exists: all these AI companies and cloud providers are running "censored" models, and more censorship is added at every layer. What would more friction be here? More pop-ups?

Doing the former basically means killing the model-hosting business. Companies could develop models, use them internally and give them to their employees, but no public APIs exists. Companies strike legally binding contracts to use/license each other models, but the general public doesn't have access to those without something that would mitigate the legal risk.

Maybe years down the line, as attitudes soften, some companies would begin to push the boundaries. Automating the legal approval process, opening signups, etc.

minimaxir · 3 months ago
Yes, more pop-ups, retention metrics be damned. Even two years after ChatGPT, many people still think it's omniscient, which is what's causing trouble.
camillomiller · 3 months ago
Remember when Apple was roasted to hell anytime Maps would steer you into a wrong turn? Or when Google Maps would take you to the wrong place at the wrong time (like a sketchy neighborhood)? Those were all news stories they had to do PR crisis management for. Now they slap on a disclaimer like that and we’re all good to go. The amount of public-opinion forgiveness these technologies are granted is disproportionate and disheartening.
thejohnconway · 3 months ago
That always struck me as pretty overblown, given that before map apps, people got lost all the goddam time. It was a rare trip with any complexity that a human map reader wouldn’t make a mistake or two.

LLMs aren’t competing with perfect, they are competing with websites that may or may not be full of errors, or asking someone that may or may not know what they are talking about.

arcanemachiner · 3 months ago
Yeah, but we're all used to having software integrated into our lives now. And we all know how shitty and broken software often is...
ben_w · 3 months ago
Apple maps currently insists that there's a hotel and restaurant across the street from me.

According to the address on the business website that Apple Maps itself links to, the business is 432 km away from me.

tbrownaw · 3 months ago
> should be more aggressive in educating users about their fallibility

This might be an "experience is the best teacher" situation. It'd probably be pretty hard to invent a disclaimer that'd be as effective as getting bit.

minimaxir · 3 months ago
Unfortunately, getting bit in cases such as publishing misinformation or false legal citations wastes everyone's time, not just their own.
nyarlathotep_ · 3 months ago
> LLM providers should be more aggressive in educating users about their fallibility since hallucinations can't ever be fully fixed, even if it means adding friction

But they can't be, as the whole premise of the boom is replacing human intellectual labor. They've said as much on many occasions - see Anthropic's CEO going off about mass unemployment quite recently. How can the two of these co-exist?

userbinator · 3 months ago
The disclaimer needs to be in bold red text at the top.
neepi · 3 months ago
To be fair, people are pretty damn bad at verifying information. Despite my academic background, I catch myself skipping it all the time as well.

However, LLMs amplify this damage by sounding authoritative on everything, and even worse, by being promoted as authoritative problem solvers for all domains with only a small disclaimer. This doublethink is unacceptable.

But if they made the disclaimer bigger, the AI market would collapse in about an hour, much like people's usage does when they don't verify something and get shot down by someone actually authoritative. This has happened at work a couple of times and caused some fairly high-profile problems. Many people refuse to use it now.

What we have is a bullshit generator propped up by avoiding speaking the truth, because the truth compromises the promoted utility. Classic bubble.

YetAnotherNick · 3 months ago
You are assuming that the people burnt by LLM responses don't know that ChatGPT can make mistakes?
mdp2021 · 3 months ago
> The simple ...

No, improper phrasing. The correct disclaimer is, "The engine below is structurally unreliable".

--

Comment, snipers. We cannot reply to unclear noise.

jll29 · 3 months ago
Language models are not designed to know things, they are designed to say things - that's why they are called language models and not knowledge models.

Given that a bunch of words have already been generated, it always adds the next word based on how common the sequence is.

The reason you get different answers each time is the effect of the pseudo-random number generator on picking the next word. The model looks at the probability distribution over the most likely next words, and when the configuration parameter called "temperature" is 0 (it is actually not possible to set it to 0 in the GUI), there is no random influence, and strictly the most likely next word (top-1 MLE) is always chosen. This leads to output that we would classify as "very boring".
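That sampling step can be sketched in a few lines. This is a generic illustration, not any vendor's actual implementation, and the logits are made-up numbers:

```python
import math
import random

def sample_next(logits: dict[str, float], temperature: float) -> str:
    """Pick the next token from raw scores.

    At temperature 0 this degenerates to greedy (top-1) decoding, which is
    deterministic; higher temperatures flatten the distribution and let the
    pseudo-random number generator pick less likely tokens.
    """
    if temperature == 0:
        return max(logits, key=logits.get)  # always the single most likely token
    weights = {tok: math.exp(score / temperature) for tok, score in logits.items()}
    total = sum(weights.values())
    r = random.uniform(0, total)
    for tok, w in weights.items():
        r -= w
        if r <= 0:
            return tok
    return tok  # guard against floating-point leftovers
```

At temperature 0 the same prompt always yields the same continuation; at nonzero temperature, repeated queries can give a correct answer one time and a confabulated one the next, which is exactly the variability described above.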

So the model knows nothing about IBM, PS/2, 80286 versus 80486, CPUs, 280 or any models per se. -- One of the answers seems to suggest that there is no model 280, I wonder whether that one was generated through another process (there is a way to incorporate user feedback via "reinforcement learning"), or whether that was a consequence of the same randomized next-word picking, just a more lucky attempt.

otabdeveloper4 · 3 months ago
> This leads to output that we would classify as "very boring".

Not really. I set temperature to 0 for my local models, it works fine.

The reason why the cloud UIs don't allow a temperature of 0 is because then models sometimes start to do infinite loops of tokens, and that would break the suspension of disbelief if the public saw it.

mdp2021 · 3 months ago
Which local models are you using, that do not output loop garbage at temperature 0?

What do you get at very low temperature values instead of 0?

verisimi · 3 months ago
> Language models are not designed to know things, they are designed to say things - that's why they are called language models and not knowledge models.

This is true. But you go to Google not to 'have a chat' but ostensibly to learn something grounded in knowledge.

Google seems to be making an error in swapping the provision of 'knowledge' for 'words', you'd think; but then again, perhaps it makes no difference when it comes to advertising dollars, which is their actual business.

neilv · 3 months ago
On the Google search web site, the weak small-print disclaimer "AI responses may include mistakes." is also hidden behind the "Show more" button.

When OpenAI launched ChatGPT, I had to explain to a non-CS professor that it wasn't AI like they're thinking of, but currently more like a computational parlor trick that looks a lot like AI.

But turns out this parlor trick is awesome for cheating on homework.

Also good at cheating at many other kinds of work, if you don't care much about quality, nor about copyrights.

stavros · 3 months ago
I really don't understand the view that it's a "parlor trick that looks like AI". If it's not "a thing that can write code", but instead just looks like a thing that can write code (but can actually write code), it can write code. All the "no true Scotsman" stuff about what it's doing behind the scenes is irrelevant, because we have no idea what human brains are doing behind the scenes either.
ben_w · 3 months ago
Although I broadly agree, I wouldn't go quite as far as where you say:

> All the "no true Scotsman" stuff about what it's doing behind the scenes is irrelevant, because we have no idea what human brains are doing behind the scenes either.

Computers and transistors have a massive speed advantage over biological brains and synapses — literally, not metaphorically, the same ratio as the difference between how far you walk in a day and how far continents drift, with your brain being the continental drift — which means they have the possibility of reading the entire Internet in a few weeks to months to learn what they know, and not the tens to hundreds of millennia it would take a human.

Unfortunately, the method by which they acquire information and knowledge, is sufficiently inefficient that they actually need to read the entire Internet to reach the skill level of someone who has only just graduated.

This means I'm quite happy to *simultaneously* call them extremely useful, even "artificial general intelligence", and yet also agree with anyone who calls them "very very stupid".

If we actually knew how our brains did this intelligence thing, we could probably make AI genuinely smart as well as absurdly fast.

hnlmorg · 3 months ago
Their point wasn’t that it’s not useful. It’s that it isn’t artificial intelligence like the masses consider the term.

You wouldn’t say Intellisense isn’t useful but you also wouldn’t call it “AI”. And what LLMs are like is basically Intellisense on steroids (probably more like a cocktail of speed and acid, but you get my point)

keiferski · 3 months ago
It matters if we are making a distinction between essence and output.

On the output side, there's functionally not much difference, at least for more abstract things like writing code. Although I would argue that AI output still doesn't match the complexity and nuance of an individual human being, and may never do so, simply because the AI is only simulating embodiment and existence in the world. It might need to simulate an Earth equivalent to truly simulate a human's personal output.

On the essence side, it's a much clearer distinction. We have numerous ways of determining whether a thing is human or not - biology, for one. It would take some serious sci-fi to get to the point where an android is indistinguishable from a human on the cellular level.

neilv · 3 months ago
Historically, there's been some discussion about that:

https://en.wikipedia.org/wiki/Chinese_room

otabdeveloper4 · 3 months ago
LLMs can't write code.

They don't have the capacity to understand logical or temporal relationships, which is the core competency of coding.

They can form syntactically valid strings in a formal language, which isn't the same thing as coding.

loa_in_ · 3 months ago
It's a memory-augmentation/information-retrieval tool with a flexible input and output interface.
9x39 · 3 months ago
Gemini appears tuned to try to handle the typical questions people type in, while more traditional things you search for get some confabulated nonsense.

I've observed that a great many people trust the AI Overview as an oracle. IMO, it's how 'normal' people interact with AI if they aren't direct LLM users. It's not even age-gated like trust in the news - trusting AI outputs seems to cross most demographics. We love our confident-based-on-nothing computer answers as a species, I think.

eddythompson80 · 3 months ago
I think Google is in a particularly bad situation here.

For over a decade now, that spot on the search page had the "excerpt from a page" UI, which made a lot of sense. It cut out an extra click, and if you trusted the source site, and presumably Google's "Excerpt Extraction Technology" (whatever that was), what was left not to trust? It was a very trustworthy way to locate information.

Like, if I search for a quick medical question and there is an excerpt from the Mayo Clinic, I trust the Mayo Clinic, so that's good enough for me. Sometimes I'd copy the excerpt from Google, go to the page, and ctrl-F it.

Google used to do a decent job of picking reputable sources, and the excerpts really were found on the page in an unaltered context, so it was good enough to build trust. That system has degraded over the years in terms of how good it is at picking reputable sources, most likely because it was SEO-gamed.

However, it has been replaced with the AI Overview. I'm not against AI, but AI is fundamentally different from "a relevant excerpt from a source you trust, with a verifiable source, in milliseconds".

tsunamifury · 3 months ago
How could you think this hard and be so far off. Google is in a hyper strong position here and I don’t even like them.

They can refine grounded results and begin serving up increasingly well-reasoned answers over time as models improve, cost-effectively. Then that drives better vectors for ads.

Like what about this is hard to understand?

geraneum · 3 months ago
> if they aren't direct LLM users

My manager, a direct LLM user, uses the latest models to confirm his assumptions. If they are not confirmed on the first try, he proceeds to rephrase the question until he gets what he wants from them.

edit: typo

danielbln · 3 months ago
We love our confident-based-on-nothing answers period, computer or not.
chneu · 3 months ago
Most folks just want confirmation. They don't want to have their views/opinions changed. LLMs are good at giving folks what they're looking for.
mdp2021 · 3 months ago
Repent.

You are not there to "love what gives you the kicks". That's a kind of love that should not exit the bedroom (better, the bathroom).

Llamamoe · 3 months ago
I already went through a realization a while ago: you can't just mention something to people anymore and expect them to learn about it by searching the web, like you once could, because everything is unreliable, misleading SEO spam slop.

I shudder to think how much worse this is going to be with "AI Overview". Are we entering an era of people googling "how does a printer work" and (possibly) being told that it's built by a system of pulleys and ropes and just trusting it blindly?

Because that's the kind of magnitude of errors I've seen in dozens of searches I've made in the domains I'm interested in, and I think everyone has seen the screenshots of even more outlandish - or outright dangerous - answers.

hannob · 3 months ago
"AI Responses May Include Mistakes" is really the one, single most important thing I want to shout into the whole AI debate.

It also should be the central issue - together with the energy/climate impacts - in every debate about AI ethics or AI safety. It's those two things that will harm us most if this hype continues unchecked.

consp · 3 months ago
The problem is not that it may make mistakes, but that it will. People don't realize this and treat it as an almighty oracle. It's a statistical model, after all; there is a non-zero chance of the monkey producing the works of Shakespeare.
rcarmo · 3 months ago
This is why Google has got search fundamentally wrong. They just don’t care about accuracy of results anymore, and worry mostly about providing a quick answer and a bunch of sponsored links below it.
Llamamoe · 3 months ago
Except that out of 10 answers, the "quick answer" is subtly wrong 6 times, egregiously wrong 2, and outright dangerous once. I've seen screenshots of stuff that would get people killed or in legal trouble.
dandanua · 3 months ago
They just continue the Eric Schmidt idea "More results are better than none". It has evolved to "It's better to hallucinate than produce a negative answer", I guess.