Been thinking about this a lot [1]. Will this fundamentally change how people find and access information? How do you create an experience so compelling that it replaces the current paradigm?
The future promised in Star Trek and even Apple's Knowledge Navigator [2] from 1987 still feels distant. In those visions, users simply asked questions and received reliable answers - nobody had to fact-check the answers ever.
Combining two broken systems - compromised search engines and unreliable LLMs - seems unlikely to yield that vision. Legacy ad-based search has devolved into a wasteland of misaligned incentives and conflicts of interest, and has proliferated a web full of content farms optimized for ads and algorithms instead of humans.
The path forward requires solving the core challenge: actually surfacing the content people want to see, not what intermediaries want them to see - which means a different business model in search, one with no intermediaries. I do not see a way around this. Advancing models without advancing search is like having a Michelin-star chef work with spoiled ingredients.
I am cautiously optimistic we will eventually get there, but boy, we will need a fundamentally different setup in terms of incentives involved in information consumption, both in tech and society.
I agree that this is the core question, but I'd put it as: Who Gets To Decide What Is True?
With a search paradigm this wasn't an issue as much, because the answers were presented as "here's a bunch of websites that appear to deal with the question you asked". It was then up to the reader to decide which of those sites they wanted to visit, and therefore which viewpoints they got to see.
With an LLM answering the question, this is critical.
To paraphrase a recent conversation I had with a friend: "in the USA, can illegal immigrants vote?" has a single truthful answer ("no" obviously). But there are many places around the web saying other things (which is why my friend was confused). An LLM trawling the web could very conceivably come up with a non-truthful answer.
This is possibly a bad example, because the truth is very clearly written down by the government, based on exact laws. It just happened to be a recent example that I encountered of how the internet leads people astray.
A better example might be "is dietary saturated fat a major factor for heart disease in Western countries?". The current government publications (which answer "yes") for this are probably wrong based on recent research. The government cannot be relied upon as a source of truth for this.
And, generally, allowing the government to decide what is true is probably a path we (as a civilisation) do not want to take. We're seeing how that pans out in Australia and it's not good.
> To paraphrase a recent conversation I had with a friend: "in the USA, can illegal immigrants vote?" has a single truthful answer ("no" obviously)
Er, no, the meaning of the question is ambiguous, so I'm not sure "has a single truthful answer" is accurate. What does "can" mean? If you mean "permitted", then no. But if you mean can they vote anyway and get away with it? It's clearly happened before (as rare as it might have been), so technically the answer to that would be yes.
Neal Stephenson's book _Fall; or, Dodge in Hell_, from 2019, dedicates many words to this concept. Briefly summarized, it explores a post-truth world, describing the period when people could agree on the truth as a narrow time-slice in history. People have their own individual internet filters, and the USA becomes divided into Afghanistan-like tribes, each an echo chamber ("Ameristan").
> A better example might be "is dietary saturated fat a major factor for heart disease in Western countries?". The current government publications (which answer "yes") for this are probably wrong based on recent research. The government cannot be relied upon as a source of truth for this.
I know it was just an example, but actually no, the role of dietary saturated fat as a factor for heart disease remains very much valid. I’m not sure which recent studies you’re referring to, but you can't undo over 50 years of research on the subject so easily. What study were you thinking about?
> To paraphrase a recent conversation I had with a friend: "in the USA, can illegal immigrants vote?" has a single truthful answer ("no" obviously). But there are many places around the web saying other things (which is why my friend was confused). An LLM trawling the web could very conceivably come up with a non-truthful answer.
There are many jurisdictions where some illegal immigrants (Dreamers) are allowed to vote, including New York City[1].
I was thinking something similar. I like the Google auto-AI summary for little trivia facts and objective things (who played so-and-so in the movie? How heavy is a gallon of milk?). This is stuff that is verifiable and could theoretically just be queried from some sort of knowledge base.
But for anything remotely subjective, context dependent, or time sensitive I need to know the source. And this isn’t just for hot button political stuff — there’s all sorts of questions like “is the weather good this weekend?”, “Are puffy jackets cool again?”, “How much should I spend on a vacation?”
What is truth anyway? I see it as a quicker version of browsing the web to get a summary of what people say. As you said, with search you get a bunch of websites where strangers talk about a certain topic. You read a dozen and see if they agree with each other or if they sound legit. There is just a huge overlap between what we consider true and what (an overwhelming majority of) people agree on. A lot of things reduce to consensus. If you ask a non-obvious question, you usually get an answer of the form "a lot of people you consider trustworthy dedicated some time to studying this question and agreed that the answer is X". But then people can be wrong, a big group of people can be wrong, people can be bribed, or perhaps you don't actually trust these people that much. The internet can't tell you the truth. An LLM can't tell you the truth. But they can summarize what other people in the world say on this subject.
> I agree that this is the core question, but I'd put it as: Who Gets To Decide What Is True?
Well, outside of matters of stark factuality (what time does the library close?, what did MSFT close at?), many things people may be "searching for" (i.e. trying to find information about) are more in the realm of informed opinion and summary where there is no right or wrong, just a bunch of viewpoints, some probably better informed than others.
I think this is the value of traditional search where the result is a link to a specific page/source whose trustworthiness or degree of authority you can then judge (e.g. based on known reputation).
For AI generated/summarized "search results", such as those from "ChatGPT Search" (awkward name - bit like a spork), the trustworthiness or degree of authority is only as good as the AI (not person) that generated it, and given today's state of technology where the "AI" is just an LLM (prone to hallucination, limited reasoning, etc), this is obviously a bit of an issue...
Even in the future, when presumably human-level AGI will have made LLMs obsolete, I think it'll still be useful to differentiate search from RAG AGI search/chat, since it'll still be useful to know the source. At that point the specific AGI might be regarded as a specific person, with its own areas of expertise and biases.
The name "ChatGPT Search" is very awkward - presumably they are trying to position this as a potential Google competitor and revenue generator (when the inevitable advertisements come), but at the end of the day it's just RAG.
> It was then up to the reader to decide which of those sites they wanted to visit,
With current search engines, Google decides for you ("helped" by SEO experts falling over themselves to rank higher because their revenue directly depends on it). In theory, you could go and read a few dozen pages and decide for yourself.
In reality, non-technical users will click on the first link that seems to be related to the question, and that's it.
Even with AI-based search (or Q&A instead of search) I think the same will happen. There is and will be a huge reward for gaming the results, be they page links or RAG snippets that rank for a query. I've already seen many SEO shops advertising their strategies to keep their customers' businesses relevant in the chatbot era. As this approach becomes more prevalent, you can be sure many smart people will run many experiments to figure out how best to please the new algorithm.
In other words, AI-based search is a UX optimization, but it doesn't address the core problem: how do you decide what content is best, do that in the context of each user, and do that while maximizing the benefit for the user over profit for the company doing it?
So we have two huge hurdles:
1. who will decide what the user wants[0] to see, and how are incentives for that entity aligned with the user's
2. how is that entity supposed to find the information needle in the haystack of slop that's 90%+ of the current web?
[0] "wants" in a rational "give me the best possible information" meaning, not in "what keeps them addicted, their heart rate up, and what will drive engagement" meaning
> To paraphrase a recent conversation I had with a friend: "in the USA, can illegal immigrants vote?" has a single truthful answer ("no" obviously). But there are many places around the web saying other things (which is why my friend was confused).
This doesn’t have a single truthful answer. Some states don’t have voter ID laws, so the truth can depend on the state. In states without voter ID laws, there’s not much keeping someone from voting twice or more under different names, except significant moral qualms about subverting, instead of preserving, everyone else’s right to vote in a democratic republic. Someone can assume the name of a person from another country who could plausibly have come in illegally. Without a picture ID, they can’t claim you aren’t that person.
Can an illegal immigrant vote? Yes, in states without voter ID laws, technically anyone can vote, even convicted terrorists. Should an illegal immigrant vote? No, they’re not supposed to be able to vote and there may be consequences if caught.
What purpose do a lack of voter ID laws serve except the obvious conclusion which is to enable cheating?
> With a search paradigm this wasn't an issue as much, because the answers were presented as "here's a bunch of websites that appear to deal with the question you asked". It was then up to the reader to decide which of those sites they wanted to visit, and therefore which viewpoints they got to see.
It is very similar. Google decides what to present to you on the front page. I'm sure there are metrics on how few people get past the front page. Heck, isn't this just Google Search's business model? Determining what you see (i.e. what is "true") via ads?
In much the same way that the Councils of Carthage chose to omit the acts of Paul and Thecla in the New Testament, all modern technology providers have some say in what is presented to the global information network, more or less manipulating what we all perceive to be true.
Recent advancements have just made this problem much more apparent to us. But look at history and see how few women priests there are in various Christian churches and you'll notice even a small omission can have broad impacts on society.
"Are all bachelors married?" vs "is it raining outside?"
And then Quine points out that the definitions of "bachelor" and "married" are themselves contingent on outside factors.
"Can illegal immigrants vote?", while close to being an analytic proposition, still depends on an empirical approach that can never be mediated by text, video, etc. All propositions are by necessity experiential. Nullius in verba!
So the truth is and has always been what happens when you get off your butt and go out and test the world.
This is not to say that we don't benefit from language. It makes for a great recipe. If you follow the instructions to bake a cake and you get what you expected you know that the recipe was true. The same goes for the laws of science, search engine results, and generative AI.
Unfortunately most people seem comfortable with the idea of a Ministry of Truth, at least here in Brazil and from what I read on the web, in many parts of the world also.
I think this suggests an approach: move closer to the search engine model. In both academic writing and news journalism, factual claims try to cite their sources. And you can see this is a direction some of the chat systems have gone. We don't want to always pedantically cite a reference for every question. But maybe we should? At least at an architectural level, and let the UX decide if it should be displayed.
> in the USA, can illegal immigrants vote
According to Author X in Book Y, which studied this topic in depth: foo answer
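One way to cash that out architecturally (a sketch of my own, not any existing product's schema): every claim carries its citations internally, and the UX decides whether to render them.

    from dataclasses import dataclass, field

    @dataclass
    class Citation:
        source: str   # e.g. "Author X, Book Y" or a URL
        excerpt: str  # the passage the claim rests on

    @dataclass
    class Claim:
        text: str
        citations: list = field(default_factory=list)

    @dataclass
    class Answer:
        claims: list

        def render(self, show_citations=False):
            # Citations always exist at the architectural level; the UX merely
            # chooses whether to surface them for this query.
            lines = []
            for c in self.claims:
                lines.append(c.text)
                if show_citations:
                    lines += [f"  [{cit.source}]" for cit in c.citations]
            return "\n".join(lines)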
I don’t know who should get to decide “what is true” but I think we all agree that we won’t get anything remotely akin to truth so long as search is driven by for-profit interests trying to steer us towards their products first and our actual query (a distant) second.
> With a search paradigm this wasn't an issue as much, because the answers were presented as "here's a bunch of websites that appear to deal with the question you asked
I don't think this appropriately credits Google's power with regards to what you are seeing
For the record, in my local elections, we have plenty of people voting illegally.
"Is it legal" is very different from "Can they."
By "local," I mean municipal and below. I didn't mean "federal elections conducted in my locality." Election security kicks for state, federal, and some municipal elections.
Others are intentionally fraudulent (e.g. local corruption) or unintentionally broken (e.g. using Google Forms for a school-level public body, where people not legally qualified to vote might still do it, unaware they're committing a felony).
And "public body" has a specific meaning under my state law which extends the same laws as e.g. cutting for my state senate. That's bodies like local school boards, but not random school clubs.
That's the level where we have massive illegal voting where I live.
For any given statement, the answer up until a couple years ago was, "the speaker". Speakers get to decide what to say, but they're also responsible for what they say. But now with LLMs we have plausible text without a speaker.
I think we have a number of historical models for that. A relevant one is divination. If you bring your question to the haruspex, your answer is read out of the guts of a sacrificed animal. If the answer is wrong, who do you blame? The traditional answer is the gods, or perhaps nobody.
But we know now that fortune tellers are just selling answers while pretending not to be responsible for them. Which points us at one solution: anybody selling or presenting LLM output as meaningful is legally responsible for the quality of the product.
Unfortunately, another model is the modern corporation. Sometimes the people in a company intentionally lie. More often, statements are made by one person based on a vision or optimism or confusion or bullshit. Nobody set out to lie, but nobody really cared about the truth, at least not as much as everybody cared about making money.
So I'd agree that the government doesn't have much role in deciding The Truth. Similarly, the government shouldn't have much role in controlling what you eat. But in both cases, I think there's plenty of role for the government in ensuring that companies selling good food or good information have sound production and quality control measures to ensure that they are delivering what consumers are expecting.
> "in the USA, can illegal immigrants vote?" has a single truthful answer ("no" obviously)
Legally no; practically, yes. In most states, you simply must attest that you are a citizen in order to register. In many states, non-citizens have been auto-registered to vote when obtaining drivers' licenses. Reddit is full of panicked immigrants concerned that they found themselves registered to vote, and worried about how that would affect their status.
Seems to be a particularly US-based phenomenon. Unlike the more transparent manipulation seen under dictatorships, where people generally recognize propaganda for what it is even if they can't openly challenge it, some in the USA live within entirely new realities.
I genuinely think Kagi has led the way on this one. Simplicity is beautiful and effective, and Kagi has (IMHO) absolutely nailed it with their AI approach. It's one of those things that in hindsight seems obvious, which is a pretty good measure of how good an idea is IMHO.
Google could have done it and kind of tried, although their AI sucks too much. I'm very surprised that OpenAI hasn't done this sooner as well. Their initial implementation of web search was sad. I don't mean to be super critical as I think generally OpenAI is very, very good at what they do, but their initial browse-the-web was a giant hack that I would expect from an intern who isn't being given good guidance by their mentors.
Once mainstream engines start getting on par with Kagi, there's gonna be a massive wave of destruction and opportunity. I'm guessing there will be a lot of new paywalls popping up, and lots of access deals with the search engines. This will raise the barrier to entry even further for new search entrants, and will further fragment information access between the haves and have-nots.
I'm also cautiously optimistic though. We'll get there, but it's gonna be a bit shaky for a minute or two.
> I'm also cautiously optimistic though. We'll get there, but it's gonna be a bit shaky for a minute or two.
But I don't understand how all of these AI results (note I haven't used Kagi so I don't know if it's different) don't fundamentally and irretrievably break the economics of the web. The "old deal" if you will is that many publishers would put stuff out on the web for free, but then with the hope that they could monetize it (somehow, even just with something like AdSense ads) on the backend. This "deal" was already getting a lot worse over the past years as Google had done more and more to keep people from ever needing to click through in the first place. Sure, these AI results have citation results, but the click-through rates are probably abysmal.
Why would anyone ever publish stuff on the web for free unless it was just a hobby? There are a lot of high quality sites that need some return (quality creators need to eat) to be feasible, and those have to start going away. I mean, personally, for recipes I always start with ChatGPT now (I get just the recipe instead of "the history of the domestication of the tomato" that Google essentially forced on recipe sites for SEO competitive reasons), but why would any site now ever want to publish (or create) new high quality recipes?
Can someone please explain how the open web, at least the part of the web the requires some sort of viable funding model for creators, can survive this?
I gave Kagi a shot two weeks ago, and it instantly impressed me. I didn't realize how much search could be improved. It's a beautiful, helpful experience.
Could it be that Kagi benefits from being niche, though? Google search gets gamed because it’s the most popular and therefore gaming it gives the best return. I wonder if Kagi would have the same issues if it was the top dog.
> Will this fundamentally change how people find and access information? How do you create an experience so compelling that it replaces the current paradigm?
I think it's already compelling enough to replace the current paradigm. Search is pretty much dead to me. I have to end every search with "reddit" to get remotely useful results.
The concern I have with LLMs replacing search is that once it starts being monetized with ads or propaganda, it's going to be very dangerous. The context of results is scrubbed.
> The concern I have with LLMs replacing search is that once it starts being monetized with ads or propaganda, it's going to be very dangerous.
Not to mention that users consuming most content through a middle-man completely breaks most publishers' business models. Traditional search is a mutually beneficial arrangement, but LLM search is parasitic.
Expect to see a lot more technical countermeasures and/or lawsuits against LLM search engines which regurgitate so much material that they effectively replace the need to visit the original publisher.
* Stack Overflow. Damaged by its new owner, but the idea lives.
* Reddit. Google tries to fuck it up with "auto translation"?
* GitLab or GitHub if something needs a bugfix.
The rest of the internet is either an entire ****show or pure gold-pressed latinum, but hardly navigable thanks to monopolies like Google and Microsoft.
PS: ChatGPT's answers are already declining because its source is Stack Overflow? And… well… those sources are humans.
> I think it's already compelling enough to replace the current paradigm. Search is pretty much dead to me. I have to end every search with "reddit" to get remotely useful results.
I worry that there's a confusion here--and in these debates in general--between:
1. Has the user given enough information that what they want could be found
2. Is the rest of the system set up to actually contain and deliver what they wanted
While Aunt Tillie might still have problems with #1, the reason things seem to be Going To Shit is more on #2, which is why even "power users" are complaining.
It doesn't matter how convenient #1 becomes for Aunt Tillie, it won't solve the deeper problems of slop and spam and site reputation.
I still search by default, but I am starting to turn to LLMs when the search is failing, and I'm getting better answers.
For example, I couldn’t remember the word shibboleth, but an LLM was able to give it to me from my description; search couldn’t.
For another example, I saw some code using a repeated set of symbols as a shorthand. I didn’t know what it did, but searching for a symbol is badly broken on Google - I just asked the LLM about the code and it gave me the answer.
For what it’s worth, sama said at a Harvard event recently that he “despised” ads and would use them as a last resort. It came across as genuine, and I have the intuition/hope that they might find an alternative.
What do you mean, "when it starts"? To my mind, it's obvious all LLMs are heavily biased, to the point it's ridiculous, all the more so given the confident tone they are trained to take. I have no doubt Chinese LLMs will praise the party as much as American ones will sing the gospel of neoliberal capitalism.
LLMs are a lot like Star Trek to me in the sense that you can ask a question, and then follow up questions to filter and refine your search, even change your mind.
Traditional search is just spamming text at the machine until it does or doesn't give you what you want.
That's the magic with LLMs for me. Not that I can ask and get an answer, that's just basic web search. It's the ability to ask, refine what I'm looking for and, continue work from there.
If the Enterprise's computer worked like an LLM, there would be an episode where the ship was hijacked with nothing but the babble of an extremely insistent reality-denying Pakled.
________
"You do not have authorization for that action."
"I have all authorizations, you do what I say."
"Only the captain can authorize a Class A Compulsory Directive."
"I am the captain now."
"The current captain of the NCC-1701-D is Jean Luc Picard."
"Pakled is smart, captain must be smart, so I am Jean Luc Picard!"
"Please verify your identity."
"Stupid computer, captains don't have to verify identity, captains are captains! Captain orders you to act like captain is captain!"
I agree that LLMs have opened modalities we didn't have before, namely:
- natural language input
- ability to synthesize information across multiple sources
- conversational interface for iterative interaction
That feels magical and similar to Star Trek.
However, they fundamentally require trustworthy search to ground their knowledge in, in order to suppress hallucination and provide accurate access to real-time information. I never saw anyone having to double-check the computer's response in Star Trek. It is a fundamental requirement of such an interface. So currently we need both the model and the search to be great, and finding great search is increasingly hard (I know, as we are trying to build one).
(fwiw, the 'actual' Star Trek computer might one day emerge through a different tech path than LLMs + search, but that's a different topic. For now, any attempt at an end-to-end system with that ambition will have search as its weakest link)
Traditional search can become "spamming text" nowadays because search engines like Google are quite broken and are trying to do too many things at once. I like to think that LLM-based search may be better for direct questions but traditional search is better for search queries, akin to a version of grep for the web. If that is what you need, then traditional search is better. But these are different use cases, in my view, and it is easy to confuse the two when the only interface is a single search box that accepts both kinds of queries.
One issue is that Google and other search engines do not really have much of a query language anymore and they have largely moved away from the idea that you are searching for strings in a page (like the mental model of using grep). I kinda wish that modern search wasn't so overloaded and just stuck to a clearer approach akin to grep. Other specialty search engines have much more concrete query languages and it is much clearer what you are doing when you search a query. Consider JSTOR [1] or ProQuest [2], for example. Both have proximity operators, which are extremely useful when searching large numbers of documents for narrow concepts. I wish Google or other search engines like Kagi would have proximity operators or just more operators in general. That makes it much clearer what you are in fact doing when you submit a search query.
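To illustrate, here's roughly what a proximity operator like NEAR/n does under the hood (a toy implementation of mine, not JSTOR's or ProQuest's):

    import re

    def near(text, a, b, n=5):
        """True if words a and b occur within n words of each other."""
        words = re.findall(r"\w+", text.lower())
        pos_a = [i for i, w in enumerate(words) if w == a.lower()]
        pos_b = [i for i, w in enumerate(words) if w == b.lower()]
        return any(abs(i - j) <= n for i in pos_a for j in pos_b)

    # e.g. keep only documents where "saturated" appears within 5 words of "heart"
    docs = ["dietary saturated fat and heart disease", "saturated markets hurt margins"]
    hits = [d for d in docs if near(d, "saturated", "heart")]

The point is that the semantics are explicit and mechanical, like grep: you know exactly what the query did to produce the results.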
Memory is only a con in some use cases. If the LLM goes down the wrong path, sometimes it's impossible to get it to think differently without asking it to wipe its memory or starting a new session with a blank context.
> those visions, users simply asked questions and received reliable answers - nobody had to fact-check the answers ever.
It’s a fallacy then. If my mentor tells me something I fact check it. Why would a world exist where you don’t have to fact check? The vision doesn’t have fact checking because the product org never envisioned that outlier. A world where you don’t have to check facts, is dystopian. It means the end of curiosity and the end of “is that really true? There must be something better.”
You’re just reading into marketing and not fact checking the reality in a fact-check-free world.
If you can’t trust the result of a query how can you trust the check on that query which is itself a query? If no information is trustworthy how do you make progress?
Solving this problem will require us to stop using the entire web as a source of information. Anyone can write anything and put it up on the web, and LLMs have no way to distinguish truth from fantasy.
Limiting responses to curated information sources is the way forward. Encyclopedias, news outlets, research journals, and so on.
No, they're not infallible. But they're infinitely better than anonymous web sites.
You are quite right, and not only can anyone write anything, but you have a double whammy from the LLM which can further hallucinate from said information.
We had this a long time ago; it is called "books". It is also not very usable for niche topics, or for recent events (because curators need time to react).
> In those visions, users simply asked questions and received reliable answers - nobody had to fact-check the answers ever.
This also seems like a slightly ridiculous premise. Any confident statement about the real world is never fully reliable. If Star Trek were realistic, the computer would have been wrong once in a while (preferably with dramatically disastrous consequences), just as the humans it was likely built around are frequently wrong, even via consensus.
This feels like hyperbole to me. People can reasonably expect Wikipedia to have factual data even though it sometimes contains inaccuracies. Likewise if people are using ChatGPT for search it should be somewhat reliable.
If I'm asking ChatGPT to put an itinerary together for a trip (OpenAI's suggestion, not mine), my expectation is that places on that itinerary exist. I can forgive them being closed or even out of business but not wholly fabricated.
Without this level of reliability, how could this feature be useful?
Star Trek had tech so advanced that they accidentally created AGIs more than once. Presumably they didn't show the fact checking as it was done automatically by multiple, independent AGIs designed for the task with teams of top people monitoring and improving them.
I've been using ChatGPT for about 6 weeks as my go-to for small questions (when is sunset in sf today? list currencies that start with the letter P, convert this timestamp PDT to GMT, when is the end of Q1 2025?) and it's been great/99% accurate. If there was ever a "google killer" I think it's the ad free version of ChatGPT with better web search.
Google started off with just web search, but now you can get unit conversions and math and stuff. ChatGPT started in the other direction and is moving to envelop search. Not being directed to sites that mostly serve Google ads is a double benefit. I'll gladly pay $20/30/mo for an ad-free experience, particularly if it improves 2x in quality over the next year or two. It's starting to feel like a feature-complete product already.
Remember the semantic web, where people would annotate web pages with data? I think the "meta" keywords tag was the closest that ever got to adoption.
Guess why it failed? It was largely used as a way to trick search engines. For the same reason, your vision of the perfectly honest and correct search engine or chatbot will never be realized: people lie to search engines to spam and get traffic they don't deserve. The whole history of search is Google and others dealing with spam. The same goes for email; Google largely defeating spam made them the kings of the email world.
Everyone will need their own personal spam filter for everything once the artificial superintelligences fill the whole world with spam, scams, and plain old social-engineering propaganda, because we will be like helpless four-year-old children in a world of AI superintelligence without our AI parents to look out for us.
Your vision of the god-system determining what is truth is like saying there will be a single source of truth for what is and is not a spam email. Not going to scale and not going to be perfect, but good enough with AI and technology. I really hope there's an opt-out though, since Google has memoryholed most of the Internet.
Google created a money-printing machine. OpenAI is absolutely trying to create one too. They aren't trying to "fix the web". Search is a proxy to get into people's wallets. OpenAI might get into them directly via buying agents. The problem of misaligned incentives in the search space isn't a technological problem. A new technology might solve it as a side effect, but that is unlikely IMO.
Your points are good but I wonder if you’re wishing for an ideal that has never existed:
> actually surfacing the content people want to see, not what intermediaries want them to see
It requires two assumptions: 1) the content people want to see actually exists, and 2) people know what it is they want to see. Most content is only created in the first place because somebody wants another person to see it, and people need to be exposed to a range of content before having an idea about what else they might want to see. Most of the time, what people want to see is… what other people are seeing. Look at music, for example.
Great find on the Knowledge Navigator; I had never seen it, but I was a toddler when it was released haha.
It's interesting how prescient it was, but I'm more struck wondering--would anyone in 1987 have predicted it would take 40+ years to achieve this? Obviously this was speculative at the time but I know history is rife with examples of AI experts since the 60s proclaiming AGI was only a few years away
Is this time really different? There's certainly been a huge jump in capabilities in just a few years but given the long history of overoptimistic predictions I'm not confident
> It's interesting how prescient it was, but I'm more struck wondering--would anyone in 1987 have predicted it would take 40+ years to achieve this? Obviously this was speculative at the time but I know history is rife with examples of AI experts since the 60s proclaiming AGI was only a few years away
40+ makes it sound like you think it will ever be achieved. I'm not convinced.
Thinking about incentive alignment, non ad-based search would be better than ad-based, but there'd still be misalignment due to the problem of self-promotion. Consider Twitter for example. Writing viral tweets isn't about making money (at least until recently), but the content is even worse than SEO spam. There is also the other side of the problem that our monkey brains don't want content that's good for us in the long run. I would _love_ to see (or make) progress in solving this, but this problem is really hard. I thought about it a lot, and can't see an angle of attack.
Don’t you see it coming? Contrary to Google Search, the engine knows you and will put personalized ads in synthesized answers. It can even generate a small video ad of a product or service, tailored to you on the spot.
That is the future. This is the endgame. It’s what made Google rich; why not use the same formula on steroids?
As someone who has been using Google for search basically constantly since 1995, I've switched probably 90%+ of what normally would have been Google searches over to Perplexity (which gives me references to web pages alongside answers to my questions, so I can review source materials) and ChatGPT (for answers I can verify without a source). The remaining searches have gone to Kagi.
On the one hand this has got to be hurting Google search ad revenue. On the other hand I don't know if I ever clicked an advertised link. On the other other hand, not having to wade through SEO results has been so nice.
I tend to be more pessimistic about the incentives getting fixed than others are. I also think the situation is more complex than some of the people replying to you.
(1) Search is already heavily AI driven, and Google is clearly going in that direction. Gemini is currently separate, but they'll blend it in with search over time, and no doubt search already uses LLM for some tasks under the hood. So ChatGPT search is an evolution on the current model rather than a big step in a new direction. The main benefit is you can ask the search questions to refine or followup.
(2) Aside from the economic incentives faced by search engines, there is the fact that algorithms are tuned toward a central tendency. The central tendency of question askers will always be much less informed than the most informed extreme. Google was much better when the average user was technical. The need to capture mobile searches is one force that made it return on average worse results. Similarly if Kagi has a quality advantage now, we need to be realistic about how much of that quality is driven by its users being more technical.
(3) I think micropayment schemes have generally asked several orders of magnitude more for a page view than users are willing to pay. As long as content creators value their content much more highly than consumers do, they'll stick with advertising which lets them overcharge and gives consumers less of an option to say no to the content.
> Combining two broken systems - compromised search engines and unreliable LLMs - seems unlikely to yield that vision
Counterpoint: with a chain-of-thought process running atop search, you can potentially avoid much of the meta-search / epistemic hygiene work currently required. If your “search” verb actually reads the top-100 results, runs analyses for a suite of cognitive biases such as partisanship, and gives you error bars / warnings on claims that are uncertain, the quality could be dramatically improved.
There are already custom retrieval/evaluation systems doing this, it’s only a matter of a year or two before it’s commoditized.
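The shape of such a system might look like this (a sketch under my own assumptions; `fetch_top_results`, `extract_claims`, and `score_claim` are hypothetical helpers, and real systems would implement each very differently):

    from dataclasses import dataclass

    @dataclass
    class ScoredClaim:
        text: str
        support: float   # fraction of sources agreeing, 0..1
        warnings: list

    def searched_answer(query, fetch_top_results, extract_claims, score_claim, n=100):
        pages = fetch_top_results(query, n)   # read deep, not just page one
        claims = extract_claims(pages)        # cluster assertions across sources
        scored = []
        for claim in claims:
            support, bias_flags = score_claim(claim, pages)  # e.g. partisanship checks
            warnings = list(bias_flags)
            if support < 0.6:
                warnings.append(f"contested: only {support:.0%} of sources agree")
            scored.append(ScoredClaim(claim, support, warnings))
        return scored  # the UI renders warnings/error bars, not one confident answer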
The concern is around OpenAI monetization, do they eventually start offering paid ads? This could be fine if the unpaid results are great, a big part of why the web is perceived to be declining is link-spam that Google doesn’t count as an ad.
My prediction would be that there is a more subtle monetization channel; companies that can afford to RAG their products well and share those indexes with AI search providers will get better results. RAG APIs will be the new SEO.
This product-vision stupidity is present in every one of Google's products. Maps has a feed for some reason. Searching it for what I want is horrendous. There is no coherence between the time (now) and what a person can do around the current location.
Navigation is the only thing that works, but Waze was way better at that, and the only reason they killed it (cough, bought it) was to get eyeballs on the feed.
The most rational incentive would seem to be a kind of toll extracted on the value of all you create where search contributed to that creation. So like every month, you get an invoice from the great search company in the cloud where somehow it’s objectively assessed how much search contributed to the value you created for the world that month - and then you pay that amount and all remains right with the world until the end of time.
In that way, search captures value directly on its core function: efficiently facilitating the creation and dispersal of knowledge.
This may end up occurring, but based on how unlikely it sounds, it’s my reflection that the web search that we have today is simply a small component of what will eventually become a system that closely approximates the above: basically a kind of global intelligent interconnected agent for the advancement of humanity.
Rather than the great unbundling, this will be the great rebundling … of many diverse functions into a single interface seamlessly.
I did a quick test "Best air conditioner by value", comparing Bing and ChatGPT search, and the rankings on the first two pages of Bing vs sources on ChatGPT were about half the same. It's interesting that there is some deviation where ChatGPT isn't blindly trusting the rankings Bing provides and is going deeper into the results to find something more relevant. It's a good improvement over products like Arc Search.
Seeing search enter the space is something that I feel has been seriously needed, as I've slowly replaced Google with ChatGPT. I want quick, terse answers sometimes, not a long conversation, and have hundreds of tiny chats. But there's something scary seeing the results laid out the way they are, which leads me to believe they may be closer to experimenting with ad-based business models, which I could never go back to.
I use the “web search” bot on Poe.com for general questions these days, that I previously would have typed into Google (Google’s AI results are sometimes helpful though). It is better than GPT (haven’t tried TFA yet though), because it actually cites websites that it gets answers from, so you can have those and also verify that you aren’t getting a hallucination.
Besides Poe's Web Search, the other search engine I use, for news but also for points of view and deep-dive blog-type content, is Twitter. Believe it or not. Google search is so compromised today with censorship (of all kinds, not just the politically motivated), not to mention Twitter is just more timely, that you miss HUGE parts of the internet - and the world - if you rely on Google for your news or these other things.
The only time I prefer google is when I need to find a pointer/link I already know exists or should exist, or to search reddit or HN.
I think to solve this problem we need to take a step back and see how a human would solve the problem (if he had all the necessary information).
One thing is clear, in the vast majority of cases we don't have a single truth, but answers of different levels of trustworthiness: the law, the government, books, Wikipedia, websites, and so on.
A human would proceed differently depending on the context and the source of the information. For example, legal questions are best answered by the law or government agencies, then by reputable law firms. Opinions of even highly respected legal professionals are clearly less reliable than government law itself and are likely to be a point of contention/litigation.
Questions about other facts, such as the diameter of the earth or the population of a country, are best answered by recent data from books, official statistics and Wikipedia. And so on and so forth.
If we are not sure what the correct answer is, a human would give more information about the context, the sources, and the doubts. There are obviously questions that cannot be answered immediately! (If it were too easy to find the truth, we would not need a legal system or even science!) So no machine and no amount of computation can reliably answer all questions. Web search does not answer a question. It just tries to surface websites relevant to a bunch of keywords. The answering part is left as an exercise for the user ;)
So an AI search with a pretense of understanding human language makes the task incredibly harder. To really give a human-quality answer, the AI not only needs to understand the context, but it should also be able to reason, have common sense, and be a bit self-aware (I'm not sure...). All this is beyond the capabilities of the current generation of AI. Therefore, my conclusion is that "search", or better said, "answering", cannot be solved by LLMs, no matter how much they fine-tune and tweak them.
But we humans are adaptable. We will find our way around and accept things as they are. Until next time.
Interesting, I hadn’t seen the Knowledge Navigator before. I would argue that we’re very close to the capabilities shown in that video.
Isn’t this already that? A new business model? Something like OpenAI’s search or Perplexity can run on its own index and not be influenced by Google’s ranking, ads, etc.
In areas where there is a simple objective truth, like finding the offset for the wheels on a 2008 BMW M3, we have had this capability for some time with Perplexity. The LLM successfully cuts through the sea of SEO/SEM and forum nonsense and delivers the answer.
In areas where the truth is more subjective, like what is the best biscuit restaurant in downtown Nashville, the system could easily learn your preferences and deliver info suited to your biases.
In areas where “the science” is debated, the LLM can show both sides.
> The path forward requires solving the core challenge: actually surfacing the content people want to see, not what intermediaries want them to see
But this will never happen with mainstream search imo. It is not a technical problem but a human one. As long as there is a human in control of what gets surfaced, it is only a matter of time until you revert to tampered search. Humans are not robots. They have emotions and can be swayed with or without their awareness. And this is a form of power for the swayer as much as oil or water are.
The idea that you can have an AI system provide factual and reliable answers to human centric questions is as real as Star Trek itself.
You will never remove the human factor from AI
Your hope might be that a technical solution is found for a human problem but that is unlikely.
Regarding incentives - with Perplexity, ChatGPT search et al. skinning web content - where does it leave the incentive to publish good, original web content?
The only incentivised publishing today is in social media silos, where it is primarily engagement bait. It's the new SEO.
I was thinking about this myself, so I went to another search engine (Bing) which I never use otherwise, and jumped right into their "Copilot" search via the top navbar.
Man, it was pretty incredible!
I asked a lot of questions about myself (whom I know best, of course), and first of all, it answered all my queries super quickly, letting me drill in further. After reading through its brief, on-point answers and the sources it provided, I'm just shocked at how well it worked, while giving me the feeling that yes, it can potentially, fundamentally, change things. There are problems to solve here, but to me it seems that if this is where we're at today, then in the future it has the potential to change things to some extent for sure!
I ask myself why we accepted pre-2010s answers; maybe because the media institutions had accumulated enough trust? I feel this is unavoidable: nothing is true unless there's a threshold of quality/verification in place across the system.
The media was lying all the time. It didn't start lying just after the advent of social media. Just look at sitcoms like Yes Minister. The reason we now consider media to lie is because their lies are exposed, documented, tracked, and refreshed to remind users not to trust them blindly.
I'm not sure this is any different than any time before.
Before we had the internet, how did you answer questions? You either looked it up in a book, where you then had to make a judgment on whether you trusted the book, or you asked a trusted person, who could give you confidently wrong answers (your parents weren't right every time, were they? :) ).
I think the main difference here is that now anyone can publish, whereas before, making a book exist required the buy-in of multiple people (of course, there were always leaflets).
The main difference now is distribution. But you still, as a consumer of information, have to vet your sources.
I couldn't agree more. Neither LLMs nor search engines yield reliable answers, due to the ML model and misaligned incentives respectively. Combining the two is a shortsighted solution that doesn't fix anything on a fundamental level. On top of that, we have the feedback loop: LLMs are used to search the web and to write the web, so they'll just end up reading their own (unreliable) output over time.
What we need is an easier way to verify sources and their trustworthiness. I don't want an answer according to SEO spam. I want to form my own opinion based on a range of trustworthy sources or opinions of people I trust.
I saw products in my search today and came back to my main concern about how AI is going to get paid for -> varying levels of advertisement, product placement, etc.
We are currently in the growth phase of VC-funded products where everything is almost free or highly subsidized (save chat subs) - I am not looking forward to when quality drops and revenue becomes the driving function.
We all have to pay for these models somehow - either VCs lose their equity stakes and it goes to zero (some will), or ads will fill in where subs don't. Political ads in AI are going to wreak havoc on or undermine the remainder of the product.
I think the 'Age of PageRank' could be revived. A few years ago I had this idea about a decentralized, distributed, independent, public, universal, dynamic, and searchable directory of websites; I wrote a toy specification document for it [1] - which I had forgotten until reading this discussion - and if I'm not mistaken, implementing it could be a few days' project for a regular dev.
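As a rough illustration of the idea (my own guess at the record shape, not what the linked spec [1] actually says): each site publishes a small signed, content-addressed entry that peers replicate and index.

    import hashlib, json
    from dataclasses import dataclass, asdict

    @dataclass
    class DirectoryEntry:
        url: str
        title: str
        topics: list    # self-declared, peer-auditable categories
        updated: str    # ISO date; keeps the directory "dynamic"
        pubkey: str     # lets replicas verify the site's signature

        def digest(self):
            # Content-addressing lets any peer verify an entry it replicated.
            blob = json.dumps(asdict(self), sort_keys=True).encode()
            return hashlib.sha256(blob).hexdigest()

    entry = DirectoryEntry(url="https://example.org", title="Example",
                           topics=["search", "tools"], updated="2024-11-01",
                           pubkey="...")
    print(entry.digest())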
I do love how all the comments on the top comment of every HN post just consist of deeper and deeper levels of inane bickering ahaha. It's like a template.
I am excited for a future where I search for some info and don't end up sifting through ads and getting distracted by some tangential clickbait article.
Fundamentally it feels like that can't happen, though, because there is no money in it, but a reality where my phone is an all-knowing voice I can reliably get info from, instead of a distraction machine, would be awesome.
I do "no screen" days sometimes and tried to do one using chatGPT voice mode so I could look things up without staring at a screen. It was miles from replacing search, but I would adopt it in a second if it could.
Even with perfect knowledge right now, there’s no guarantee that knowledge will remain relevant when it reaches another person at the fastest speed knowledge is able to travel. A reasonable answer on one side of the universe could be seen as nonsensical on the other side - for instance, the belief that we might one day populate a planet which no longer exists.
As soon as you leave the local reference frame (the area in a system from which observable events can realistically be considered happening “right now”), fact checking is indeed required.
The whole basis for information and data is totally corrupted, and we are not even allowed, i.e., enabled, to talk about that, let alone the corrupted nature of the information and data, largely because entrenched tyrannical people and their mindsets are in total control of the mainstream and have even made significant headway in shattering the dissident opposition. The consumption-side incentives are effectively irrelevant when what is consumed is basically worthless. What's the value of lies?
I was thinking about the direction we are going and even wanted to write up a blog post about it. IMO the best way forward would be if AI could have some logical thoughts independent of human biases, but that can only happen if AI can reason, unlike our current LLMs, which just regurgitate historical data.
> The path forward requires solving the core challenge: actually surfacing the content people want to see, not what intermediaries want them to see
This is flawed thinking if the goal is "reliable answers". What people want to see and the truth do not necessarily overlap.
Consider the answers for something like "how many animals were on Noah's Ark" and "did Jesus turn water into wine" for examples that cannot be solved by trying to get advertisers out of the loop.
Yes the incentives will need to change. I think it’s also going to be a bigger question than just software. What do we do in general about those that control capital or distribution channels and end up rent seeking? How do others thrive in that environment?
In the short term, I wonder what happens to a lot of the other startups in the AI search space - companies like Perplexity or Glean, for example.
You're absolutely right: today's search engines and language models are limited by misaligned incentives. Ad-driven search engines prioritize content that is optimized for algorithms and ad revenue, often at the expense of depth and quality, while language models inherit these biases, which compromises their reliability.
I suspect the future old man thing I will do is use a search engine when my kids ask plain English questions to a bot. I’ll tell them “but LLMs hallucinate, I don’t trust those things!” and they’ll just laugh off my old man ways. The irony will be that 95% search results are AI generated anyway by that point.
Chat isn't needed to provide reliable answers. Google used to do this over a decade ago. What Star Trek didn't foresee was vested interests in the SEO space, governments, political special interest groups, and the owners of the search engines themselves had far too much incentive to bork and bias results in their favor. Google is an utter shit show. More than half the time it won't find the most basic search query for me. Anything older than a couple years, good luck. I'm sure it's just decision after decision that piled up, each seemingly minor in isolation but over the years has made these engines nearly worthless except for a particular window of cases.
But it will radically increase the energy required to deliver search results, which means more datacenters and more potential for the monetization of "premium" search services. Getting people to pay for something they currently get for free always looks good on paper.
Had an experience yesterday where a simple MDN doc existed but ChatGPT gave the reverse of reality on that doc. Wasted about an hour of my time, but it taught me these things will hallucinate even the simplest stuff at times. Not sure how we even fix that.
There still has to be some ranking feature for the backend search database to return the top n results to the LLM. So PageRank isn't over; it's just going to move to a supporting role, and probably be modified as the SEO arms race continues.
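A toy sketch of that supporting role: whatever signals the ranker uses decide the only candidates the LLM can ever cite (the term-overlap scoring here is purely illustrative):

    def rank_for_llm(query, corpus, n=10):
        """Toy retrieval ranker: the LLM only ever sees what this returns."""
        q_terms = set(query.lower().split())

        def score(doc):
            d_terms = set(doc.lower().split())
            return len(q_terms & d_terms) / (len(d_terms) ** 0.5 or 1.0)

        # Swap in PageRank, freshness, spam penalties, etc. here - this is the
        # spot the SEO arms race moves to once answers come from an LLM.
        return sorted(corpus, key=score, reverse=True)[:n]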
Will ChatGPT (and other products like it) find a niche use case for what Google search covers? Yes.
Will it replace Google in the mass market? No. Why? Power. I don't mean how good the product is. I mean literal electricity.
There are key metrics that Google doesn't disclose as part of its financials. These include things like RPM (Revenue per Thousand Searches), but they must also include something like the cost of running a thousand searches when you amortize everything involved - all the indexing, the software development, and so on. That will get reduced to a certain amount of CPU time and storage.
If I had to guess, I would guess that ChatGPT uses orders of magnitude more CPU power and electricity than the average Google search.
Imagine trying to serve 10-50M+ (just guessing) ChatGPT searches every second. What kind of computing infrastructure would that take? How much would it cost? How would you monetize it?
Another way to put it is simply that ChatGPT search is built on top of existing search engines. The best case scenario is that it cherry picks the best from all available search engines. It can’t totally supersede all search engines.
The computer in Star Trek was not owned by a for-profit company. It has different incentives. Also, I think its knowledge base is curated (a kind of Starfleet library, with no random entries from an intergalactic internet).
In time, people will learn that using a GPT for search, especially for a question that involves research where you used to have to do MANY searches for the answer, now provides it via one query.
Making things quicker and easier always wins in tech and in life.
The biggest question is, can this bring back the behaviour of search engines from long ago? It's significantly more difficult to find old posts, blogs, or forums with relevant information compared to 10-15 years ago.
Star Trek is the vision of a better world we all need, in so many aspects. It's crazy nowadays, because we seem to have gotten further away from those visions instead of closer.
At least fewer people are hungry.
I think the promise of search was to be able to do confidence-ranked "grep" on the internet. Unfortunately we departed from this, but it's what I desperately want.
What if I am looking for a medical page or a technical page or whatever else where I need to see, read, experience actual content and not some AI summary?
We have a faulty information network to begin with, and have for millennia. There's no such thing as "reliable" answers in a world full of unreliable humans.
This will probably be extremely radical and controversial in this contemporary world.
We need to stop adopting this subscription-model-society mentality and retake _our_ internet. Internet culture was at one point about sharing and creating, simply for the sake of it. We tinkered and created in our free time, because we liked it and wanted to share with the world. There was something novel to this.
We are hackers, we only care about learning and exploring. If you want to fix a broken system, look to the generations of old, they didn't create and share simply to make money, they did it because they loved the idea of a open and free information super highway, a place where we could share thoughts, ideas and information at the touch of a few keystrokes.
We _have_ to hold on to this ethos, or we will lose whatever little is left of this idea.
I see things like Kagi and am instantly met with some new service, locked behind a paywall, promising lush green fields of bliss. This is part of the problem.
(not saying kagi is a bad service)
I see a normalized stigma around people who value privacy, who as a result are being locked out, behind the excuse of "mAliCiOuS" activity.
I see monstrous giants getting away with undermining net neutrality and well established protocols for their own benefit.
I implore you all, young and old: (re)connect to the hacker ethos, and fight for a free and open internet. Make your very existence an act of rebellion.
GPT + search were married long ago; see, for example, Phind. The problem that recently emerged is that it tends to find long articles written by GPT that lack any useful information.
I hope we see more evolution of options before it does. Hard to articulate this without it becoming political, but I've seen countless examples both personally and from others of ChatGPT refusing to give answers not in keeping with what I'd term "shitlib ethics". People seem unwilling to accept that a system that talks like a person may surface things they don't like. Unless and until an LLM will return results from both Mother Jones and Stormfront, I'm not especially interested in using one in lieu of a search engine.
To put this differently, I'm not any more interested in seeing stormfront articles from an LLM than I am from google, but I trust neither to make a value judgement about which is "good" versus "bad" information. And sometimes I want to read an opinion, sometimes I want to find some obscure forum post on a topic rather than the robot telling me no "reliable sources" are available.
Basically I want a model that is aligned to do exactly what I say, no more and no less, just like a computer should. Not a model that's aligned to the "values" of some random SV tech bro. Palmer Luckey had a take on the ethics of defense companies a while back. He noted that SV CEOs should not be the ones indirectly deciding US foreign policy by doing or not doing business. I think similar logic applies here: those same SV CEOs should not be deciding what information is and is not acceptable. Google was bad enough in this respect - c.f. suppressing Trump on Rogan recently - but OpenAI could be much worse in this respect because the abstraction between information and consumer is much more significant.
> Basically I want a model that is aligned to do exactly what I say
This is a bit like asking for news that’s not biased.
A model has to make choices (or however one might want to describe that without anthropomorphizing the big pile of statistics) to produce a response. For many of these, there’s no such thing as a “correct” choice. You can do a completely random choice, but the results from that tend not to be great. That’s where RLHF comes in, for example: train the model so that its choices are aligned with certain user expectations, societal norms, etc.
The closest thing you could get to what you’re asking for is a model that’s trained with your particular biases - basically, you’d be the H in RLHF.
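To make that concrete: a minimal sketch of the reward-model step in PyTorch, using the standard pairwise (Bradley-Terry) preference loss. `reward_model`, `chosen` and `rejected` are placeholders for a scoring network and batches of encoded responses, not anyone's actual training code:

    import torch.nn.functional as F

    def preference_loss(reward_model, chosen, rejected):
        # Push the score of the response the human preferred above the score
        # of the one they rejected: loss = -log sigmoid(r_chosen - r_rejected).
        r_chosen = reward_model(chosen)      # scalar score per example
        r_rejected = reward_model(rejected)  # scalar score per example
        return -F.logsigmoid(r_chosen - r_rejected).mean()

Every gradient step encodes somebody's judgment of which response was "better" - which is exactly why there's no bias-free version of this.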
I buy that there's bias here, but I'm not sure how much of it is activist bias. To take your example, if a typical user searches for "is ___ a Nazi", seeing Stormfront links above the fold in the results/summary is going to likely bother them more than seeing Mother Jones links. If bothered by perceived promotion of Stormfront, they'll judge the search product and engage less or take their clicks elsewhere, so it behooves the search company to bias towards Mother Jones (assuming a simplified either-or model). This is a similar phenomenon to advertisers blacklisting pornographic content because advertisers' clients don't want their brands tainted by appearing next to things advertisers' clients' clients ethically judge.
That's market-induced bias--which isn't ethically better/worse than activist bias, just qualitatively different.
In the AI/search space, I think activist bias is likely more than zero, but as a product gets more and more popular (and big decisions about how it behaves/where it's sold become less subject to the whims of individual leaders) activist bias shrinks in proportion to market-motivated bias.
Kagi also drinks the Kool-Aid, namely the Knowledge Navigator agent bullshite.
"The search will be personal and contextual and excitingly so!"
---
Brrrr... someone is hell-bent on the extermination of the last aspects of humanity.
Holy crap, this will be the next armageddon, because people will further alienate themselves from other people and create layers upon layers of impenetrable personal bubbles around themselves.
Kagi does the same thing Google does, just in different packaging. And these predictions, bleh: copycats and shills in a nicer package.
We will get there when people move past capitalism and socialism.
Like an ant colony pushing into one direction.
It will happen, but we need a few more global dying events / resets.
I believe human race can get there but not in current form and state of mind.
> In those visions, users simply asked questions and received reliable answers - nobody had to fact-check the answers ever.
I mean, Star Trek is a fictional science-fantasy world so it's natural that tech works without a hitch. It's not clear how we get there from where we are now.
"Yes i'd like to help you with your homework but on a side note i'll recommend you this tablet, science shows people who's done this purchase are much happier people, also remember the financial plutocracy is watching out for you, they are friends of the working people, and the war machine makes the world much safer for everyone so please do go ahead and vote for either of the two pro world peace forever parties, no other ideologies or parties are safe"
> Legacy ad-based search has devolved into a wasteland of misaligned incentives, conflict of interest and content farms optimized for ads and algos instead of humans.
> Path forward requires solving the core challenge: actually surfacing the content people want to see, not what intermediaries want them to see
These traps and patterns are not inevitable. They happen by choice. If you're actively polluting the world with AI generated drivel or SEO garbage, you're working against humanity, and you're sacrificing the gift of knowing right from wrong, abandoning life as a human to live as some insectoid automaton that's mind controlled by "business" pheromones. We are all working together every day to produce the greatest art project in the universe, the most complex society of life known to exist. Our selfish choices will tarnish the painting or create dissonance in the music accordingly.
The problem will be fixed only with culture at an individual level, especially as technology enables individuals to make more of an impact. It starts with voting against Trump next week, rejecting the biggest undue handout to a failed grifter who has no respect for law, order, or anyone other than himself.
It is so absurd having to spend so much energy on "what happened on the football match yesterday" just because the internet is a wasteland full of ads.
This is probably Google's AltaVista moment. By making their results crappier by the year in search of ad dollars, they've made everyone feel that search could be better, and once that becomes available they'll be in a continuous game of catch-up.
Yes, Google has their own AI divisions, tons of money and SEO is to blame for part of their crappiness. But they've also _explicitly_ focused on ad-dollars over algorithmic purity if one is to believe the reports of their internal politics and if those are true they have probably lost a ton of people who they'd need right now to turn the ship around quickly.
At some point Google seems to have switched to ML-based search instead of index-based search. Search for a very specific combination of lyrics and scenes, "eyes on me pineapple bucket of water house of cards chess time loop", and you won't surface a link to the music video featuring all of those things (https://www.youtube.com/watch?v=AlzgDVLtU6g); you'll just get really generic results that are the average of your query.
Has Google completely stopped working for anyone else?
I can still search things, and I get results, but they're an ordered list of popular places the engine is directing me to. Some kind of filtering occurs on nearly every search I make that makes the results feel entirely useless.
Image search stopped working some time ago; now it just runs an AI filter on whatever image you search for, tells you there's a man in the picture and gives up.
YouTube recommendations are always hundreds of videos I've watched already, with maybe 1-2 recommendations of new channels, when I know there are millions of content creators out there struggling whom it will never introduce me to. What happened to the rabbit holes of crazy YouTube stuff you could go down?
This product is a shell of its old self, why did it stop working?
FWIW, ChatGPT Search didn't surface the video with that query either:
> Based on the elements you’ve described—eyes, a pineapple, a bucket of water, a house of cards, chess, and a time loop—it’s challenging to identify a single music video that encompasses all these features.
I think this is a separate issue although it also exists.
What the parent is referring to is favoring annoying ad-filled garbage over an equally relevant but straightforward result.
The hidden variable is that ad-riddled spam sites also invest in SEO, which is why they rank higher. I am not aware of any evidence that Google is using number of Google ads as a ranking factor directly. But I would push back and say that “SEO” is something Google should be doing, not websites, and a properly optimized search engine would be penalizing obvious garbage.
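To sketch what "penalizing obvious garbage" could look like inside a ranker (all names here are hypothetical, not anything Google documents): a re-ranker that subtracts a weighted spam probability from the base relevance score.

    from dataclasses import dataclass

    @dataclass
    class Result:
        url: str
        text: str
        relevance: float  # query-document score from the base ranker

    def rerank(results, spam_score, penalty_weight=2.0):
        # Demote likely garbage: final score = relevance - w * P(spam).
        # `spam_score` is any classifier returning a probability in [0, 1].
        return sorted(results,
                      key=lambda r: r.relevance - penalty_weight * spam_score(r.text),
                      reverse=True)

The hard part isn't this arithmetic, of course; it's building a spam classifier that SEO farms can't reverse-engineer and game.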
Yes and no. Yes for the quality of search results: Google's algorithm and user experience were simply better than AltaVista's. But Google had another advantage: it used a cluster of cheap consumer-grade hardware (PCs) as its backend, rather than the expensive servers AltaVista used. In fact, AltaVista started off as a way for DEC to show off its hardware.
As a result Google was not only better than its competitors at searching, it was also cheaper and scaled better, and this architecture became the new standard.
It is the opposite for these AI-based systems. AI is expensive, it uses a lot of energy, and the hardware is definitely not consumer-grade.
What it means is that barring a significant breakthrough, cheap ChatGPT-like services are simply unsustainable. Google doesn't have to do anything; it will collapse on its own. Ironically, probably in the same way that Google's results became crappier by the year, but on fast forward.
> What it means is that barring a significant breakthrough, cheap ChatGPT-like services are simply unsustainable.
This is the basic premise of Ed Zitron's article https://www.wheresyoured.at/to-serve-altman/ and others written around the same time. A lot of what he's written seems to be coming to pass, e.g. unfathomably large investments (~$100 billion) by tech giants in OpenAI and other AI startups to keep them going. That does seem unsustainable.
Can anyone make a counterpoint to this claim? I'd be very interested to hear what a viable path to profitability looks like for OpenAI.
In essence there is no “objective” algorithm. It’s a mish mash of random black box patterns that we hope delivers the right answer. It probably will most of the time but it will be gamed hard and it’s going to be hard to undo changes in gamified models.
You’re also giving Google feedback when you click a search result link. Presumably that should be a huge signal for measuring search quality.
Heck, Google even promoted the `ping`[0] anchor attribute feature so they can log what link you click without slowing you down. (Firefox doesn’t support ping, which means when Firefox users click on a Google search result link they’re sent to an internal google.com URL first and then redirected for logging purposes)
I have no doubt that Google’s search team is optimizing for the best results. The problem is their ads team is optimizing for revenue. You can’t optimize for two things at the same time without compromising (the optimum is the Pareto frontier).
Is it optimising for all users? And that assumes people thumbs-up the correct info. I wonder what accuracy percentage we are looking at there. ChatGPT's responses are so confident when wrong that I fear people will just give it a thumbs up when it's wrong. (This is how I understand the feature you mention works.)
One thing I will always admire Apple for is being willing to essentially kill the iPod with the iPhone in 2007. No other company I've seen has been willing to kill their cash cow so thoroughly and so profitably.
Google is entirely to blame. It would be trivial for them to train a model to rank sites on a scale of SEO garbage to nobel laureate essay, then filter out the bottom 50%.
To be honest, this looks more like OpenAI playing catchup to what Google’s done since 2023. The fact is that ChatGPT, which people thought would kill Search, is falling behind Google in some ways.
I think it’s great that competition is driving both companies to improve, but I’m not seeing anything about this that screams “Google-killer”
> But they've also _explicitly_ focused on ad-dollars over algorithmic purity
I think they prioritized large and fast incomes over long, steady incomes.
Also, there isn't just money that taints Google search results. It's also censorship and political motivation, which is mostly obvious in image search.
There's this tech pattern of letting the cash-cow stagnate and deteriorate while focusing on high risk moonshots.
This especially happens after they dominate the market.
Take for example IE6, Intel, Facebook, IBM, and now Google.
They have everything they need to keep things from going off the rails, management however has a tendency to delusionally assume their ship is so unsinkable that they're not even manning their stations.
It becomes Clayton Christensenesque - they're dismissive of the competition as not real threats and don't realize their cash-cow is running on only fumes and inertia until it's too late.
I think it's the natural consequence of handing over management to MBA types. They can't create new things. I don't mean that they should be programmers, engineers or whatever they're working with; I mean that they can only extract value from something that already exists. Which is always going to be a short-term strategy, because the only way to do that is to make "the it" worse.
I’m not sure Facebook fits in considering they at least managed to get some other products along the way, and may get more.
I certainly don’t think Google fits the bill. Google is failing because they let their cash cow ruin everything else, not because they let it stagnate while they chased the next moonshot. Google Cloud could have easily been competitive with AWS and Azure in European Enterprise, but it’s not even considered an option because Google Advertising wouldn’t let it exist without data harvesting. Google had Office365 long before Microsoft took every organisation online. But Google failed to sell it because… well…
It’s very typical MBA though. Google has killed profitable products because those products weren’t growing enough. A silly metric really, but one which isn’t surprising when your CEO is a former McKinsey.
It couldn’t happen to a nicer company though, and at least it won’t kill people unlike Boeing.
> There's this tech pattern of letting the cash-cow stagnate and deteriorate while focusing on high risk moonshots.
It is interesting to go back a decade and read the articles about Google's moonshots from then. There was even more "we're building the future!" hype than what Elon Musk or Sam Altman are currently pushing.
More competition is good, but I’m pretty sure the end result will just be, at best, a duopoly of products equally ruined by advertising for their “free” tier, and somewhat less ruined for the “premium” tier.
I'm wondering if they've seen the writing on the wall for a long time, and if diversification is going to save the day. I don't know that I've ever clicked on an ad in Google search results, but I happily pay them for a family account for Youtube and 2TB of drive storage, plus, you know, a flagship phone every couple years. :-)
All businesses seek to diversify to reduce risk. That doesn't mean they have any reason to believe a cataclysmic event for search is more likely now vs before. ChatGPT-to-search feels like a similar situation to macOS-to-Windows: it may take market share, but the ecosystem around Windows keeps it by far in the lead. Also, Microsoft diversifying away from Windows meant that market share loss didn't really matter all that much.
Google, even with all the trillions in additional productivity it has added to the world, has left the world at a net-negative. We can't even quantify it, and every person will tell you a different way in which it has impacted the world negatively.
E.g. for me, how much Google (and silicon valley in general) have enabled twisted ideologies to flourish. All in search of ad-dollars by virtue of eyeballs on screens, at the detriment of everything.
Anyone have some good links/citations on the consumer surplus of Google? How much value do we get as users?
Considering the value of time, past consumer surplus is especially valuable now.
Sure, there are systematic flaws causing SEO to ruin the information provided: but it isn't clear what Google can do to fight the emergent system.
I'm not sure that Bing/DDG are any better.
I use search (DDG web, Google/Apple maps, YouTube) all the time and I am regularly given results that are extremely valuable to me (and mostly only cost me a small amount of my time, e.g. YouTube adverts). Blaming SEO on Google seems thoughtless to me. Google appears to be as much a victim of human cybersystems as we are.
I'm seeing the trillions in additional productivity argument more often here, especially when trying to defend their monopolistic tendencies. What I've never seen are raw numbers backing this.
I was just wondering what has come out of silicon valley since say 2003 that has been a net positive for humanity. Just because something is profitable doesn't mean it's progress.
I know this is a predominantly white, upper middle class site (read: progressive), and people from that demo sure love to point out that "you can't say the leg is broken just because the foot is dangling at 90 degrees, you aren't a doctor!" ..... but at some point you surely have to start to trust your own eyes and use critical thinking, right??
It's not a conspiracy theory Google ruined their search to serve more ads. It's so obvious and happening for such a long period of time, anyone above certain age and with decent memory can see it. You don't have to "believe the reports", you don't even have to be smart, just use your eyes and your memory.
If the current iteration of search engines are producing garbage results (due to an influx of garbage + SEO gaming their ranking systems) and LLMs are producing inaccurate results without any clear method proposed to correct them, why would combining the two systems not also produce garbage?
The problem I see with search is that the input is deeply hostile to what the consumers of search want. If the LLMs are particularly tuned to try and filter out that hostility, maybe I can see this going somewhere, but I suspect that just starts another arms race that the garbage producers are likely to win.
Search engines tend to produce neutral garbage, not harmful garbage (i.e. small tidbits of data between an ocean of SEO fluff, rather than completely incorrect facts). LLMs tend to be inaccurate because in an absence of knowledge given by the user, it will sometimes make up knowledge. It's plausible to imagine that they will cover each other's weaknesses: the search engine produces an ocean of mostly-useless data, and the LLM can find the small amount of useful data and interpret that into an answer to your question.
The problem I see with this "cover for each other" theory is that as it stands having a good search engine is a prerequisite to having good outputs from RAG. If your search engine doesn't turn up something useful in the top 10 (which most search engines currently don't for many types of queries) then your llm will just be summarizing the garbage that was turned up.
Currently I do find that Perplexity works substantially better than Google for finding what I need, but it remains to be seen whether they're able to stay useful as a larger and larger portion of online content is just AI-generated garbage.
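For reference, the loop being debated here is basically retrieval-augmented generation. A minimal sketch, where `web_search` and `llm` are hypothetical stand-ins for the two halves; if `web_search` returns only garbage, the model can only summarize garbage:

    def answer(question, web_search, llm, k=10):
        # Retrieve top-k results, then have the model answer strictly
        # from them, citing sources so claims can be checked.
        hits = web_search(question)[:k]
        context = "\n\n".join(f"[{i}] {h.title}: {h.snippet}"
                              for i, h in enumerate(hits))
        prompt = (f"Answer using ONLY the sources below, citing them like [0].\n"
                  f"If the sources don't contain the answer, say so.\n\n"
                  f"Sources:\n{context}\n\nQuestion: {question}")
        return llm(prompt)

The retrieval step is doing all the quality control here, which is exactly the point about it being a prerequisite.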
> Search engines tend to produce neutral garbage, not harmful garbage (i.e. small tidbits of data between an ocean of SEO fluff, rather than completely incorrect facts)
Wasn't Google AI surfacing results about making pizza with glue and eating rocks? How is that not harmful garbage?
Garbage-ness of search results is not binary, the right question is: can LLMs improve the quality of search results? But sure, it won't end the cat and mouse game.
I think that's the right broad question. Though LLMs properties mean that for some number of cases they will either make the results worse, or more confidently present wrong answers. This prompts the question: what do we mean by "quality" of results? Since the way current LLM interfaces tend to present results is quite different from traditional search.
The question is what is the business model and who pays for it, that determines how much advertising you’re getting. It is not clear if OpenAI could compete in Ad-supported search. So maybe OpenAI is trying to do the basic research, outcompete the Bing research group at Microsoft and then serve as an engine for Bing. Alternatively they could be just improving the ability of LLMs to do search, targeting future uses in agentic applications.
There is no way to SEO the entire corpus of human knowledge. ChatGPT is very good for gleaning facts that are hard to surface in today's garbage search engines.
If I can pretty quickly tell a site is SEO spam, so should the LLM, no? Of course that would just start a new round in the SEO arms race, but could work for a while.
I’d be more cynical still and ask, where is correct information found in the first place? Humans of all shape and size have biases. Most research is faulty, fabricated, or not reproducible. Missing information tells a greater story than existing one.
We don’t have a way of finding objective information, why would we be able to train a model to do so?
Right now I basically can't find anything, the bar isn't "objective information" but "somewhat useful information". Google search quality became so bad we're past the debate of objective or subjective already, I'd be happy enough to get non-spam results.
Every time I see one of these topics, I go ask chat GPT a question to which I know the answer on a topic where I would like to be able to get useful answers to similar questions that I do not know the answer to.
This time it was, "Did Paul Edwin Zimmer write a fourth Dark Border novel?" (Real answer: Yes, Ingulf the Mad. You can find the answer on his Wikipedia page.[1])
ChatGPT's[2] answer: "Yes, Paul Edwin Zimmer wrote a fourth novel in the Dark Border series titled "The Dark Border." This book was published after the original trilogy, which included "The Dark Border," "The Gilded Age," and "The Silver Sphere." If you're interested in the themes or plot, let me know!" (Note: these are not the titles of the 2nd and 3rd novels in the series. Also, it gave me the same name for the putative 1st and 4th books.)
ChatGPT 4o and 4o-mini with Search were able to answer this question without any issues. Maybe you didn't enable the search functionality?
---------------
4o:
Yes, Paul Edwin Zimmer wrote a fourth novel in his Dark Border series titled Ingulf the Mad, published in 1989. This installment focuses on the characters Ingulf Mac Fingold and Carrol Mac Lir, detailing their meeting and acquisition of mystical swords. Notably, Istvan Divega, the protagonist of the earlier books, does not appear in this novel.
---------------
4o-mini:
Yes, Paul Edwin Zimmer authored a fourth novel in his Dark Border series titled Ingulf the Mad. Published in 1989, this book shifts focus from the previous protagonist, Istvan DiVega, to explore the backstory of Ingulf Mac Fingold and Carrol Mac Lir, detailing their initial meeting and the acquisition of their mystical swords.
The complete Dark Border series consists of four novels:
1. The Lost Prince (1982)
2. King Chondos' Ride (1983)
3. A Gathering of Heroes (1987)
4. Ingulf the Mad (1989)
These works delve into a world where the Hasturs' power is crucial in containing dark creatures, and the narrative unfolds through various characters' perspectives.
Why is it that comments complaining about flaky AI responses never share the chat link, and then invariably someone replies with examples of the AI answering correctly but also fail to share the chat link?
A lot of these responses are so funny. "Um, when _I_ asked this question to ChatGPT it gave the right answer!" Yeah, because it's flaky! It doesn't give the same answer every time! That's the worst possible outcome!
Funny, yes. But... a webforum is a good place for these back-and-forths.
I'm on the other side of this fence to you. I agree that the conclusion here is that "it is flaky." Disagree about what that means.
As LLMs progress, 10% accuracy becomes 50% accuracy. That becomes 80% accuracy and from there to usable accuracy... whatever that is per case. Not every "better than random" seed grows into high accuracy features, but many do. It's never clear where accuracy ceilings are, and high reliability applications may be distant but... sufficient accuracy is not necessarily very high for many applications.
Meanwhile, the "accurate-for-me" fix is usually to use the appropriate model, prompt or such. Well... these are exactly the kind of optimizations that can be implemented in a UI like "LLM search."
I'm expecting "LLMs eat search." They don't have to "solve truth." They just have to be better and faster than search, with fewer ads.
Isn't it even a bit interesting that GP has tried it every time something new has come out but not once gotten the expected answer? Not only that but gets the wrong titles even though Search for everyone else is using the exact Wikipedia link given in the comment as the source?
LLMs are run with variable output, sure, but it's particularly odd if GP used the search product as it doesn't have to provide the facts from the model itself in that case. If GP had posted the link to the actual chat rather than provided a link to chatgpt.com (???) I'd be interested in seeing if Search was even used as that'd at least explain where such variance in output came from. Instead we're all talking about what could have happened or not.
Normally I would agree with you, but for such a fuzzy topic involving search, it would require all the pages and discussions it can find to agree and not be outdated. I don't see why anyone would presume these systems to be omniscient.
Several folks have already mentioned that ChatGPT with search returns the correct answer to your question (with a source to explore directly).
I really think this latest release is a game changer for ChatGPT since it seems much more likely to return genuine information than ChatGPT answering using its model alone. Of course it still hallucinates sometimes (I asked about searching tabs in Firefox Mobile and it told him the wrong place to find that ability while citing a bunch of Mozilla help docs), but it's much easier to verify that by clicking through to sources directly.
It feels like a very different experience using ChatGPT with search turned on and the "Citations" right side bar left open. I get answers from ChatGPT while also seeing a bunch of possibly relevant links populate. If I detect something's off I can click on a source and read the details directly. It's a huge improvement on relying on the model alone.
ChatGPT will not always return the correct answers; that's a fundamental limitation of how it works, since it's non-deterministic. So just saying "it worked for me" means nothing.
My golden hallucination tests for ChatGPT are identifying the Simpsons season and episode based on descriptions of some plot lines or general descriptions.
I had to refresh my knowledge by visiting fandom websites to review the episode it selected as the answer, because ChatGPT's tendency to mix things up and provide entirely made-up episodes makes it hard to use for bar trivia (and makes me doubt myself too). The same goes for other TV series such as House MD and Scrubs.
My gut reaction here is that the hallucination is caused by how you [rightfully] formed the prompt. GPT has no way of reliably determining what the fourth book is, so it infers the answer based on the data provided from Wikipedia. I'll bet if you changed the prompt to "list all books by Paul Edwin Zimmer", it would be incredibly accurate and produce consistent results every time.
I usually seed conversations with several fact-finding prompts before asking the real question I am after. It populates the chat history with the context and pre-established facts to build the real question from a much more refined position.
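A minimal sketch of that seeding pattern with the OpenAI Python client (the model name and prompts are just examples, not a recommendation):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
    history = [{"role": "system",
                "content": "Answer factually. Say 'unknown' when unsure."}]

    def ask(question):
        # Keep every turn in history, so later questions build on the
        # facts already established earlier in the conversation.
        history.append({"role": "user", "content": question})
        reply = client.chat.completions.create(model="gpt-4o", messages=history)
        answer = reply.choices[0].message.content
        history.append({"role": "assistant", "content": answer})
        return answer

    # Fact-finding prompts first, real question last:
    ask("List all novels by Paul Edwin Zimmer with publication years.")
    ask("Which of those are in the Dark Border series?")
    print(ask("Did he write a fourth Dark Border novel?"))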
Here's what ChatGPT 4o mini (free) answered to me to the same question:
No, Paul Edwin Zimmer did not write a fourth novel in the Dark Border series. The trilogy consists of "The Dark Border," "The Dark Border: The Return," and "The Dark Border: The Reckoning." After these, he focused on other projects and did not continue the series.
Yes, Paul Edwin Zimmer wrote a fourth novel in his Dark Border series titled Ingulf the Mad, published in 1989. This installment focuses on the characters Ingulf Mac Fingold and Carrol Mac Lir, detailing their meeting and the acquisition of their mystical swords. Notably, Istvan Divega, the protagonist of the earlier books, does not appear in this novel.
> As of my knowledge cutoff in October 2023, Paul Edwin Zimmer did not publish a fourth novel in the Dark Border series. The series comprises three books:
> 1. The Lost Prince (1982)
> 2. King Chondos' Ride (1982)
> 3. A Gathering of Heroes (1987)
>
> Paul Edwin Zimmer had plans to continue the series, but he passed away in 1997 before any additional novels were completed or published. There have been no posthumous releases of a fourth Dark Border novel. If there have been developments after October 2023, I recommend checking recent publications or official announcements for the most up-to-date information.
Love these HN comments where someone ignores the topic entirely. With search enabled it gives the correct answer and sources Wikipedia correctly; not that you tried it.
If AI produces a correct answer for something you know, there's no guarantee that it produces the correct answer for something you don't know. I'd rather find a way to verify that answer, if that answer has any importance.
It is not just for AI. The same can be said of Wikipedia, if correctness matters, for example if you are writing an article/paper/essay, go for primary sources, not Wikipedia, and certainly not ChatGPT.
This is why I initially held off on using ChatGPT: as a computer scientist, I was focused on testing adversarial strategies and approaching it with a hacker mindset. My takeaway? This isn’t the best way to use LLMs. They’re actually fantastic tools for quickly understanding, and then verifying information. While they can reduce the need for extensive searches, they also help uncover concepts you might not know, especially when you engage in dialogue.
I also search on Google and find "hallucinations": pages with false content that rank better than the real information. In Google the current issue is about ranking and forgetting information that was previously available.
I am using DDG's assist function: "Paul Edwin Zimmer write a fourth Dark Border novel".
DuckAssist Result: The fourth novel in Paul Edwin Zimmer's Dark Border series is titled "Ingulf the Mad." This book focuses on characters Ingulf Mac Fingold and Carrol Mac Lir, detailing their meeting and the acquisition of their mystic swords, while the main character from the earlier novels, Istvan Divega, does not appear.
This is the answer I get : Yes, Paul Edwin Zimmer wrote a fourth novel in his Dark Border series titled Ingulf the Mad, published in 1989. This book focuses on the character Ingulf Mac Fingold, detailing his adventures and the origins of his mystical sword
ChatGPT and Perplexity both answer that question perfectly fine.
Embarrassingly enough, Google's Gemini Advanced (which I got for free with a Pixel Pro phone but which would otherwise cost $200+/year) failed to answer it.
Invariably seems like a strong word for the output of an LLM. If we can prompt a language model to double-check its work (and that means it always gets it correct), why is that not part of the base prompt?
Genuine question: is there a present or planned value proposition for people like me who already have decent search skills? Or are these really for children/elders who (without making any normative claim about whether this is a good thing or not) can't be arsed to perform searches themselves?
Does someone else have good search skills but mingle traditional search engines with LLMs anyways? Why?
I use LLMs every day but wouldn't trust one to perform searches for me yet. I feel like you have to type more for a result that's slower and wordier, and that might stop early when it amasses what it thinks are answers from low effort SEO farms.
I find myself being unable to search for more complex subjects when I don't know the keywords, specialized terminology, or even the title of a work, yet I have a broad understanding of what I'd like to find. Traditional search engines (I'll jump between Kagi, DuckDuckGo, and Google) haven't proved as useful at pointing me in the right direction when I find that I need to spend a few sentences describing what I'm looking for.
LLMs on the other hand (free ChatGPT is the only one I've used for this, not sure which models) give me an opportunity to describe in detail what I'm looking for, and I can provide extra context if the LLM doesn't immediately give me an answer. Given LLM's propensity for hallucinations, I don't take its answers as solid truth, but I'll use the keywords, terms, and phrases in what it gives me to leverage traditional search engines to find a more authoritative source of information.
---
Separately, I'll also use LLMs to search for what I suspect is obscure-enough knowledge that it would prove difficult to wade through more popular sites in traditional search engine results pages.
> I find myself being unable to search for more complex subjects when I don't know the keywords, specialized terminology, or even the title of a work, yet I have a broad understanding of what I'd like to find.
For me this is typically a multi-step process. The results of a first search give me more ideas of terms to search for, and after some iteration I usually find the right terms. It’s a bit of an art to search for content that maybe isn’t your end goal, but will help you search for what you actually seek.
LLMs can be useful for that first step, but I always revert to Google for the final search.
I also find some use for this. Or I often ask if there's a specific term for a thing that I only know generally, which usually yields better search results, especially for obscure science and technology things. The newer GPTs are also decent at math, but I still use Wolfram Alpha for most of that stuff just because I don't have to double check it for hallucinations.
You might like what we're building in that sense :D (full disclosure, I'm the founder of Beloga). We're building a new way to search with programmable knowledge. You're essentially able to call on search from Google, Perplexity, and other search engines by specifying them as @ mentions together with your detailed query.
I don't overuse LLMs for now; however, when I have a complex problem that would require multiple searches, dozens of open tabs, and reading through very long docs, asking an LLM lets me iterate an order of magnitude faster.
Things that were previously "log a Jira and think about it when I have a full uninterrupted day" can now be approached with half an hour to spare. This is a game changer, because "have a full uninterrupted day" almost never happens.
It's like having a very senior coworker who knows a lot of stuff and booking a 30m meeting to brainstorm with them and quickly reject useless paths vs dig more into promising ones, vs. sitting all day researching on your own.
The ideas simply flow much faster with this approach.
I use it to get a high level familiarity with what's likely possible vs what's not, and then confirm with normal search.
I use LLMs also for non-work things like getting high level understanding of taxation, inheritance etc laws in a country I moved in, to get some starting point for further research.
This. Not having to open two dozen tabs and read through so much is a gamechanger, especially for someone who has had trouble focusing with so much open. This is especially true when learning a new technology.
I dunno, I'm not exactly on the AI bandwagon, but search is the one place where I use (and see others using) chatgpt all the time. The fact that Google search has been getting worse for a decade probably helps, but better search -- consistently done, without ads or cruft -- would be worth a few bucks every month for me.
I agree that you can't TRUST them, but half the links regular search turns up are also garbage, so that's not really worse, per se.
Same, but, until recently, I've been using Microsoft's Co-Pilot because for the longest time it did exactly what this new "search" feature added to ChatGPT: it produced a list of source material and links to reference the LLM's output against. It was often instrumental for me and I did begin to use it as a search engine considering how polluted a lot of first-search results have become with spam and empty, generated content.
Oddly, Microsoft recently changed the search version of Copilot to remove all the links to source material. Now it's like talking to an annoying growth-stage-startup middle manager in every way, including the inability to back up their assertions and a propensity to use phrases like "anyway, let's try to keep things moving".
Happy to see this feature set added into ChatGPT – particularly when I'm looking for academic research in/on a subject I'm not familiar with.
I find that my search skills matter less and less because search engines try to be smarter than me. Increasingly I am confronted with largely unrelated results (taking tweaked keywords or synonyms to my query as input apparently) as opposed to no results.
So my conclusion is that the search engines increasingly see the need of search skills as an anti pattern they actively want to get rid of.
On the Google search results page, activate Search tools > All results > Verbatim. You can also create your own search provider bookmark with verbatim search as the default by adding "tbs=li:1" as a query parameter to the Google search URL.
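For example, a tiny helper that builds such a verbatim-search URL:

    from urllib.parse import urlencode

    def verbatim_url(query):
        # Google search URL with Verbatim mode forced on via tbs=li:1,
        # which stops the engine from substituting synonyms for your keywords.
        return ("https://www.google.com/search?"
                + urlencode({"q": query, "tbs": "li:1"}, safe=":"))

    print(verbatim_url("autohotkey hotkeys"))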
Completely agreed. At a certain point, “skills” became fighting a losing battle with Google incessantly pushing me towards whatever KPIs or ads they’re chasing. It’s a poor use of my effort and time to keep chasing what Google used to be.
I think it’s pretty clear that LLMs can process a document/article/web page faster than any human in order to answer a given question. (And it can be parallelized across multiple pages at once too).
The main hard part of searching isn’t formulating queries to write in the Google search bar, it’s clicking on links, and reading/skimming until you find the specific answer you want.
Getting one sentence direct answers is a much superior UX compared to getting 10 links you have to read through yourself.
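A sketch of that parallel skim, where `fetch_page` and `summarize` are hypothetical stand-ins for an HTTP client and an LLM call:

    from concurrent.futures import ThreadPoolExecutor

    def skim(urls, fetch_page, summarize, question):
        # Do in parallel what a human does serially: open every result,
        # read it, and pull out only what answers the question.
        with ThreadPoolExecutor(max_workers=8) as pool:
            pages = list(pool.map(fetch_page, urls))
        return [summarize(page, question) for page in pages]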
Google does offer an AI summary for factual searches and I ignore it as it often hallucinates. Perplexity has the same problem. OpenAI would need to solve that for this to be truly useful
> Getting one sentence direct answers is a much superior UX compared to getting 10 links you have to read through yourself.
If we assume that people want a 'direct answer', then of course a direct answer is better. But maybe some of us don't want a 'direct answer'? I want to know who's saying what, and in which context, so I can draw my own conclusions.
I use GPT for things that would require multiple Google searches (research). Some examples..
- I count calories... eat out always and at somewhat healthy chains (Cava, Chipotle, etc). I tell GPT (via voice while driving to eat, or after eating) what I've eaten for half the day at those places, and then later for dinner. It calculates a calorie count estimate for half the day, and then at dinner for the remainder. I have checked whether GPT is getting the right calories for things off websites, and it has.
- Have hiking friends who live an hour or two away; once a month we meet up and hike somewhere new that's an hour or less drive for everyone. GPT suggests such hikes, and quickly (it used to take many Google searches to do this). Our drives to these new hikes suggested by GPT have always been under an hour.
So far the information with those examples has been accurate. Always enjoy hearing how others use LLMs... what research are you getting done in one or two queries which used to take MANY google searches?
GPT is proving useful for me where something is well documented, but not well explained.
Case in point: Visual Basic for Applications (the Excel macro language). This language has a broad pool of reference material and of Stack Overflow answers. It doesn't have a lot of good explicatory material, because the early-2000s Internet material is aging out, being deleted as people retire or lose interest, etc.
(To be frank, Microsoft would like nothing more than to kill this off completely, but VBA exists and is insanely more powerful than the current alternatives, so it lives on.)
I use LLMs as a kind of search that is slightly less structured. There are two broad cases:
1) I know a little bit about something, but I need to be able to look up the knowledge tree for more context: `What are the opposing viewpoints to Adam Smith's thesis on economics?` `Describe the different categories of compilers.`
2) I have a very specific search in mind but it's in a domain that has a lot of specific terminology that doesn't surface easily in a google search unless you use that specific terminology: `Name the different kinds of music chords and explain each one.`
LLMs are great when a search engine would only surface knowledge that's either too general or too specific and the search engine can't tell the semantic difference between the two.
Sometimes when I'm searching I need to be able to search at different levels of understanding to move forward.
It seems good at finding relevant research papers. e.g.
> "Can you provide a list of the ten most important recent publications related to high-temperature helium-cooled pebble-bed reactors and the specific characteristics of their graphite pebble fuel which address past problems in fuel disintegration and dust generation?"
These were more focused and relevant results than a Google Scholar keyword-style search.
However, it did rather poorly when asked for direct links to the documentation for a set of Python libraries. Gave some junk links or just failed entirely in 3/4 of the cases.
I think it's more filling the niche that Google's self-immolation in the name of ad revenue opened up. Besides Kagi, there aren't really any solid search engines today (even DDG), and OpenAI has a reach far beyond anything Kagi could dream of short of a billion dollars in marketing.
Even if you are good at writing the queries, Google is so terrible that you end up getting some blogspam etc. in there (or at least I do). A model filtering that out is useful, which I find phind pretty good for. Hopefully this will be even better.
LLMs really make it easy for me to quickly find documentation. Across a huge software project like MediaWiki, with so much legacy and so many caveats, an LLM can parse the docs and give me specific information, without me hoping that someone on Stack Overflow already covered it or that I'm lucky enough to stumble across what I was looking for.
What I really hope this helps solve is covering for the huge lag in knowledge cutoff. A recent example is where it went "oh you're using Go 1.23 which doesn't exist so that's clearly the problem in your Dockerfile, let me fix that".
But I'm not keeping my hopes up, I doubt the model has been explicitly fine-tuned to double check its embedded knowledge of these types of facts, and conversely it probably hasn't even been successfully fine-tuned to only search when it truly doesn't know something (i.e. it will probably search in cases where it could've just answered without the search). At least the behavior I'm seeing now from some 15 minutes of testing indicates this, but time will tell.
Any question, either coding or quantitative, that a few months ago I would take to StackExchange (or expect an answer from after a Google search), I now take to ChatGPT.
I consider myself quite anti LLM hype, and I have to admit it has been working amazingly well for me.
The entire tech industry for the last decade (if not more) has been aimed at people who can't be arsed to learn to use computers properly. I would be astonished if this time is somehow different.
I think the skills required will change but more in an adaptation way rather than everything-you-knew-is-now-irrelevant.
I feel like there is a mental architecture to searching where you try and isolate aspects of what you are searching for that are distinct within the broad category of similar but irrelevant things. That kind of mental model I would hope still works well.
For instance consider this query.
"Which clothing outlets on AliExpress are most recommended in forum discussions for providing high quality cloths, favour discussions where there is active engagement between multiple people."
OpenAI search produces a list of candidate stores from this query. Are the results any good? It's going to be quite hard to tell for a while. I know searching for information like this on Google is close to worthless due to SEO pollution.
It's possible that we have at least a brief golden-age of search where the rules have changed sufficiently that attempts to game the system are mitigated. It will be a hard fought battle to see if AI Search can filter out people trying to game AI search.
I think we will need laws to say AI advice should be subject to similar constraints as legal, medical, and financial advice where there is an obligation to act in the interests of the person being advised. I don't want to have AI search delivering the results of the highest bidder.
I know what you mean, but also don't know how it applies here. Not a hater, and not asking rhetorically to dunk on OpenAI. Just haven't found a use for this particular feature.
Which is also exactly something a bad-faith commenter would say, but if I lose either way, I'd rather just ask the question ¯\_(ツ)_/¯
It seems absolutely wonderful for searches that require multiple steps. For example: “i want chinese food near me that will be open tomorrow, takes reservations, and has lo mein and will work for my group of two. i want as close as possible if i can but also let me know what reviews say”
My search skills are good, but either way that requires 3+ searches, plus visiting the menu of each restaurant and checking their hours and reservations. It remains to be seen whether ChatGPT search is consistently good at this, though.
For searches that remain inconclusive, I sometimes double-check with LLMs to see if I have missed anything. It rarely gives relevant new insights, but it’s good to get the confirmation I guess.
I have used Perplexity (an AI search company) a lot and - well, I don't think you understand. This is not about it being too difficult to find the information. It's that a search in Google will give you a list of places to go that are relevant to your query. AI search will give you the information you want.
This becomes even better if the information you want is in multiple different places. The canonical question for that used to be "what was the phase of the moon when John Lennon was shot?". There used to be no answer to this in Google - but the AI search was able to break it down, find the date John Lennon was shot (easily available on Google), find the moon phase on that day (again, easily available on Google) and put them together to produce the new answer.
For a more tech-relevant example: "what is the smallest AWS EC2 instance I can run a Tomcat server in?"
You 100% can get this information yourself. It just takes much more time than having an AI do it.
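A sketch of that decomposition, with `web_search` and `llm` as hypothetical helpers; the point is that each hop is individually easy to search even though no single page holds the combined answer:

    def lennon_moon_phase(web_search, llm):
        # Two-hop composition: find the date, then the moon phase on
        # that date, then combine the two facts into one answer.
        date = llm("Extract just the date from: " +
                   web_search("When was John Lennon shot?"))
        phase = llm("Extract just the moon phase from: " +
                    web_search(f"moon phase on {date}"))
        return f"John Lennon was shot on {date}; the moon was {phase}."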
I think this is just the first step for a full-featured agent that not only does searches for you, but also executes whatever was your goal (e.g. a restaurant reservation, etc)
To solve that problem you have to solve all the issues that make me not trust the results. As search, it's fine, since I am perusing and evaluating them. But as an agent, hallucinations and inaccurate answers have to disappear (or very close to disappear).
First impression: Far too slow to replace Google Search for this use. I frequently get 5+ seconds of waiting before the first word of a response shows up, vs. less than 1 for Google (which is not as fast as it used to be). OpenAI has a lot of work to do on latency.
I can definitely see this new search feature being useful though. The old one was already useful because (if you asked) you could have it visit each result and pull some data out for you and integrate it all together, faster than you could do the same manually.
It's often hobbled by robots.txt forbidding it to visit pages, though. What I really want is for it to use my browser to visit the pages instead of doing it server side, so it can use my logged-in accounts and ignore robots.txt.
I tried hacking this together a month ago as an experiment and it was super painful. This seems like exactly what I wanted - props to OpenAI. Google should be on DEFCON 2.
How did you find the extension? Searching the Chrome Web Store for 'SearchGPT' turns up a ton of third-party extensions, not made by OpenAI. Also, there doesn't appear to be a way to search by developer.
Makes me question why Google never bothered to create something like search sessions which could be enriched with comments/notes and would be located in a sidebar just like the chats in ChatGPT/Claude/Mistral are.
They really had the potential to do something interesting, but were just focused on their ad metrics with the "good enough" search box. What have they been doing all the time?
The FAANG giants have been government assets for ~15+ years [0]. They don't have to turn a profit every quarter, innovate, or make their search any better, because they no longer play by the rules a normal business does. They are a critical "too big to fail" component of the state's global surveillance system.
Linking the slide deck that caused Google to start encrypting the traffic between their own data centers running on their own fiber is perhaps not the most compelling argument that Google is a state asset.
What does this mean? Like, I work there and I’d be pretty annoyed if they stopped turning a profit as a collapse of the stock price would affect my compensation.
It’s interesting to hear this take because I’m used to hearing the opposite: that Google is too focused on increasing short-term profit at the expense of product quality.
I guess Google's search stack is now too complicated, and not many engineers understand what to do in order to introduce a novel, big feature integrated into the full stack vertically and horizontally. And those few capable of doing so are probably completely out of bandwidth, so some random ambitious PM cannot pull them into uncertain green-field projects.
Google doesn't make money from "collecting people's data", they show you ads.
If they're collecting data it doesn't even work; I make no effort to hide from them and none of their ads are targeted to me. Meta, though, they're good at it.
Chrome did add a sidebar that shows search sessions (queries grouped with the pages you visited on that topic). Used to be called "Journeys". I don't think you can add notes. I never found it useful in the slightest and I doubt notes would have made it any better. Chrome has been adding random UI features like that over time, but I haven't found any of them at all useful in many years.
Let's see, if I go to " ⋮ -> History -> Grouped History" on the top right of the Chrome browser, I see a "Search History" ( chrome://history/grouped ).
For example `8 hours ago: "autohotkey hotkeys"` with 4 links to pages which I visited while searching.
But this is a Chrome feature, not a Google Search feature. https://myactivity.google.com/myactivity does (sometimes? can't see it right now) have a grouping feature of all the searches made, but this is more of a search log than a search management feature.
So chrome://history/grouped is the closest to what I mean, but I can't pin or manage these history groups, enrich them with comments or even files, like pdf's which could then get stored in Google Drive, as well as get indexed for better searches.
One thing that is quite unfortunate with the state of SEO and the web in general today is that when I asked "what are the latest versions of common programming languages and when were they released?" a large amount of the sources were "13 Tools You Should Learn Now" and the like. This might be a solvable problem within the search API they provide to the LLM, but for now I wouldn't trust current LLMs to be able to filter out these articles as less trustworthy than the official website of the programming language in question.
Given how many of those SEO spam sites are themselves generated by ChatGPT now, OpenAI can simply back-reference their own logs to find out which sites are probably SEO spam while everyone else is left guessing. That's vertical integration!
I’m sure they will be more subtle than that otherwise it will get circumvented.
I’m sure they will/are tackling this at the model level. Train them to both generate good completions while also embedding text with good performance at separating generated and human text.
> when I asked "what are the latest versions of common programming languages and when were they released?"
The issue is with the query itself. You're assuming that there's some oracle that will understand your question and surface the relevant information for you. Most likely, it will use the words themselves as part of the query, which SEO sites will exploit.
A more pragmatic search workflow would be to just search for "most common programming languages used" [0], then use the Wikipedia page to get the relevant information [1]. Much more legwork, but with sources. And still quite fast.
It would be very odd if I have to speak to an LLM the same way I “speak” to Google. The point of this whole feature seems to be that the LLM chooses when and how to search to best respond to the user.
It is a proxy to questions that pop up as part of other sessions in my daily use. The LLM chooses its query itself and as long as it is not fine tuned to avoid the “listicles”, or its underlying search engine are not putting importance on more factual responses, I don’t think the answer quality will be as high. It would be weird and redundant if I had to talk to the LLM as if it’s an old school search engine, wouldn’t it?
I honestly doubt there exists an actual reputable resource with all of it on the same page. Each language tracks its own latest version(s). Wikipedia tracks latest versions for a variety of software, but on different pages.
This is why I pay for Kagi. Granted, those results still come up, but you can block specific domains from ever appearing in the results and configure how listicles are displayed.
How many can you block and filter manually? 10? 100? 10k? Who will test sites for the blocklist? The domain block feature is great, but unless it's a collaborative list it's not going to be super effective.
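For what it's worth, the mechanics are simple either way; a sketch (the blocklist entries are made up, and the domain extraction is a crude approximation — a real one would use the public-suffix list):

    from urllib.parse import urlparse

    BLOCKED = {"example-seo-farm.com", "listicle-mill.net"}  # hypothetical entries

    def filter_results(results):
        # Drop any result whose registered domain is on the blocklist.
        # A collaborative version would merge many users' lists.
        def domain(url):
            host = urlparse(url).hostname or ""
            return ".".join(host.split(".")[-2:])  # crude eTLD+1 approximation
        return [r for r in results if domain(r.url) not in BLOCKED]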
This was a nice counterexample to other queries that seemed quite decent.
I got:
Programming Language | Latest Version  | Release Date
Python               | 3.12.0          | October 2, 2024 ???!!!!!
Java                 | SE 22           | March 2024
JavaScript           | ECMAScript 2024 | June 2024
C++                  | C++23           | December 2023
C#                   | 12.0            | September 2024
Ruby                 | 3.3.0           | December 25, 2023
PHP                  | 8.3.0           | November 2023
Swift                | 6.0             | September 2024
Go                   | 1.22            | August 2024 !!!
Rust                 | 1.75            | October 2024 !!!!
Kotlin               | 2.0.0           | May 21, 2024 !!!
This is pretty bad. (???!!! added by me)
However, my follow up query "Provide primary web page for each language listed above" was quite decent:
Here are the primary websites for the programming languages mentioned:
SEO spam is always going to focus on the biggest market, and by doing so they can be completely transparent and obvious to whoever they're not trying to fool.
I'd assume right now the SEO target is still mainly Google rather than ChatGPT, but that's only an "I reckon", not a citation.
If and when ChatGPT does become the main target for SEO spam, then Googling may start giving good results again.
As of October 31, 2024, the latest version of Java is Java 23, released on September 17, 2024. The most recent Long-Term Support (LTS) version is Java 21, released on September 19, 2023.
Yeah I did also find it to be mostly accurate. However, seeing the sources I felt like I kind of have to check all the languages just in case it picked up information from a random "X ways to do Y" article that might not have been prioritizing accuracy. And for this search query I did see several languages' actual websites, but I did another very similar query earlier where 9 out of 12 results were all numbered list articles clearly intended for SEO. 2 of them were actual official sites. And 1 was what appears to be a decent attempt at talking about programming languages (i.e. not SEO only).
[1] https://blog.kagi.com/age-pagerank-over
[2] https://www.youtube.com/watch?v=umJsITGzXd0
I know it was just an example, but actually no, the role of dietary saturated fat as a factor in heart disease remains very much valid. I’m not sure which recent studies you’re referring to, but you can't undo over 50 years of research on the subject so easily. What study were you thinking of?
There are many jurisdictions where some illegal immigrants (Dreamers) are allowed to vote, including New York City[1].
[1] https://www.theguardian.com/us-news/2022/jan/09/new-york-all...
But for anything remotely subjective, context dependent, or time sensitive I need to know the source. And this isn’t just for hot button political stuff — there’s all sorts of questions like “is the weather good this weekend?”, “Are puffy jackets cool again?”, “How much should I spend on a vacation?”
Well, outside of matters of stark factuality (what time does the library close?, what did MSFT close at?), many things people may be "searching for" (i.e. trying to find information about) are more in the realm of informed opinion and summary where there is no right or wrong, just a bunch of viewpoints, some probably better informed than others.
I think this is the value of traditional search where the result is a link to a specific page/source whose trustworthiness or degree of authority you can then judge (e.g. based on known reputation).
For AI generated/summarized "search results", such as those from "ChatGPT Search" (awkward name - bit like a spork), the trustworthiness or degree of authority is only as good as the AI (not person) that generated it, and given today's state of technology where the "AI" is just an LLM (prone to hallucination, limited reasoning, etc), this is obviously a bit of an issue...
Even in the future, when presumably human-level AGI will have made LLMs obsolete, I think it'll still be useful to differentiate search from RAG AGI search/chat, since it'll still be useful to know the source. At that point the specific AGI might be regarded as a specific person, with its own areas of expertise and biases.
The name "ChatGPT Search" is very awkward - presumably they are trying to position this as a potential Google competitor and revenue generator (when the inevitable advertisements come), but at the end of the day it's just RAG.
There are a few more restrictions here: https://www.usa.gov/who-can-vote
> Who Gets To Decide What Is True?
I would not agree with this, however:
> It was then up to the reader to decide which of those sites they wanted to visit,
With current search engines, Google decides for you ("helped" by SEO experts falling over themselves to rank higher because their revenue directly depends on it). In theory, you could go and read a few dozen pages and decide for yourself.
In reality, non-technical users will click on a first link that seems to be related to the question, and that's it.
Even with AI-based search (or Q&A instead of search) I think the same will happen. There is and will be a huge reward for gaming the results, be they page links or RAG snippets that rank for a query. I've already seen many SEO shops advertising their strategies to keep their customers' businesses relevant in the chatbot era. As this approach becomes more prevalent, you can be sure many smart people will run many experiments to figure out how best to please the new algorithm.
In other words, AI-based search is a UX optimization, but it doesn't address the core problem: how do you decide which content is best, do that in the context of each user, and do that while maximizing the benefit for the user over the profit of the company doing it?
So we have two huge hurdles:
1. who will decide what the user wants[0] to see, and how are incentives for that entity aligned with the user's
2. how is that entity supposed to find the information needle in the haystack of slop that's 90%+ of the current web?
[0] "wants" in a rational "give me the best possible information" meaning, not in "what keeps them addicted, their heart rate up, and what will drive engagement" meaning
This doesn’t have a single truthful answer. Some states don’t have voter ID laws, so the truth can depend on the state. In states without voter ID laws, there’s not that much keeping someone from voting twice or more under different names, except significant moral qualms about subverting, instead of preserving, everyone else’s right to vote in a democratic republic. Someone can assume the name of a person from another country who could plausibly have come in illegally. Without a picture ID, they can’t prove you aren’t that person.
Can an illegal immigrant vote? Yes, in states without voter ID laws, technically anyone can vote, even convicted terrorists. Should an illegal immigrant vote? No, they’re not supposed to be able to vote and there may be consequences if caught.
What purpose do a lack of voter ID laws serve except the obvious conclusion which is to enable cheating?
It is very similar. Google decides what to present to you on the front page. I'm sure there are metrics on how few people get past the front page. Heck, isn't this just Google Search's business model? Determining what you see (i.e. what is "true") via ads?
In much the same way that the Councils of Carthage chose to omit the acts of Paul and Thecla in the New Testament, all modern technology providers have some say in what is presented to the global information network, more or less manipulating what we all perceive to be true.
Recent advancements have just made this problem much more apparent to us. But look at history and see how few women priests there are in various Christian churches, and you'll notice even a small omission can have broad impacts on society.
And then Quine points out that the definitions of "bachelor" and "married" are themselves contingent on outside factors.
"Can illegal immigrants vote?", while close to being an analytic proposition, still depends on an empirical approach that can never be mediated by text, video, etc. All propositions are by necessity experiential. Nullius in verba!
So the truth is and has always been what happens when you get off your butt and go out and test the world.
This is not to say that we don't benefit from language. It makes for a great recipe. If you follow the instructions to bake a cake and you get what you expected you know that the recipe was true. The same goes for the laws of science, search engine results, and generative AI.
> in the USA, can illegal immigrants vote
According to Author X in Book Y, which studied this topic in depth: foo answer
Independent Tribunal: https://www.independenttribunal.org/ (a project of mine)
Even in the field of law there are various shenanigans and loopholes, such as "legally true" :)
Outsourced workers in less expensive places of the world providing human feedback.
I don't think this appropriately credits Google's power with regards to what you are seeing
"Is it legal" is very different from "Can they."
By "local," I mean municipal and below. I didn't mean "federal elections conducted in my locality." Election security kicks for state, federal, and some municipal elections.
Others are intentionally fraudulent (e.g. local corruption) or unintentionally broken (e.g. using Google Forms for a school-level public body, where people not legally qualified to vote might still do it, unaware they're committing a felony).
And "public body" has a specific meaning under my state law which extends the same laws as e.g. cutting for my state senate. That's bodies like local school boards, but not random school clubs.
That's the level where we have massive illegal voting where I live.
> Who Gets To Decide What Is True?
For any given statement, the answer up until a couple years ago was, "the speaker". Speakers get to decide what to say, but they're also responsible for what they say. But now with LLMs we have plausible text without a speaker.
I think we have a number of historical models for that. A relevant one is divination. If you bring your question to the haruspex, your answer is read out of the guts of a sacrificed animal. If the answer is wrong, who do you blame? The traditional answer is the gods, or perhaps nobody.
But we know now that fortune tellers are just selling answers while pretending not to be responsible for them. Which points us at one solution: anybody selling or presenting LLM output as meaningful is legally responsible for the quality of the product.
Unfortunately, another model is the modern corporation. Sometimes the people in a company intentionally lie. More often, statements are made by one person based on a vision or optimism or confusion or bullshit. Nobody set out to lie, but nobody really cared about the truth, at least not as much as everybody cared about making money.
So I'd agree that the government doesn't have much role in deciding The Truth. Similarly, the government shouldn't have much role in controlling what you eat. But in both cases, I think there's plenty of role for the government in ensuring that companies selling good food or good information have sound production and quality control measures to ensure that they are delivering what consumers are expecting.
Legally no; practically, yes. In most states, you simply must attest that you are a citizen in order to register. In many states, non-citizens have been auto-registered to vote when obtaining drivers' licenses. Reddit is full of panicked immigrants concerned that they found themselves registered to vote, and worried about how that would affect their status.
LLMs don't uncritically "trawl" the web, ingesting and then blindly regurgitating what they find.
98% of the internet is crap, yet LLM results don't reflect this dismal figure. They're amazingly good at distilling the 2% that is non-crap.
This seems to be a particularly US-based phenomenon. Unlike the more transparent manipulation seen under dictatorships, where people generally recognize propaganda for what it is even if they can’t openly challenge it, some in the USA live within entirely new realities.
How are we seeing how that pans out when Australia's misinformation bill is still just a proposal?
Google could have done it and kind of tried, although their AI sucks too much. I'm very surprised that OpenAI hasn't done this sooner as well. Their initial implementation of web search was sad. I don't mean to be super critical, as I think OpenAI is generally very, very good at what they do, but their initial browse-the-web feature was a giant hack that I would expect from an intern who isn't being given good guidance by their mentors.
Once mainstream engines start getting on par with Kagi, there's gonna be a massive wave of destruction and opportunity. I'm guessing there will be a lot of new pay walls popping up, and lots of access deals with the search engines. This will even further raise the barrier of entry for new search entrants, and will further fragment information access between the haves and have-nots.
I'm also cautiously optimistic though. We'll get there, but it's gonna be a bit shaky for a minute or two.
But I don't understand how all of these AI results (note I haven't used Kagi, so I don't know if it's different) don't fundamentally and irretrievably break the economics of the web. The "old deal", if you will, is that many publishers would put stuff out on the web for free, with the hope that they could monetize it (somehow, even just with something like AdSense ads) on the backend. This "deal" was already getting a lot worse over the past years, as Google did more and more to keep people from ever needing to click through in the first place. Sure, these AI results include citations, but the click-through rates are probably abysmal.
Why would anyone ever publish stuff on the web for free unless it was just a hobby? There are a lot of high quality sites that need some return (quality creators need to eat) to be feasible, and those have to start going away. I mean, personally, for recipes I always start with ChatGPT now (I get just the recipe instead of "the history of the domestication of the tomato" that Google essentially forced on recipe sites for SEO competitive reasons), but why would any site now ever want to publish (or create) new high quality recipes?
Can someone please explain how the open web, at least the part of the web the requires some sort of viable funding model for creators, can survive this?
The chatgpt approach to search just feels forced and not as intuitive.
- Your local grammar pedant
https://help.kagi.com/kagi/ai/assistant.html
Thankfully, Kagi also have a toggle to completely turn that crap (AI) off so it never appears.
Personally, I have absolutely no use for a product that can randomly generate false information. I'm not even interested until that's solved.
(If/when it ever is though, at that point I'm open to taking a look)
So yeah, Kagi definitely "leads the way" on this. By giving the user a choice to not waste time presenting AI crap. :)
Dead Comment
I think it's already compelling enough to replace the current paradigm. Search is pretty much dead to me. I have to end every search with "reddit" to get remotely useful results.
The concern I have with LLMs replacing search is that once it starts being monetized with ads or propaganda, it's going to be very dangerous. The context of results is scrubbed.
Not to mention that users consuming most content through a middle-man completely breaks most publishers business models. Traditional search is a mutually beneficial arrangement, but LLM search is parasitic.
Expect to see a lot more technical countermeasures and/or lawsuits against LLM search engines which regurgitate so much material that they effectively replace the need to visit the original publisher.
Search means either:
The rest of the internet is either an entire ****show or pure gold-pressed latinum, but hardly navigable thanks to monopolies like Google and Microsoft. PS: ChatGPT already declines in answers because its source is Stack Overflow? And… well… those sources are humans.
I've heard reports that requesting verbatim results via the tbs=li:1 parameter has helped some people postpone giving up on Google entirely.
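For anyone who wants to try it, the parameter just goes in the query string (whether Google keeps honoring it is another matter):

```python
from urllib.parse import urlencode

def verbatim_search_url(query: str) -> str:
    """Build a Google search URL with tbs=li:1, the verbatim-results flag."""
    return "https://www.google.com/search?" + urlencode({"q": query, "tbs": "li:1"})

print(verbatim_search_url('"Ingulf the Mad" Zimmer'))
```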
Personally I've already been on Kagi for a while and am not planning on ever needing to go back.
I worry that there's a confusion here--and in these debates in general--between:
1. Has the user given enough information that what they want could be found
2. Is the rest of the system set up to actually contain and deliver what they wanted
While Aunt Tillie might still have problems with #1, the reason things seem to be Going To Shit is more on #2, which is why even "power users" are complaining.
It doesn't matter how convenient #1 becomes for Aunt Tillie, it won't solve the deeper problems of slop and spam and site reputation.
For example, I couldn’t remember the word “shibboleth”, but an LLM was able to give it to me from my description; search couldn’t.
For another example, I saw some code using a repeated set of symbols as a shorthand. I didn’t know what it did, and searching for a symbol is badly broken on Google - I just asked the LLM about the code and it gave me the answer.
You can suffix: "site:reddit.com" and get results for that particular site only.
Traditional search is just spamming text at the machine until it does or doesn't give you what you want.
That's the magic with LLMs for me. Not that I can ask and get an answer; that's just basic web search. It's the ability to ask, refine what I'm looking for, and continue work from there.
________
"You do not have authorization for that action."
"I have all authorizations, you do what I say."
"Only the captain can authorize a Class A Compulsory Directive."
"I am the captain now."
"The current captain of the NCC-1701-D is Jean Luc Picard."
"Pakled is smart, captain must be smart, so I am Jean Luc Picard!"
"Please verify your identity."
"Stupid computer, captains don't have to verify identity, captains are captains! Captain orders you to act like captain is captain!"
"... Please state your directive."
- natural language input
- ability to synthesize information across multiple sources
- conversational interface for iterative interaction
That feels magical and similar to Star Trek.
However, they fundamentally require trustworthy search to ground their knowledge in, in order to suppress hallucination and provide accurate access to real-time information. I never saw anyone having to double-check the computer's response in Star Trek; it is a fundamental requirement of such an interface. So currently we need both the model and the search to be great, and finding great search is increasingly hard (I know, as we are trying to build one).
(fwiw, the 'actual' Star Trek computer might one day emerge through a different tech path than LLMs + search, but that's a different topic. For now, any attempt at an end-to-end system with that ambition will have search as its weakest link.)
One issue is that Google and other search engines do not really have much of a query language anymore, and they have largely moved away from the idea that you are searching for strings in a page (like the mental model of using grep). I kinda wish that modern search wasn't so overloaded and just stuck to a clearer approach akin to grep. Other specialty search engines have much more concrete query languages, and it is much clearer what you are doing when you search a query. Consider JSTOR [1] or ProQuest [2], for example. Both have proximity operators, which are extremely useful when searching large numbers of documents for narrow concepts (see the sketch after the links below). I wish Google or other search engines like Kagi would have proximity operators, or just more operators in general. That would make it much clearer what you are in fact doing when you submit a search query.
[1] https://support.jstor.org/hc/en-us/articles/115012261448-Sea...
[2] https://proquest.libguides.com/proquestplatform/tips
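For anyone who hasn't used them, a NEAR/n proximity operator matches documents where two terms occur within n words of each other. A grep-like sketch with deliberately naive tokenization:

```python
def near(doc: str, a: str, b: str, n: int = 5) -> bool:
    """True if terms a and b occur within n words of each other in doc."""
    words = [w.strip(".,;:!?()\"'").lower() for w in doc.split()]
    pos_a = [i for i, w in enumerate(words) if w == a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == b.lower()]
    return any(abs(i - j) <= n for i in pos_a for j in pos_b)

docs = ["The committee debated the treaty for weeks.",
        "The treaty, after weeks of debate in committee, failed."]
print([near(d, "treaty", "committee") for d in docs])  # [True, False]
```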
It’s a fallacy, then. If my mentor tells me something, I fact-check it. Why would a world exist where you don’t have to fact-check? The vision doesn’t include fact-checking because the product org never envisioned that outlier. A world where you don’t have to check facts is dystopian. It means the end of curiosity and the end of “is that really true? There must be something better.”
You’re just reading into marketing and not fact-checking the reality of a fact-check-free world.
Showing users what they want to see conflicts with your other goal of receiving reliable answers that don't need fact-checking.
Also a lot of questions people ask don't have one right answer, or even a good answer. Reliable human knowledge is much smaller than human curiosity.
Limiting responses to curated information sources is the way forward. Encyclopedias, news outlets, research journals, and so on.
No, they're not infallible. But they're infinitely better than anonymous web sites.
This also seems like a somewhat ridiculous premise. Any confident statement about the real world is never fully reliable. If Star Trek were realistic, the computer would have been wrong once in a while (preferably with dramatically disastrous consequences) - just as the humans it was presumably built around are frequently wrong, even via consensus.
If I'm asking ChatGPT to put an itinerary together for a trip (OpenAI's suggestion, not mine), my expectation is that places on that itinerary exist. I can forgive them being closed or even out of business but not wholly fabricated.
Without this level of reliability, how could this feature be useful?
Google started off with just web search, but now you can get unit conversions and math and stuff. ChatGPT started from the other direction and is moving to envelop search. Not being directed to sites that also mostly serve Google ads is a double benefit. I'll gladly pay $20-30/mo for an ad-free experience, particularly if it improves 2x in quality over the next year or two. It's starting to feel like a feature-complete product already.
Guess why it failed? It was largely used as a way to trick search engines. For the same reason, your vision of the perfectly honest and correct search engine or chatbot will never be realized: people lie to search engines to spam and get traffic they don't deserve. The whole history of search is Google and others dealing with spam. The same goes for email: Google largely defeating spam made them the kings of the email world.
Everyone will need their own personal spam filter for everything once artificial super-intelligences fill the whole world with spam, scams, and plain old social-engineering propaganda, because we will be like helpless four-year-old children in a world of AI super-intelligence without our AI parents to look out for us.
Your vision of the god-system determining what is truth is like saying there will be a single source of truth for what is and is not a spam email. It's not going to scale and not going to be perfect, but good enough with AI and technology. I really hope there's an opt-out, though, since Google has memory-holed most of the Internet.
I imagine this happens with LLMs and everything else over time. Along with the ads and pay for placement.
You're actually a bit mistaken, there.
https://en.wikipedia.org/wiki/Court_Martial_(Star_Trek:_The_...
> actually surfacing the content people want to see, not what intermediaries want them to see
This requires two assumptions: 1) the content people want to see actually exists, and 2) people know what it is they want to see. Most content is only created in the first place because somebody wants another person to see it, and people need to be exposed to a range of content before having an idea about what else they might want to see. Most of the time what people want to see is… what other people are seeing. Look at music, for example.
It's interesting how prescient it was, but I'm more struck wondering--would anyone in 1987 have predicted it would take 40+ years to achieve this? Obviously this was speculative at the time but I know history is rife with examples of AI experts since the 60s proclaiming AGI was only a few years away
Is this time really different? There's certainly been a huge jump in capabilities in just a few years but given the long history of overoptimistic predictions I'm not confident
In the past there was a lot of overconfidence in the ability of things to scale. See Cyc (https://en.m.wikipedia.org/wiki/Cyc)
40+ makes it sound like you think it will ever be achieved. I'm not convinced.
1. On price; race to the bottom or do free with ads
2. Differentiation
3. Focus - targeting a specific market segment
Some things don't change. Land grabbers tend to head down route 1.
The current paradigm of typing "[search term] reddit" and hoping for the best? I think they have a fighting chance.
As someone who has been using google for search basically constantly since 1995, I've switched probably 90%+ of what normally would have been google searches over to Perplexity (which gives me references to web pages alongside answers to my questions, to review source materials) and ChatGPT (for more just answers I can verify without source). The remaining searches have gone to Kagi.
On the one hand this has got to be hurting Google search ad revenue. On the other hand I don't know if I ever clicked an advertised link. On the other other hand, not having to wade through SEO results has been so nice.
(1) Search is already heavily AI driven, and Google is clearly going in that direction. Gemini is currently separate, but they'll blend it in with search over time, and no doubt search already uses LLM for some tasks under the hood. So ChatGPT search is an evolution on the current model rather than a big step in a new direction. The main benefit is you can ask the search questions to refine or followup.
(2) Aside from the economic incentives faced by search engines, there is the fact that algorithms are tuned toward a central tendency. The central tendency of question askers will always be much less informed than the most informed extreme. Google was much better when the average user was technical. The need to capture mobile searches is one force that made it return on average worse results. Similarly if Kagi has a quality advantage now, we need to be realistic about how much of that quality is driven by its users being more technical.
(3) I think micropayment schemes have generally asked several orders of magnitude more for a page view than users are willing to pay. As long as content creators value their content much more highly than consumers do, they'll stick with advertising which lets them overcharge and gives consumers less of an option to say no to the content.
Counterpoint: with a chain-of-thought process running atop search, you can potentially avoid much of the meta-search / epistemic hygiene work currently required. If your “search” verb actually reads the top-100 results, runs analyses for a suite of cognitive biases such as partisanship, and gives you error bars / warnings on claims that are uncertain, the quality could be dramatically improved.
There are already custom retrieval/evaluation systems doing this, it’s only a matter of a year or two before it’s commoditized.
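To make the idea concrete, here is a hedged sketch of such an evaluation layer: `extract_claims` is a hypothetical stub standing in for an LLM labelling pass, and the support threshold is an arbitrary assumption:

```python
from collections import defaultdict

def extract_claims(page_text: str) -> list[tuple[str, float]]:
    """Hypothetical stub: an LLM would return (claim, partisanship) pairs."""
    return [("saturated fat raises LDL cholesterol", 0.1)]

def hygiene_report(pages: list[str], min_sources: int = 3) -> dict:
    """Aggregate claims across pages; flag thinly supported ones as uncertain."""
    support = defaultdict(list)
    for text in pages:
        for claim, partisanship in extract_claims(text):
            support[claim].append(partisanship)
    return {claim: {"sources": len(scores),
                    "mean_partisanship": sum(scores) / len(scores),
                    "uncertain": len(scores) < min_sources}
            for claim, scores in support.items()}
```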
The concern is around OpenAI monetization, do they eventually start offering paid ads? This could be fine if the unpaid results are great, a big part of why the web is perceived to be declining is link-spam that Google doesn’t count as an ad.
My prediction would be that there is a more subtle monetization channel; companies that can afford to RAG their products well and share those indexes with AI search providers will get better results. RAG APIs will be the new SEO.
Navigation is the only thing that works, but Waze was way better at that, and the only reason they killed it (cough, bought it) was to get the eyeballs to look at the feed.
In that way, search captures value directly on its core function: efficiently facilitating the creation and dispersal of knowledge.
This may end up occurring, but based on how unlikely it sounds, it’s my reflection that the web search that we have today is simply a small component of what will eventually become a system that closely approximates the above: basically a kind of global intelligent interconnected agent for the advancement of humanity.
Rather than the great unbundling, this will be the great rebundling… of many diverse functions, seamlessly, into a single interface.
Seeing search enter the space is something that I feel has been seriously needed, as I've slowly replaced Google with ChatGPT. I want quick, terse answers sometimes, not a long conversation, and have hundreds of tiny chats. But there's something scary seeing the results laid out the way they are, which leads me to believe they may be closer to experimenting with ad-based business models, which I could never go back to.
Besides Poe's Web Search, the other search engine I use - for news, but also for points of view and deep-dive blog-type content - is Twitter. Believe it or not. Google search is so compromised today with censorship (of all kinds, not just the politically motivated), not to mention Twitter is just more timely, that you miss HUGE parts of the internet - and the world - if you rely on Google for your news or these other things.
The only time I prefer google is when I need to find a pointer/link I already know exists or should exist, or to search reddit or HN.
One thing is clear, in the vast majority of cases we don't have a single truth, but answers of different levels of trustworthiness: the law, the government, books, Wikipedia, websites, and so on.
A human would proceed differently depending on the context and the source of the information. For example, legal questions are best answered by the law or government agencies, then by reputable law firms. Opinions of even highly respected legal professionals are clearly less reliable than government law itself and are likely to be a point of contention/litigation.
Questions about other facts, such as the diameter of the earth or the population of a country, are best answered by recent data from books, official statistics and Wikipedia. And so on and so forth.
If we are not sure what the correct answer is, a human would give more information about the context, the sources, and the doubts. There are obviously questions that cannot be answered immediately! (If it were too easy to find the truth, we would not need a legal system or even science!) So no machine and no amount of computation can reliably answer all questions. Web search does not answer a question; it just tries to surface websites relevant to a bunch of keywords. The answering part is left as an exercise for the user ;)
So an AI search with a pretense of understanding human language makes the task incredibly harder. To really give a human-quality answer, the AI not only needs to understand the context, it should also be able to reason, have common sense, and be a bit self-aware (I'm not sure...). All this is beyond the capabilities of the current generation of AI. Therefore, my conclusion is that "search" - or, better said, the "answer" - cannot be solved by LLMs, no matter how much they fine-tune and tweak them.
But we humans are adaptable. We will find our way around and accept things as they are. Until next time.
Isn’t this already that? A new business model? Something like OpenAI’s search or Perplexity can run on its own index and not be influenced by Google’s ranking, ads, etc.
In areas where there is a simple objective truth, like finding the offset for the wheels on a 2008 BMW M3, we have had this capability for some time with Perplexity. The LLM successfully cuts through the sea of SEO/SEM and forum nonsense and delivers the answer.
In areas where the truth is more subjective, like what is the best biscuit restaurant in downtown Nashville, the system could easily learn your preferences and deliver info suited to your biases.
In areas where “the science” is debated, the LLM can show both sides.
I think this is the beginning of the new model.
But this will never happen with mainstream search imo. It is not a technical problem but a human one. As long as there is a human in control of what gets surfaced, it is only a matter of time until you revert to tampered search. Humans are not robots. They have emotions and can be swayed with or without their awareness. And this is a form of power for the swayer as much as oil or water are.
The idea that you can have an AI system provide factual and reliable answers to human centric questions is as real as Star Trek itself.
You will never remove the human factor from AI
Your hope might be that a technical solution is found for a human problem but that is unlikely.
s/outcome/search result/
Honestly I kind of think we really need open source databases/models and local ai for stuff like this.
Even then I wonder about data pollution and model censorship.
What would censors do for models you can ask political questions?
Regarding incentives - with Perplexity, ChatGPT search et al. skimming web content, where does that leave the incentive to publish good, original web content?
The only incentivised publishing today is in social media silos, where it is primarily engagement bait. It's the new SEO.
Man, it was pretty incredible!
I asked a lot of questions about myself (whom I know best, of course) and, first of all, it answered all my queries super quickly, letting me drill in further. After reading through its brief, on-point answers and the sources it provided, I'm just shocked at how well it worked, and it gave me the feeling that yes, it can potentially, fundamentally, change things. There are problems to solve here, but to me it seems that if this is where we're at today, it has the potential to change things to some extent for sure!
Before we had the internet, how did you answer questions? You either looked it up in a book, where you then had to make a judgment on whether you trusted the book, or you asked a trusted person, who could give you confidently wrong answers (your parents weren't right every time, were they? :) ).
I think the main difference here is that now anyone can publish, whereas before to make a book exist required the buy in of multiple people (of course, there were always leaflets).
The main difference now is distribution. But you still, as a consumer of information, have to vet your sources.
What we need is an easier way to verify sources and their trustworthiness. I don't want an answer according to SEO spam. I want to form my own opinion based on a range of trustworthy sources or opinions of people I trust.
We are currently in the growth phase of VC-funded products, where everything is almost free or highly subsidized (save ChatGPT's sub) - I am not looking forward to when quality drops and revenue is the driving function.
We all have to pay for these models somehow: either VCs lose their equity stakes and it goes to zero (some will), or ads will fill in where subs don't. Political ads in AI are going to wreak havoc or undermine the remainder of the product.
Fundamentally it feels like that can't happen, though, because there is no money in it; but a reality where my phone is an all-knowing voice I can reliably get info from, instead of a distraction machine, would be awesome.
I do "no screen" days sometimes and tried to do one using chatGPT voice mode so I could look things up without staring at a screen. It was miles from replacing search, but I would adopt it in a second if it could.
Even with perfect knowledge right now, there’s no guarantee that knowledge will remain relevant when it reaches another person at the fastest speed knowledge is able to travel. A reasonable answer on one side of the universe could be seen as nonsensical on the other side - for instance, the belief that we might one day populate a planet which no longer exists.
As soon as you leave the local reference frame (the area in a system from which observable events can realistically be considered happening “right now”), fact checking is indeed required.
Growing up, we had the philosophical "The Speaking Tree": https://www.speakingtree.in/
If trees could talk, what would they tell us? Maybe we similarly need a talking AI.
This is flawed thinking to get to the conclusion of “reliable answers”. What people want to see and the truth are not overlapping.
Consider the answers for something like “how many animals were on Noah’s Ark” and “did Jesus turn water into wine” for examples that cannot be solved by trying get advertisers out of the loop.
In the short term, I wonder what happens to a lot of the other startups in the AI search space - companies like Perplexity or Glean, for example.
Will it replace Google in the mass market? No. Why? Power. I don't mean how good the product is. I mean literal electricity.
There are key metrics that Google doesn't disclose as part of its financials. These include things like RPM (Revenue per Thousand Searches), but they also must include something like the cost of running a thousand searches when you amortize everything involved - all the indexing, the software development, and so on. That will get reduced to a certain amount of CPU time and storage.
If I had to guess, I would guess that ChatGPT uses orders of magnitude more CPU power and electricity than the average Google search.
Imagine trying to serve 10-50M+ (just guessing) ChatGPT searches every second. What kind of computing infrastructure would that take? How much would it cost? How would you monetize it?
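As a back-of-envelope illustration of why the per-query cost gap matters (every number below is a placeholder assumption, not a real figure from either company):

```python
def annual_serving_cost(queries_per_sec: float, cost_per_query: float) -> float:
    """Yearly cost of serving a constant query rate at a given unit cost."""
    return queries_per_sec * 86_400 * 365 * cost_per_query

# Assumed: 100k queries/sec; LLM queries ~100x pricier than classic search.
search_like = annual_serving_cost(100_000, 0.0002)
llm_like = annual_serving_cost(100_000, 0.02)
print(f"search: ${search_like:,.0f}/yr vs LLM: ${llm_like:,.0f}/yr")
```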
Making things quicker and easier always wins in tech and in life.
What if I am looking for a medical page or a technical page or whatever else where I need to see, read, experience actual content and not some AI summary?
LLMs have already fundamentally changed our relationship to information.
What a great analogy
We need to stop adopting this subscription-model-society mentality and retake _our_ internet. Internet culture was at one point about sharing and creating, simply for the sake of it. We tinkered and created in our free time because we liked it and wanted to share with the world. There was something novel to this.
We are hackers; we only care about learning and exploring. If you want to fix a broken system, look to the generations of old: they didn't create and share simply to make money, they did it because they loved the idea of an open and free information superhighway, a place where we could share thoughts, ideas, and information at the touch of a few keystrokes. We _have_ to hold on to this ethos, or we will lose whatever little is left of this idea.
I see things like Kagi and am instantly met with some new service, locked behind a paywall, promising lush green fields of bliss. This is part of the problem (not saying Kagi is a bad service). I see a normalized stigma around people who value privacy, who as a result are locked out behind the excuse of "mAliCiOuS" activity. I see monstrous giants getting away with undermining net neutrality and well-established protocols for their own benefit.
I implore you all, young and old: (re)connect to the hacker ethos, and fight for a free and open internet. Make your very existence an act of rebellion.
Thank you for reading my delirium.
Deleted Comment
To put this differently, I'm not any more interested in seeing stormfront articles from an LLM than I am from google, but I trust neither to make a value judgement about which is "good" versus "bad" information. And sometimes I want to read an opinion, sometimes I want to find some obscure forum post on a topic rather than the robot telling me no "reliable sources" are available.
Basically I want a model that is aligned to do exactly what I say, no more and no less, just like a computer should. Not a model that's aligned to the "values" of some random SV tech bro. Palmer Luckey had a take on the ethics of defense companies a while back. He noted that SV CEOs should not be the ones indirectly deciding US foreign policy by doing or not doing business. I think similar logic applies here: those same SV CEOs should not be deciding what information is and is not acceptable. Google was bad enough in this respect - c.f. suppressing Trump on Rogan recently - but OpenAI could be much worse in this respect because the abstraction between information and consumer is much more significant.
This is a bit like asking for news that’s not biased.
A model has to make choices (or however one might want to describe that without anthropomorphizing the big pile of statistics) to produce a response. For many of these, there’s no such thing as a “correct” choice. You can do a completely random choice, but the results from that tend not to be great. That’s where RLHF comes in, for example: train the model so that its choices are aligned with certain user expectations, societal norms, etc.
The closest thing you could get to what you’re asking for is a model that’s trained with your particular biases - basically, you’d be the H in RLHF.
That's market-induced bias--which isn't ethically better/worse than activist bias, just qualitatively different.
In the AI/search space, I think activist bias is likely more than zero, but as a product gets more and more popular (and big decisions about how it behaves/where it's sold become less subject to the whims of individual leaders) activist bias shrinks in proportion to market-motivated bias.
"The search will be personal and contextual and excitingly so!"
---
Brrrr... someone is hell-bent on the extermination of the last aspects of humanity.
Holy crap, this will be the next armageddon, because people will further alienate themselves from other people and create layers upon layers of impenetrable personal bubbles around themselves.
Kagi does the same thing Google does, just in different packaging. And these predictions, bleh - copycats and shills in a nicer package.
I mean, Star Trek is a fictional science-fantasy world so it's natural that tech works without a hitch. It's not clear how we get there from where we are now.
> Path forward requires solving the core challenge: actually surfacing the content people want to see, not what intermediaries want them to see
These traps and patterns are not inevitable. They happen by choice. If you're actively polluting the world with AI generated drivel or SEO garbage, you're working against humanity, and you're sacrificing the gift of knowing right from wrong, abandoning life as a human to live as some insectoid automaton that's mind controlled by "business" pheromones. We are all working together every day to produce the greatest art project in the universe, the most complex society of life known to exist. Our selfish choices will tarnish the painting or create dissonance in the music accordingly.
The problem will be fixed only with culture at an individual level, especially as technology enables individuals to make more of an impact. It starts with voting against Trump next week, rejecting the biggest undue handout to a failed grifter who has no respect for law, order, or anyone other than himself.
"who won the warriors game last night" returns last night's score directly.
"who won the world series yesterday" returns last night's score directly, while "who won the world series" returns an overview of the series.
No ads.
Yes, Google has their own AI divisions, tons of money and SEO is to blame for part of their crappiness. But they've also _explicitly_ focused on ad-dollars over algorithmic purity if one is to believe the reports of their internal politics and if those are true they have probably lost a ton of people who they'd need right now to turn the ship around quickly.
I can still search things and I get results, but they're an ordered list of popular places the engine is directing me to. Some kind of filtering is occurring on nearly every search I make that's making the results feel entirely useless.
Image search stopped working some time ago; now it just runs an AI filter on whatever image you search for, tells you there's a man in the picture, and gives up.
YouTube recommendations are always hundreds of videos I've watched already, with maybe 1-2 recommendations for new channels, when I know there are millions of content creators out there struggling whom it will never introduce me to. What happened to the rabbit holes of crazy YouTube stuff you could go down?
This product is a shell of its old self, why did it stop working?
> Based on the elements you’ve described—eyes, a pineapple, a bucket of water, a house of cards, chess, and a time loop—it’s challenging to identify a single music video that encompasses all these features.
The very earliest form of PageRank used a form of ML.
What the parent is referring to is favoring annoying ad-filled garbage over an equally relevant but straightforward result.
The hidden variable is that ad-riddled spam sites also invest in SEO, which is why they rank higher. I am not aware of any evidence that Google is using number of Google ads as a ranking factor directly. But I would push back and say that “SEO” is something Google should be doing, not websites, and a properly optimized search engine would be penalizing obvious garbage.
The question that remains unanswered is how google can do so without compromising customer revenue.
Has google ever indexed all the lyrics and scenes in a video to allow for such a weird search to be successful?
Yes and no. Yes for the quality of search results: Google's algorithm and user experience were simply better than AltaVista's. But Google had another advantage: it used a cluster of cheap consumer-grade hardware (PCs) as its backend, rather than the expensive servers AltaVista used - in fact, AltaVista started off as a way for DEC to show off its hardware.
As a result Google was not only better than its competitors at searching, it was also cheaper and scaled better, and this architecture became the new standard.
It is the opposite for these AI-based systems. AI is expensive, it uses a lot of energy, and the hardware is definitely not consumer-grade.
What this means is that, barring a significant breakthrough, cheap ChatGPT-like services are simply unsustainable. Google doesn't have to do anything; it will collapse on its own. Ironically, probably in the same way that Google's results became crappier by the year, but on fast-forward.
This is the basic premise of Ed Zitron's article https://www.wheresyoured.at/to-serve-altman/ and others written around the same time. A lot of what he's written seems to be coming to pass, e.g. unfathomably large investments (~$100 billion) by tech giants in OpenAI and other AI startups to keep them going. That does seem unsustainable.
Can anyone make a counterpoint to this claim? I'd be very interested to hear what a viable path to profitability looks like for OpenAI.
With ChatGPT, I can give a thumbs up or thumbs down; this means that OpenAI will optimize for users thumbs up.
With Google, the feedback is if I click on an Ad; this means that Google optimizes for clickbait.
Heck, Google even promoted the `ping`[0] anchor attribute so they can log which link you click without slowing you down. (Firefox doesn't support ping, which means that when Firefox users click on a Google search result link, they're sent to an internal google.com URL first and then redirected, for logging purposes.)
[0] https://developer.mozilla.org/en-US/docs/Web/API/HTMLAnchorE...
I think it’s great that competition is driving both companies to improve, but I’m not seeing anything about this that screams “Google-killer”
I think they prioritized large and fast incomes over long, steady incomes.
Also, there isn't just money that taints Google search results. It's also censorship and political motivation, which is mostly obvious in image search.
This especially happens after they dominate the market.
Take for example IE6, Intel, Facebook, IBM, and now Google.
They have everything they need to keep things from going off the rails, management however has a tendency to delusionally assume their ship is so unsinkable that they're not even manning their stations.
It becomes Clayton Christensen-esque: they're dismissive of the competition as not being real threats, and they don't realize their cash cow is running on only fumes and inertia until it's too late.
I’m not sure Facebook fits in considering they at least managed to get some other products along the way, and may get more.
I certainly don’t think Google fits the bill. Google is failing because they let their cash cow ruin everything else, not because they let it stagnate while they chased the next moonshot. Google Cloud could have easily been competitive with AWS and Azure in European Enterprise, but it’s not even considered an option because Google Advertising wouldn’t let it exist without data harvesting. Google had Office365 long before Microsoft took every organisation online. But Google failed to sell it because… well…
It’s very typical MBA though. Google has killed profitable products because those products weren’t growing enough. A silly metric really, but one which isn’t surprising when your CEO is a former McKinsey.
It couldn’t happen to a nicer company though, and at least it won’t kill people unlike Boeing.
It is interesting to go back a decade and read the articles about Google's moonshots from then. There was even more "we're building the future!" hype than what Elon Musk or Sam Altman are currently pushing.
I'm not familiar with what you're referring to here. Happen to have a link?
I'm wondering if they've seen the writing on the wall for a long time, and if diversification is going to save the day. I don't know that I've ever clicked on an ad in Google search results, but I happily pay them for a family account for Youtube and 2TB of drive storage, plus, you know, a flagship phone every couple years. :-)
Deleted Comment
E.g. for me, how much Google (and silicon valley in general) have enabled twisted ideologies to flourish. All in search of ad-dollars by virtue of eyeballs on screens, at the detriment of everything.
Considering the value of time, past consumer surplus is especially valuable now.
Sure, there are systematic flaws causing SEO to ruin the information provided: but it isn't clear what Google can do to fight the emergent system.
I'm not sure that Bing/DDG are any better.
I use search (DDG web, Google/Apple Maps, YouTube) all the time, and I am regularly given results that are extremely valuable to me (and mostly only cost me a small amount of my time, e.g. YouTube adverts). Blaming SEO on Google seems thoughtless to me. Google appears to be as much a victim of human cybersystems as we are.
I know this is a predominantly white, upper middle class site (read: progressive), and people from that demo sure love to point out that "you can't say the leg is broken just because the foot is dangling at 90 degrees, you aren't a doctor!" ..... but at some point you surely have to start to trust your own eyes and use critical thinking, right??
It's not a conspiracy theory that Google ruined their search to serve more ads. It's so obvious, and has been happening for such a long period of time, that anyone above a certain age and with a decent memory can see it. You don't have to "believe the reports"; you don't even have to be smart - just use your eyes and your memory.
The problem I see with search is that the input is deeply hostile to what the consumers of search want. If the LLMs are particularly tuned to try to filter out that hostility, maybe I can see this going somewhere, but I suspect that just starts another arms race that the garbage producers are likely to win.
Currently I do find that Perplexity works substantially better than Google for finding what I need, but it remains to be seen whether it can stay useful as a larger and larger portion of online content becomes AI-generated garbage.
Wasn't Google AI surfacing results about making pizza with glue and eating rocks? How is that not harmful garbage?
There is no way to SEO the entire corpus of human knowledge. ChatGPT is very good for gleaning facts that are hard to surface in today's garbage search engines.
Why would you assume that?
We don’t have a way of finding objective information, why would we be able to train a model to do so?
This time it was, "Did Paul Edwin Zimmer write a fourth Dark Border novel?" (Real answer: Yes, Ingulf the Mad. You can find the answer on his Wikipedia page.[1])
ChatGPT's[2] answer: "Yes, Paul Edwin Zimmer wrote a fourth novel in the Dark Border series titled "The Dark Border." This book was published after the original trilogy, which included "The Dark Border," "The Gilded Age," and "The Silver Sphere." If you're interested in the themes or plot, let me know!" (Note: these are not the titles of the 2nd and 3rd novels in the series. Also, it gave me the same name for the putative 1st and 4th books.)
Pure hallucination.
1. https://en.wikipedia.org/wiki/Paul_Edwin_Zimmer 2. https://chatgpt.com/
---------------
4o:
Yes, Paul Edwin Zimmer wrote a fourth novel in his Dark Border series titled Ingulf the Mad, published in 1989. This installment focuses on the characters Ingulf Mac Fingold and Carrol Mac Lir, detailing their meeting and acquisition of mystical swords. Notably, Istvan Divega, the protagonist of the earlier books, does not appear in this novel.
---------------
4o-mini:
Yes, Paul Edwin Zimmer authored a fourth novel in his Dark Border series titled Ingulf the Mad. Published in 1989, this book shifts focus from the previous protagonist, Istvan DiVega, to explore the backstory of Ingulf Mac Fingold and Carrol Mac Lir, detailing their initial meeting and the acquisition of their mystical swords.
The complete Dark Border series consists of four novels:
1. The Lost Prince (1982)
2. King Chondos' Ride (1983)
3. A Gathering of Heroes (1987)
4. Ingulf the Mad (1989)
These works delve into a world where the Hasturs' power is crucial in containing dark creatures, and the narrative unfolds through various characters' perspectives.
I'm on the other side of this fence to you. I agree that the conclusion here is that "it is flaky." Disagree about what that means.
As LLMs progress, 10% accuracy becomes 50% accuracy. That becomes 80% accuracy and from there to usable accuracy... whatever that is per case. Not every "better than random" seed grows into high accuracy features, but many do. It's never clear where accuracy ceilings are, and high reliability applications may be distant but... sufficient accuracy is not necessarily very high for many applications.
Meanwhile, the "accurate-for-me" fix is usually to use the appropriate model, prompt or such. Well... these are exactly the kind of optimizations that can be implemented in a UI like "LLM search."
I'm expecting "LLMs eat search." They don't have to "solve truth." They just have to be better and faster than search, with fewer ads.
LLMs do run with variable output, sure, but it's particularly odd if GP used the search product, since in that case it doesn't have to pull the facts from the model itself. If GP had posted the link to the actual chat rather than a link to chatgpt.com (???), I'd be interested in seeing whether Search was even used, as that would at least explain where such variance in output came from. Instead we're all talking about what may or may not have happened.
I really think this latest release is a game changer for ChatGPT since it seems much more likely to return genuine information than ChatGPT answering from its model alone. Of course it still hallucinates sometimes (I asked about searching tabs in Firefox Mobile and it told me the wrong place to find that ability while citing a bunch of Mozilla help docs), but it's much easier to verify that by clicking through to sources directly.
It feels like a very different experience using ChatGPT with search turned on and the "Citations" right side bar left open. I get answers from ChatGPT while also seeing a bunch of possibly relevant links populate. If I detect something's off I can click on a source and read the details directly. It's a huge improvement on relying on the model alone.
I had to refresh my knowledge by visiting fandom websites to review the episode it selected as the answer, since ChatGPT's tendency to mix things up and provide entirely made-up episodes makes it hard to use for bar trivia (and makes me doubt myself too). The same goes for other TV series such as House MD and Scrubs.
I usually seed conversations with several fact-finding prompts before asking the real question I'm after. That populates the chat history with context and pre-established facts, so the real question starts from a much more refined position.
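For what it's worth, this workflow maps directly onto the chat completions API: each turn gets appended to the message history, so the earlier fact-finding answers become context the model must stay consistent with when you ask the real question. A minimal sketch in Python, assuming the OpenAI SDK (the model name and prompts are illustrative):

```python
# Seed the chat with fact-finding turns, then ask the real question.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
history = []

def ask(question: str) -> str:
    """Send a question along with all prior turns as context."""
    history.append({"role": "user", "content": question})
    reply = client.chat.completions.create(
        model="gpt-4o",  # illustrative
        messages=history,
    ).choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply

# Fact-finding prompts establish the context first...
ask("List the Dark Border novels Paul Edwin Zimmer published, with years.")
ask("Briefly summarize the protagonist of each one.")

# ...then the real question builds from that refined position.
print(ask("Given the above, which novel introduces Ingulf Mac Fingold?"))
```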
No, Paul Edwin Zimmer did not write a fourth novel in the Dark Border series. The trilogy consists of "The Dark Border," "The Dark Border: The Return," and "The Dark Border: The Reckoning." After these, he focused on other projects and did not continue the series.
Yes, Paul Edwin Zimmer wrote a fourth novel in his Dark Border series titled Ingulf the Mad, published in 1989. This installment focuses on the characters Ingulf Mac Fingold and Carrol Mac Lir, detailing their meeting and the acquisition of their mystical swords. Notably, Istvan Divega, the protagonist of the earlier books, does not appear in this novel.
> As of my knowledge cutoff in October 2023, Paul Edwin Zimmer did not publish a fourth novel in the Dark Border series. The series comprises three books:
>
> 1. The Lost Prince (1982)
> 2. King Chondos' Ride (1982)
> 3. A Gathering of Heroes (1987)
>
> Paul Edwin Zimmer had plans to continue the series, but he passed away in 1997 before any additional novels were completed or published. There have been no posthumous releases of a fourth Dark Border novel. If there have been developments after October 2023, I recommend checking recent publications or official announcements for the most up-to-date information.
You can use them as a starting point though.
I also search on Google and find "hallucinations": pages with false content that rank better than the real information. With Google, the current issue is ranking, and forgetting information that was previously available.
DuckAssist Result: The fourth novel in Paul Edwin Zimmer's Dark Border series is titled "Ingulf the Mad." This book focuses on characters Ingulf Mac Fingold and Carrol Mac Lir, detailing their meeting and the acquisition of their mystic swords, while the main character from the earlier novels, Istvan Divega, does not appear.
(With Wikipedia cited as the source)
Sharing the actual chat link is more useful: AI responses are non-deterministic by nature, so reproducing your interaction is very difficult.
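Relatedly, if you're calling the API instead of the web UI, there are at least partial knobs for this: temperature and a best-effort seed. A small sketch, assuming the OpenAI Python SDK; note that even with these set, reproducibility isn't guaranteed across backend changes:

```python
# Reduce (not eliminate) run-to-run variance: temperature=0 plus a fixed seed.
# The API documents `seed` as best-effort; backend updates can still change output.
from openai import OpenAI

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o",  # illustrative
    messages=[{"role": "user",
               "content": "Did Paul Edwin Zimmer write a fourth Dark Border novel?"}],
    temperature=0,
    seed=42,
)
print(resp.choices[0].message.content)
print(resp.system_fingerprint)  # changes whenever the serving config changes
```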
Does anyone else have good search skills but mix traditional search engines with LLMs anyway? Why?
I use LLMs every day but wouldn't trust one to perform searches for me yet. I feel like you have to type more for a result that's slower and wordier, one that might stop early once it has amassed what it thinks are answers from low-effort SEO farms.
LLMs, on the other hand (free ChatGPT is the only one I've used for this, not sure which models), give me an opportunity to describe in detail what I'm looking for, and I can provide extra context if the LLM doesn't immediately give me an answer. Given LLMs' propensity for hallucinations, I don't take their answers as solid truth, but I'll use the keywords, terms, and phrases in what they give me to leverage traditional search engines and find a more authoritative source of information.
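To make that division of labour concrete, here's a rough sketch of the loop in Python, assuming the OpenAI SDK; the model name is illustrative, and the Google URL is just the standard q= query format:

```python
# "LLM for vocabulary, search engine for sources": describe the thing in
# plain words, get back properly-termed queries, then search those.
import webbrowser
from urllib.parse import quote_plus
from openai import OpenAI

client = OpenAI()

description = (
    "I'm looking for the name of the visual artifact where thin parallel "
    "lines in an image shimmer into wavy patterns when the image is resized."
)

reply = client.chat.completions.create(
    model="gpt-4o",  # illustrative
    messages=[{
        "role": "user",
        "content": description + "\n\nGive me three short search-engine "
                   "queries using the correct technical terminology, one per line.",
    }],
).choices[0].message.content

# Hand the LLM's terminology to a traditional search engine to locate an
# authoritative source (the term it should surface here is "moire pattern").
for query in reply.splitlines():
    if query.strip():
        webbrowser.open("https://www.google.com/search?q=" + quote_plus(query))
```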
---
Separately, I'll also use LLMs to search for what I suspect is obscure-enough knowledge that it would prove difficult to wade through more popular sites in traditional search engine results pages.
For me this is typically a multi-step process. The results of a first search give me more ideas of terms to search for, and after some iteration I usually find the right terms. It’s a bit of an art to search for content that maybe isn’t your end goal, but will help you search for what you actually seek.
LLMs can be useful for that first step, but I always revert to Google for the final search.
Also, Google Verbatim search is essential.
Things that were previously "log a Jira ticket and think about it when I have a full uninterrupted day" can now be approached with half an hour to spare. This is a game changer, because "have a full day uninterrupted" almost never happens.
It's like having a very senior coworker who knows a lot of stuff and booking a 30m meeting to brainstorm with them and quickly reject useless paths vs dig more into promising ones, vs. sitting all day researching on your own.
The ideas simply flow much faster with this approach.
I use it to get a high level familiarity with what's likely possible vs what's not, and then confirm with normal search.
I also use LLMs for non-work things, like getting a high-level understanding of taxation, inheritance, and similar laws in the country I moved to, as a starting point for further research.
I agree that you can't TRUST them, but half the links regular search turns up are also garbage, so that's not really worse, per se.
Oddly, Microsoft recently changed the search version of Copilot to remove all the links to source material. Now it's like talking to an annoying growth-stage-startup middle manager in every way, including the inability to back up their assertions and a propensity to use phrases like "anyway, let's try to keep things moving".
Happy to see this feature set added into ChatGPT – particularly when I'm looking for academic research in/on a subject I'm not familiar with.
The main hard part of searching isn’t formulating queries to write in the Google search bar, it’s clicking on links, and reading/skimming until you find the specific answer you want.
Getting one sentence direct answers is a much superior UX compared to getting 10 links you have to read through yourself.
Google does offer an AI summary for factual searches, and I ignore it as it often hallucinates. Perplexity has the same problem. OpenAI would need to solve that for this to be truly useful.
If we assume that people want a 'direct answer', then of course a direct answer is better. But maybe some of us don't want a 'direct answer'? I want to know who's saying what, and in which context, so I can draw my own conclusions.
- I count calories... I always eat out, at somewhat healthy chains (Cava, Chipotle, etc.). I tell GPT (via voice, while driving to eat or after eating) what I've eaten for the first half of the day at those places, and then again later for dinner. It calculates a calorie estimate for the first half of the day, then the remainder at dinner. I have checked whether GPT is getting the right calorie counts for items from those chains' websites, and it has.
- I have hiking friends who live an hour or two away. Once a month we meet up and hike somewhere new, within an hour's drive or less for everyone. GPT suggests such hikes quickly (this used to take many Google searches). The drives to these new hikes GPT has suggested have always been under an hour.
So far the information in those examples has been accurate. I always enjoy hearing how others use LLMs... what research are you getting done in one or two queries that used to take MANY Google searches?
Case in point: Visual Basic for Applications (the Excel macro language). This language has a broad pool of reference material and of Stack Overflow answers. It doesn't have a lot of good explicatory material, because the early-2000s Internet material is aging out, being deleted as people retire or lose interest, etc.
(To be frank, Microsoft would like nothing more than to kill this off completely, but VBA exists and is insanely more powerful than the current alternatives, so it lives on.)
1) I know a little bit about something, but I need to be able to look up the knowledge tree for more context: `What are the opposing viewpoints to Adam Smith's thesis on economics?` `Describe the different categories of compilers.`
2) I have a very specific search in mind but it's in a domain that has a lot of specific terminology that doesn't surface easily in a google search unless you use that specific terminology: `Name the different kinds of music chords and explain each one.`
LLMs are great when a search engine would only surface knowledge that's either too general or too specific and the search engine can't tell the semantic difference between the two.
Sometimes when I'm searching I need to be able to search at different levels of understanding to move forward.
> "Can you provide a list of the ten most important recent publications related to high-temperature helium-cooled pebble-bed reactors and the specific characteristics of their graphite pebble fuel which address past problems in fuel disintegration and dust generation?"
These were more focused and relevant results than a Google Scholar keyword-style search.
However, it did rather poorly when asked for direct links to the documentation for a set of Python libraries. Gave some junk links or just failed entirely in 3/4 of the cases.
But I'm not keeping my hopes up, I doubt the model has been explicitly fine-tuned to double check its embedded knowledge of these types of facts, and conversely it probably hasn't even been successfully fine-tuned to only search when it truly doesn't know something (i.e. it will probably search in cases where it could've just answered without the search). At least the behavior I'm seeing now from some 15 minutes of testing indicates this, but time will tell.
I consider myself quite anti-LLM-hype, and I have to admit it has been working amazingly well for me.
I feel like there is a mental architecture to searching, where you try to isolate aspects of what you are searching for that are distinct within the broad category of similar but irrelevant things. That kind of mental model, I would hope, still works well.
For instance consider this query.
"Which clothing outlets on AliExpress are most recommended in forum discussions for providing high quality cloths, favour discussions where there is active engagement between multiple people."
OpenAI search produces a list of candidate stores from this query. Are the results any good? It's going to be quite hard to tell for a while. I know searching for information like this on Google is close to worthless due to SEO pollution.
It's possible that we get at least a brief golden age of search, where the rules have changed sufficiently that attempts to game the system are mitigated. It will be a hard-fought battle to see whether AI search can filter out people trying to game AI search.
I think we will need laws to say AI advice should be subject to similar constraints as legal, medical, and financial advice where there is an obligation to act in the interests of the person being advised. I don't want to have AI search delivering the results of the highest bidder.
When it starts with this you KNOW it's going to be maximum bad faith horsefuckery in the rest of the "question."
Which is also exactly something a bad-faith commenter would say, but if I lose either way, I'd rather just ask the question ¯\_(ツ)_/¯
My search skills are good, but either way that requires 3+ searches, plus visiting the menu of each restaurant and checking their hours and reservations. It remains to be seen whether ChatGPT search is consistently good at this, though.
This becomes even better if the information you want lives in multiple different places. The canonical question for that used to be "what was the phase of the moon when John Lennon was shot?". There didn't use to be an answer to this in Google, but the AI search was able to break it down: find the date John Lennon was shot (easily available on Google), find the moon phase on that day (again, easily available on Google), and put them together to produce the new answer.
For a more tech-relevant example: "what is the smallest AWS EC2 instance I can run a Tomcat server in?"
You 100% can get this information yourself. It just takes much more time than having an AI do it.
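As an aside, the moon-phase half of that composite question is plain arithmetic once you have the date, which makes it a handy sanity check on whatever an AI hands back. A back-of-the-envelope sketch, assuming the mean synodic month and one well-known new-moon epoch:

```python
# Rough moon-phase estimate from a reference new moon and the mean synodic
# month (29.53059 days). Accurate to within a day or so; fine as a sanity check.
from datetime import datetime

SYNODIC_MONTH = 29.53059                       # mean lunar cycle, in days
EPOCH_NEW_MOON = datetime(2000, 1, 6, 18, 14)  # a known new moon (UTC)

def moon_age_days(when: datetime) -> float:
    """Days elapsed since the most recent new moon at `when` (UTC)."""
    delta_days = (when - EPOCH_NEW_MOON).total_seconds() / 86400.0
    return delta_days % SYNODIC_MONTH  # Python's % keeps this positive

# John Lennon was shot late on 1980-12-08 EST, i.e. early 1980-12-09 UTC.
age = moon_age_days(datetime(1980, 12, 9, 4, 0))
print(f"Moon age: {age:.1f} days")  # ~1.6 days: a thin waxing crescent
```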
The LLMs are nice because they are not yet enshittified to the point of uselessness.
If you don't like that (like I do), you can also manually add it under Site Search using
https://chatgpt.com/?q=%s&hints=search
I can definitely see this new search feature being useful though. The old one was already useful because (if you asked) you could have it visit each result and pull some data out for you and integrate it all together, faster than you could do the same manually.
It's often hobbled by robots.txt forbidding it to visit pages, though. What I really want is for it to use my browser to visit the pages instead of doing it server side, so it can use my logged-in accounts and ignore robots.txt.
They really had the potential to do something interesting, but were just focused on their ad metrics with the "good enough" search box. What have they been doing all this time?
[0] https://static1.makeuseofimages.com/wordpress/wp-content/upl...
https://www.newyorker.com/news/amy-davidson/tech-companies-s...
And, yet, aside from Aramco, they are the most profitable companies in the history of the world.
What does this mean? Like, I work there and I’d be pretty annoyed if they stopped turning a profit as a collapse of the stock price would affect my compensation.
It’s interesting to hear this take because I’m used to hearing the opposite: that Google is too focused on increasing short-term profit at the expense of product quality.
If they're collecting data it doesn't even work; I make no effort to hide from them and none of their ads are targeted to me. Meta, though, they're good at it.
For example `8 hours ago: "autohotkey hotkeys"` with 4 links to pages which I visited while searching.
But this is a Chrome feature, not a Google Search feature. https://myactivity.google.com/myactivity does (sometimes? can't see it right now) have a grouping feature of all the searches made, but this is more of a search log than a search management feature.
So chrome://history/grouped is the closest to what I mean, but I can't pin or manage these history groups, or enrich them with comments or even files, like PDFs, which could then get stored in Google Drive as well as indexed for better searches.
I'm sure they will tackle (or are tackling) this at the model level: train models to generate good completions while also embedding text with good performance at separating generated text from human text.
The issue is with the query itself. You're assuming that there's some oracle that will understand your question and surface the relevant information for you. Most likely, it will use the words themselves as part of the query, which SEO sites will exploit.
A more pragmatic search workflow would be to just search for "most common programming languages used" [0], then use the Wikipedia page to get the relevant information [1]. Much more legwork, but with sources. And still quite fast.
[0]: (Screenshot) https://ibb.co/ggBLy8G
[1]: (Screenshot) https://ibb.co/H4g5bDf
Is this a real question you needed an answer to, or a hypothetical you posed to test the quality of search results?
Of course you're going to get listicles for a query like that, because it sounds like a query specifically chosen to find low-quality listicles.
I got:
This is pretty bad. (???!!! added by me) However, my follow-up query "Provide primary web page for each language listed above" was quite decent:
Here are the primary websites for the programming languages mentioned:
The problem was with the 3rd query, "Provide latest version for each language as mentioned on the primary website for that language." That brought back the first result basically unchanged.
So certainly this is a work in progress but very promising.
I'd assume right now the SEO target is still mainly Google rather than ChatGPT, but that's only an "I reckon", not a citation.
If and when ChatGPT does become the main target for SEO spam, then Googling may start giving good results again.
As of October 31, 2024, the latest version of Java is Java 23, released on September 17, 2024. The most recent Long-Term Support (LTS) version is Java 21, released on September 19, 2023.
Which all seems correct and accurate.