maximumgeek · 9 months ago
This is just an ad without any context?

Search could be better? Yes, yes it could.

I search for words, can even indicate I want search results with a keyword included, and it still gets ignored. And then I have to sift through what is a search result and what is an ad.

And if I get another quora answer....

But this post? It was a waste. We do some hand-wavy stuff, come try us.

mfkhalil · 9 months ago
Fair point, I probably should have provided more context in the post.

MatterRank uses LLMs to rank pages based on criteria you provide it with, not SEO tricks. It’s not meant to replace Google, but helps when you're looking for something specific and don't want to wade through tons of results that you don't care about. Still early, but useful for deeper searches.
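For the curious, the core idea can be sketched in a few lines. Everything here is illustrative, not MatterRank's actual code: the `llm_score` stub stands in for a real LLM call that would judge a page against a criterion.

```python
# Sketch of criteria-based reranking. `llm_score` is a stand-in for an LLM
# call that would return a 0-1 judgment of how well a page meets a criterion.
def llm_score(page_text, criterion):
    # Placeholder logic: a real implementation would prompt an LLM.
    return 1.0 if criterion.lower() in page_text.lower() else 0.0

def rerank(pages, criteria):
    """Order pages by how many user-supplied criteria they satisfy."""
    def total(page):
        return sum(llm_score(page["text"], c) for c in criteria)
    return sorted(pages, key=total, reverse=True)

pages = [
    {"url": "a.example", "text": "A review of the movie, quoting key lines."},
    {"url": "b.example", "text": "An SEO listicle about movies."},
]
print(rerank(pages, ["quoting key lines"])[0]["url"])  # a.example
```

The point is that the ranking signal is whatever criteria the user writes, applied to page content, rather than link graphs or SEO metadata.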

luke-stanley · 9 months ago
Transformer models like BERT have been lurking in Google search for a few years now (and non-transformer language models before that). The distinction between LLM and chatbot is pretty thin. Dismissing "chatbots with web access" when you are actually using LLMs is not a very clear or useful differentiation, even if the way you use an LLM really is different. More end-user control over results is a good thing, but there is an opportunity to engage much more clearly.
lyu07282 · 9 months ago
But often the problem "is" Google: garbage in, garbage out.
cyanydeez · 9 months ago
Search could be better if it was a paid service and the content got paid.

SteveDavis88 · 9 months ago
I want a search engine that only returns results containing words I specify. Is that asking too much? Google is not that search engine.
joe5150 · 9 months ago
I think google several years ago had gotten very good at matching on related concepts, but it just fell off a cliff after that.
bobek · 9 months ago
Maybe try Kagi. I am pretty happy with the results.
saltysalt · 9 months ago
I felt the same frustration: I just want keyword matching without any filtering. I'm building https://greppr.org/ to scratch that itch.
al_borland · 9 months ago
Isn’t that what quotes do?
nixpulvis · 9 months ago
I feel stupid for asking, but do quotes even do anything anymore? I feel like I try them and it just gives me the same results.
pseudalopex · 9 months ago
No. But it's what verbatim mode does.
genewitch · 9 months ago
remember when "Human speech" -robots -alien worked? those were the days. I guess there's just too much data now to search stuff like that.
esperent · 9 months ago
> I guess there's just too much data now to search stuff like that.

That seems extremely unlikely as the reason.

It's far more likely that some executives looked at the numbers and decided that removing search operators would make people more likely to click on ads, while leaving them in would make people click on the actual results that they were searching for.

new_user_final · 9 months ago
robertlagrant · 9 months ago
That is unbelievably better. The ads are even off to the side!
danpalmer · 9 months ago
> It assumes we don't know what we want.

Does it? I understand there are issues with spam in search, but assuming we don't know what we want is not at all the conclusion I draw from using search engines.

mfkhalil · 9 months ago
Yeah that's fair, "doesn't know what we want" might have been oversimplifying. Better phrasing would have been that there is a very hard limit on the context you're able to give when using a search engine. It's mainly keywords, and then maybe some tricks like `site:` or quotes.
danpalmer · 9 months ago
I think you're right that there's limited context, but I'd still disagree on "doesn't know what you want". I think search engines know what users want within the scope of the context they're able to provide. There are two issues with that: one is that the deeper examples you gave in another reply may be better, and the other is differentiating between legitimate search matches and bad actors trying to match things they shouldn't.

For the former, I'm intrigued but unconvinced that it's what I actually want in a search engine.

For the latter, I imagine that's something that this search engine will need to contend with, although it could "just" be an LLM compute trade-off, where if you give enough results to an LLM to analyse you'll eventually find the good stuff. That said, SEO is going to rapidly become LLMEO and ruin the day again.

mfkhalil · 9 months ago
Credit to @ziftface — I should’ve included more examples in the original post. MatterRank is useful when you want results with specific qualitative traits that go beyond keyword matching. You can ask for stuff like “written by a woman,” “mentions these specific lines from a movie,” or “talks about X/Y/Z but avoids A/B.” Since it reads the full content, not just metadata or SEO signals, it lets you be a lot more precise in ways that traditional search engines just don’t support.
renegat0x0 · 9 months ago
I have been playing with idea of one big SQLite for domains. I can search it relatively fast, find things related to "amiga", "emulator" etc.

https://github.com/rumca-js/Internet-Places-Database

I must admit that this is a difficult task. There are many domains for "hotels" and "casinos", so I have to protect myself against spam, just as Google does.
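A minimal version of that idea, assuming your Python's sqlite3 is built with the FTS5 extension (most builds are); the domain data here is made up for illustration:

```python
import sqlite3

# One big SQLite table of domains, searchable via full-text search (FTS5).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE domains USING fts5(url, title, tags)")
conn.executemany(
    "INSERT INTO domains VALUES (?, ?, ?)",
    [
        ("amigaworld.net", "Amiga World", "amiga community"),
        ("winuae.net", "WinUAE", "amiga emulator"),
        ("example-casino.com", "Best Casino", "casino gambling"),
    ],
)
# FTS5 treats space-separated terms as an implicit AND across all columns.
rows = conn.execute(
    "SELECT url FROM domains WHERE domains MATCH ? ORDER BY rank",
    ("amiga emulator",),
).fetchall()
print(rows)  # [('winuae.net',)]
```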

AymanJabr · 9 months ago
Too many steps. Why do I have to sign up? Why do I have to create an engine?

Remove all of this and just let me use your app directly. I want to search and create engines on the fly.

I don't need to save them for future use if I haven't even used your app once.

If you want this to take off, it needs to just work: no extra steps unless I want them.

janalsncm · 9 months ago
Using an LLM isn’t the worst way to rank, but it’s pretty darn slow. The speed could be improved a lot by just distilling into deep neural nets though.

The results for me were fairly high quality and moderately relevant but I think they could be improved as well.

You get pretty far by just blocking low quality blogspam and Medium, which would be a lot faster and could even be done on the frontend with a chrome plugin.
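The blocklist approach is cheap to sketch. The hosts listed are just placeholders, not a recommendation:

```python
from urllib.parse import urlparse

# Hypothetical blocklist filter: drop results from known low-quality hosts
# before any expensive (e.g. LLM-based) reranking step.
BLOCKED_HOSTS = {"medium.com", "quora.com"}  # illustrative, not exhaustive

def filter_results(urls):
    def host(u):
        h = urlparse(u).hostname or ""
        return h.removeprefix("www.")
    return [u for u in urls if host(u) not in BLOCKED_HOSTS]

print(filter_results([
    "https://medium.com/some-listicle",
    "https://lwn.net/Articles/123/",
]))  # ['https://lwn.net/Articles/123/']
```

The same logic ports straightforwardly to a browser extension that hides matching results client-side.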

mfkhalil · 9 months ago
Yeah, LLMs were the easiest way to get a proof of concept running, but replacing them with a specialized distilled model/classifier should hopefully make it way quicker.

As for the results, it's tough because we've made the deliberate decision to have no control over the reranking. What that means is that if your criterion is "written by a woman", for instance, then any result that meets it will be ranked equally at the top. In all engines I've built for myself, I have a relevance criterion that's weighted relative to how much I care that the result is exactly what I'm looking for. It's probably important to make that clearer to the end user.
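A weighted relevance criterion like the one described might combine scores roughly like this (the weights and per-criterion scores are invented for illustration; in practice the scores would come from the LLM):

```python
# Sketch of weighted criteria: each criterion gets a weight, so a "relevance"
# criterion can dominate binary filters like "written by a woman" and break
# ties among results that all pass the binary checks.
def combined_score(scores, weights):
    """Weighted sum of per-criterion scores (each assumed in [0, 1])."""
    return sum(scores[name] * w for name, w in weights.items())

weights = {"relevance": 3.0, "written_by_a_woman": 1.0}
a = combined_score({"relevance": 0.9, "written_by_a_woman": 1.0}, weights)
b = combined_score({"relevance": 0.2, "written_by_a_woman": 1.0}, weights)
print(a > b)  # True: both pass the binary check, relevance breaks the tie
```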

BrenBarn · 9 months ago
I mean, it's not just search that assumes we don't know what we want. A huge amount of technology these days has shifted to telling us what to want rather than letting us obtain what we have independently decided we want.
mfkhalil · 9 months ago
I actually completely agree with this. Search is a good example, but in general the consensus seems to have become that consumers don't know what they want, which is pretty frustrating and probably a product of the success of the TikTok algorithm and similar software.

I'm hoping that as LLMs become more mainstream, more functionality is built into tech that doesn't treat consumers as idiots. This is one stab at it, but there are so many other opportunities imo.

BrenBarn · 9 months ago
How do you think that LLMs will help that?