Readit News
qsort · 7 months ago
This is actually very cool. Not really replacing a browser, but it could enable an alternative way of browsing the web with a combination of deterministic search and prompts. It would probably work even better as a command line tool.

A natural next step could be doing things with multiple "tabs" at once, e.g: tab 1 contains news outlet A's coverage of a story, tab 2 has outlet B's coverage, tab 3 has Wikipedia; summarize and provide references. I guess the problem at that point is whether the underlying model can support this type of workflow, which doesn't really seem to be the case even with SOTA models.

hliyan · 7 months ago
For me, a natural next step would be to turn this into a service -- rather than doing it in the browser, this acts as a proxy, strips away all the crud and serves your browser clean text. No need to install a new browser, just point the browser to the URL via the service.
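A minimal sketch of that proxy idea, using only Python's standard library. The `llm_clean` step is a placeholder for whatever model call the service would actually make; none of this comes from the project itself:

```python
# Sketch of the proxy idea: point your browser at
# http://localhost:8080/?url=<page> and get back stripped-down text.
import re
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

def strip_tags(html: str) -> str:
    # Drop script/style blocks wholesale, then flatten remaining tags.
    html = re.sub(r"(?s)<(script|style).*?</\1>", "", html)
    text = re.sub(r"<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).strip()

def llm_clean(text: str) -> str:
    return text  # placeholder: a real service would call a model here

class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        qs = parse_qs(urlparse(self.path).query)
        target = qs.get("url", [None])[0]
        if not target:
            self.send_error(400, "missing ?url=")
            return
        with urllib.request.urlopen(target) as resp:
            body = llm_clean(strip_tags(resp.read().decode("utf-8", "replace")))
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(body.encode("utf-8"))

# To run: HTTPServer(("localhost", 8080), ProxyHandler).serve_forever()
```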

But if we do it, we have to admit something hilarious: we will soon be using AI to convert text provided by the website creator into elaborate web experiences, which end users will strip away before consuming it in a form very close to what the creator wrote down in the first place (this is already happening with beautifully worded emails that start with "I hope this email finds you well").

npmipg · 7 months ago
working on this as we speak!
TeMPOraL · 7 months ago
> tab 1 contains news outlet A's coverage of a story, tab 2 has outlet B's coverage, tab 3 has Wikipedia; summarize and provide references.

I think this is basically what https://ground.news/ does.

(I'm not affiliated with them; just saw them in the sponsorship section of a Kurzgesagt video the other day and figured they're doing the thing you described +/- UI differences.)

doctoboggan · 7 months ago
I am a ground news subscriber (joined with a Kurzgesagt ref link) and it does work that way (minus the Wikipedia summary). It's pretty good and I particularly like their "blindspot" section showing news that is generally missing from a specific partisan news bubble.
simedw · 7 months ago
Thank you.

I was thinking of showing multiple tabs/views at the same time, but only from the same source.

Maybe we could have one tab with the original content optimised for cli viewing, and another tab just doing fact checking (can ground it with google search or brave). Would be a fun experiment.

myfonj · 7 months ago
Interestingly, the original idea of what we call a "browser" nowadays – the "user agent" – was built on the premise that each user has specific needs and preferences. The user agent was designed to act on their behalf, negotiating data transfers and resolving conflicts between content author and user (content consumer) preferences according to "strengths" and various reconciliation mechanisms.

(The fact that browsers nowadays are usually expected to represent something "pixel-perfect" to everyone with similar devices is utterly against the original intention.)

Yet the original idea was (due to the state of technical possibilities) primarily about design and interactivity. The fact that we now have tools to extend this concept to core language and content processing is… huge.

It seems we're approaching the moment when our individual personal agent, when asked about a new page, will tell us:

    Well, there's nothing new of interest for you, frankly:
    All information presented there was present on pages visited recently.
    -- or --
    You've already learned everything mentioned there. (*)
    Here's a brief summary: …
    (Do you want to dig deeper, see the content verbatim, or anything else?)
Because its "browsing history" will also contain a notion of what we "know" from chats or what we had previously marked as "known".

nextaccountic · 7 months ago
In your cleanup step, after cleaning obvious junk, I think you should do whatever Firefox's reader mode does to further clean up, and if that fails, fall back to the current output. That should reduce the number of tokens you send to the LLM even more.
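For reference, Firefox's reader mode is built on the Readability heuristics, which score DOM nodes by things like link density and text length. A much cruder stdlib-only sketch of the same "strip the obvious junk first" idea (the tag list here is my guess, not what Readability actually does):

```python
# Crude reader-mode-style pass: drop non-content tags before the
# HTML ever reaches the LLM, shrinking the token count.
from html.parser import HTMLParser

SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside", "form"}

class ContentExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # >0 while inside a skipped subtree
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())

def clean(html: str) -> str:
    parser = ContentExtractor()
    parser.feed(html)
    return "\n".join(parser.chunks)
```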

You should also have some way for the LLM to indicate there is no useful output, because perhaps the page is an SPA. That would force you to execute JavaScript to render that particular page, though.

phatskat · 7 months ago
> I was thinking of showing multiple tabs/views at the same time, but only from the same source.

I think the primary reason I use multiple tabs but _especially_ multiple splits is to show content from various sources. Obviously this is different from a terminal context, as I usually have Figma or API docs in one split and the dev server in the other.

Still, being able to have textual content from multiple sources visible or quickly accessible would probably be helpful for a number of users

wrsh07 · 7 months ago
Would really love to see more functionality built into this. Handling post requests, enabling scripting, etc could all be super powerful
baq · 7 months ago
wonder if you can work on the DOM instead of HTML...

almost unrelated, but you can also compare spegel to https://www.brow.sh/

andrepd · 7 months ago
LLMs to generate SEO slop of the most utterly piss-poor quality, then another LLM to lossily "summarise" it back. Brave new world?
bubblyworld · 7 months ago
Classic that the first example is for parsing the goddamn recipe from the goddamn recipe site. Instant thumbs up from me haha, looks like a neat little project.
andrepd · 7 months ago
Which it apparently does by completely changing the recipe in random places including ingredients and amounts thereof. It is _indeed_ a very good microcosm of what LLMs are, just not in the way these comments think.
simedw · 7 months ago
It was actually a bit worse than that: the LLM never got the full recipe due to some truncation logic I had added. So it regurgitated the recipe from its training data, and apparently it couldn't both do that and convert units at the same time with the lite model (it worked with just flash).

I should have caught that, and there are probably other bugs too waiting to be found. That said, it's still a great recipe.

throwawayoldie · 7 months ago
The output was then posted to the Internet for everyone to see, without the minimal amount of proofreading that would be necessary to catch that, which gives us a good microcosm of how LLMs are used.

On a more pleasant topic the original recipe sounds delicious, I may give it a try when the weather cools off a little.

bubblyworld · 7 months ago
What do you mean? The recipes in the screenshot look more or less the same, the formatting has just changed in the Spegel one (which is what was asked for, so no surprises there).

Edit: just saw the author's comment, I think I'm looking at the fixed page

lpribis · 7 months ago
Another great example of LLM hype train re-inventing something that already existed [1] (and was actually thought out) but making it worse and non-deterministic in the worst ways possible.

https://schema.org/Recipe

bubblyworld · 7 months ago
Can we stop with the unprovoked dissing of anyone using LLMs for anything? Or at least start your own thread for it. It's an unpleasant, incredibly boring/predictable standard for discourse (more so than the LLMs themselves lol).
komali2 · 7 months ago
That's a cool schema, but the LLM solution is necessary because recipe website makers will never use the schema because they want you to have to read through garbage, with some misguided belief that this helps their SEO or something. Or maybe they get more money if you scroll through more ads?
soap- · 7 months ago
And that would be great, if anyone used it.

LLMs are specifically good at a task like this because they can extract content from any webpage, regardless of whether it supports some standard that no one implements.

VMG · 7 months ago
The LLM thing actually works. Who cares if it's deterministic. Maybe the same people who come up with arcane schemas that nobody ever uses?

IncreasePosts · 7 months ago
There are extensions that do that for you, in a deterministic way and not relying on LLMs. For example, Recipe Filter for chrome. It just shows a pop up over the page when it loads if it detects a recipe
bubblyworld · 7 months ago
Thanks, I already use that plugin, actually, I just found the problem amusingly familiar. Recipe sites are the original AI slop =P
mromanuk · 7 months ago
I definitely like the LLM in the middle, it’s a nice way to circumvent the SEO machine and how Google has optimized writing in recent years. Removing all the cruft from a recipe is a brilliant case for an LLM. And I suspect more of this is coming: LLMs to filter. I mean, it would be nice to just read the recipe from HTML, but SEO has turned everything into an arms race.
tines · 7 months ago
> Removing all the cruft from a recipe is a brilliant case for an LLM

Is it though, when the LLM might mutate the recipe unpredictably? I can't believe people trust probabilistic software for cases that cannot tolerate error.

kccqzy · 7 months ago
I agree with you in general, but recipes are not a case where precision matters. I sometimes ask LLMs to give me a recipe, and if it hallucinates something it will simply taste bad. Not much different from a human-written recipe where the human has drastically different tastes than I do. Also, you basically never apply a recipe blindly; you have intuition from years of cooking and know to adjust recipes to taste.
joshvm · 7 months ago
There is a well-defined solution to this. Provide your recipes as a Recipe schema: https://schema.org/Recipe

Seems like most of the usual food blog plugins use it, because it allows search engines to report calories and star ratings without having to rely on a fuzzy parser. So while the experience sucks for users, search engines use the structured data to show carousels with overviews, calorie totals and stuff like that.

https://recipecard.io/blog/how-to-add-recipe-structured-data...

https://developers.google.com/search/docs/guides/intro-struc...

EDIT: Sure enough, if you look at the OP's recipe example, the schema is in the source. So for certain examples, you would probably be better off having the LLM identify that it's a recipe website (or other semantic content), extract the schema from the header, and then parse/render it deterministically. This seems like one of those context-dependent things: getting an LLM to turn a bunch of JSON into markdown is fairly reliable. Getting it to extract that from an entire HTML page potentially clutters the context, but you could separate the two and have one agent summarise any of the steps in the blog that might be pertinent.

    {"@context":"https://schema.org/","@type":"Recipe","name":"Slowly Braised Lamb Ragu ...
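As a sketch of that deterministic path (assuming the JSON-LD block is well-formed; the field names come straight from the schema.org Recipe type):

```python
# Pull the schema.org Recipe JSON-LD out of a page and render it
# deterministically -- no LLM needed for pages that ship structured data.
import json
import re

def extract_recipe(html: str):
    # JSON-LD blocks live in <script type="application/ld+json"> tags.
    pattern = r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>'
    for m in re.finditer(pattern, html, re.DOTALL):
        try:
            data = json.loads(m.group(1))
        except json.JSONDecodeError:
            continue
        items = data if isinstance(data, list) else [data]
        for item in items:
            if item.get("@type") == "Recipe":
                return item
    return None

def to_markdown(recipe: dict) -> str:
    lines = [f"# {recipe.get('name', 'Recipe')}", "", "## Ingredients"]
    lines += [f"- {i}" for i in recipe.get("recipeIngredient", [])]
    lines += ["", "## Steps"]
    for n, step in enumerate(recipe.get("recipeInstructions", []), 1):
        # Steps may be plain strings or HowToStep objects with a "text" field.
        text = step.get("text", "") if isinstance(step, dict) else step
        lines.append(f"{n}. {text}")
    return "\n".join(lines)
```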

visarga · 7 months ago
I foresaw this a couple of years ago. We already have web search tools in LLMs, and they are amazing when they chain multiple searches. But Spegel is a completely different take.

I think the ad blocker of the future will be a local LLM, small and efficient. Want to sort your timeline chronologically? Or want a different UI? Want some things removed, and others promoted? Hide low quality comments in a thread? All are possible with LLM in the middle, in either agent or proxy mode.

I bet this will be unpleasant for advertisers.

yellow_lead · 7 months ago
LLM adds cruft, LLM removes cruft, never a miscommunication
hirako2000 · 7 months ago
Do you also like what it costs you to browse the web via an LLM potentially swallowing millions of tokens per minute?
prophesi · 7 months ago
This seems like a suitable job for a small language model. Bit biased since I just read this paper[0]

[0] https://research.nvidia.com/labs/lpr/slm-agents/

treyd · 7 months ago
I wonder if you could use a less sophisticated model (maybe even something based on LSTMs) to walk over the DOM and extract just the chunks that should be emitted and collected into the browsable data structure, but doing it all locally. I feel like it'd be straightforward to generate training data for this, using an LLM-based toolchain like what the author wrote to be used directly.
askonomm · 7 months ago
Unfortunately in the modern web simply walking the DOM doesn't cut it if the website's content loads in with JS. You could only walk the DOM once the JS has loaded, and all the requests it makes have finished, and at that point you're already using a whole browser renderer anyway.
kccqzy · 7 months ago
Yeah but this project doesn't use JS anyway.
leroman · 7 months ago
Cool idea! But kind of wasteful... it just feels wrong to waste energy. At least you could first turn it into markdown with a library that preserves semantic web structure (I authored this: https://github.com/romansky/dom-to-semantic-markdown), saving many tokens = much less energy used.
otabdeveloper4 · 7 months ago
This is exactly the sort of thing that should be running on a local LLM.

Using a big cloud provider for this is madness.

clbrmbr · 7 months ago
Suggestion: add a -p option:

    spegel -p "extract only the product reviews" > REVIEWS.md
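Wiring up a flag like that is straightforward; this argparse sketch shows the suggested shape, though it's purely hypothetical and not Spegel's actual CLI:

```python
# Hypothetical CLI shape for the suggested -p/--prompt option.
# The flag name and default prompt are invented for illustration.
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="spegel")
    parser.add_argument("url", help="page to fetch and rewrite")
    parser.add_argument(
        "-p", "--prompt",
        default="rewrite this page as clean markdown",
        help="one-off prompt overriding the default view",
    )
    return parser
```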

__MatrixMan__ · 7 months ago
It would be cool if it were smart enough to figure out whether it was necessary to rewrite the page on every visit. There's a large chunk of the web where one of us could visit once, rewrite to markdown, and then serve the cleaned-up version to each other without requiring a distinct rebuild on each visit.
myfonj · 7 months ago
Each user has distinct needs and distinct prior knowledge about the topic, so even the "raw" super-clean source form will probably end up adjusted differently for most users.

But yes, having some global shared redundant P2P cache (of the "raw" data), like IPFS (?) could possibly help and save some processing power and help with availability and data preservation.

__MatrixMan__ · 7 months ago
I imagine it sort of like a microscope. For any chunk of data that people bothered to annotate with prompts re: how it should be rendered you'd end up with two or three "lenses" that you could toggle between. Or, if the existing lenses don't do the trick, you could publish your own and, if your immediate peers find them useful, maybe your transitive peers will end up knowing about them as well.
markstos · 7 months ago
Cache headers exist for servers to communicate to clients how long it is safe to cache things. The client could be updated to add a cache layer that respects cache headers.
simedw · 7 months ago
If the goal is to have a more consistent layout on each visit, I think we could save the last page's markdown and send it to the model as a one-shot example...
pmxi · 7 months ago
The author says this is for “personalized views using your own prompts.” Though, I suppose it’s still useful to cache the outputs for the default prompt.
__MatrixMan__ · 7 months ago
Or to cache the output for whatever prompt your peers think is most appropriate for that particular site.
kelsey98765431 · 7 months ago
People here are not realizing that HTML is just the start. If you can render a webpage into a view, you can render any input the model accepts. PDF to this view. Zip file of images to this view. Giant JSON file to this view. Whatever. The view is the product here, not the HTML input.