Readit News
patrickhogan1 commented on Making LLMs Cheaper and Better via Performance-Efficiency Optimized Routing   arxiv.org/abs/2508.12631... · Posted by u/omarsar
cubefox · 2 days ago
Based on my experience, the GPT-5 router either isn't very smart or is deliberately configured to be very stingy. It basically never uses the reasoning model by itself, even if that means it hallucinates nonsense.
patrickhogan1 · a day ago
Same experience as you.
patrickhogan1 commented on Weaponizing image scaling against production AI systems   blog.trailofbits.com/2025... · Posted by u/tatersolid
patrickhogan1 · 3 days ago
This issue arises only when permission settings are loose. But the trend is toward more agentic systems that often require looser permissions to function.

For example, imagine a humanoid robot whose job is to bring in packages from your front door. Vision functionality is required to gather the package. If someone leaves a package with an image taped to it containing a prompt injection, the robot could be tricked into gathering valuables from inside the house and throwing them out the window.

Good post. Securing these systems against prompt injections is something we urgently need to solve.
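The image-scaling attack described in the post works because resampling keeps only a fraction of the original pixels. A minimal pure-Python sketch (my illustration, not Trail of Bits' code) of why an attacker who controls just the sampled pixels controls the downscaled image a model actually sees:

```python
# Nearest-neighbor downscaling keeps every `factor`-th pixel, so a
# payload hidden in exactly those pixels dominates the thumbnail
# while looking like sparse noise at full resolution.

def downscale_nearest(img, factor):
    """Keep every `factor`-th pixel in each dimension."""
    return [row[::factor] for row in img[::factor]]

# An 8x8 all-black "image"; the attacker sets only the 4 pixels
# (out of 64) that survive 4x downscaling.
full = [[0] * 8 for _ in range(8)]
for y in range(0, 8, 4):
    for x in range(0, 8, 4):
        full[y][x] = 255  # hidden payload pixel

small = downscale_nearest(full, 4)
print(small)  # [[255, 255], [255, 255]] -- the payload fills the thumbnail
```

Real attacks target the specific resampling filter (bilinear, bicubic, etc.) the pipeline uses, but the principle is the same: the model sees the small image, the human sees the large one.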


patrickhogan1 commented on Closer to the Metal: Leaving Playwright for CDP   browser-use.com/posts/pla... · Posted by u/gregpr07
hugs · 4 days ago
speaking of priors... sauce labs existed for three whole years before browserstack (selenium and sauce founder here. :-)

i like that there are new startups in the space, though. things were getting pretty stale and uninspired.

patrickhogan1 · 4 days ago
Sauce Labs is excellent. I've actually used it extensively myself (not sure why BrowserStack came to mind first). I remember Sauce Labs was super active in the SF Selenium community and the Selenium meetups. Just checked my emails. Good memories.

Thank you for building Selenium.

patrickhogan1 commented on Closer to the Metal: Leaving Playwright for CDP   browser-use.com/posts/pla... · Posted by u/gregpr07
nikisweeting · 4 days ago
It was, but I feel like the advent of headless browsers marked a step-function explosion in browser automation. Also, anything earlier than 2010 is when I was about 13, so it's more "the dark ages in my own memory" than "objectively dark ages in automation history".
patrickhogan1 · 4 days ago
I get that drawing historical boundaries is arbitrary, but Selenium is a really good prior.

Selenium offered headless mode and integrated with third-party providers like BrowserStack, which ran acceptance tests in parallel in the cloud. What browser-use.com is doing seems like a modern-day version of that, with many more features and greater adaptability.
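The parallel cloud model mentioned above can be sketched with stdlib concurrency alone; the `run_acceptance_test` stub and test names here are hypothetical, standing in for code that would drive a remote browser session:

```python
# Hedged sketch of running acceptance tests in parallel, each against
# its own remote browser session (as cloud grids like BrowserStack or
# Sauce Labs do). The stub below fakes the browser work.
from concurrent.futures import ThreadPoolExecutor

def run_acceptance_test(test_name):
    # In a real setup this would open a remote WebDriver session
    # against a cloud grid and drive the browser through the flow.
    return (test_name, "passed")

tests = ["login_flow", "checkout_flow", "search_flow"]
with ThreadPoolExecutor(max_workers=3) as pool:
    # map preserves input order, so results line up with `tests`.
    results = list(pool.map(run_acceptance_test, tests))

print(results)
```

Threads suit this because the real work is network-bound waiting on remote browsers, not local CPU.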

patrickhogan1 commented on Closer to the Metal: Leaving Playwright for CDP   browser-use.com/posts/pla... · Posted by u/gregpr07
patrickhogan1 · 4 days ago
Selenium was very usable before 2011.

This post is like writing about Grafana without mentioning Nagios.

patrickhogan1 commented on Obsidian Bases   help.obsidian.md/bases... · Posted by u/twapi
patrickhogan1 · 6 days ago
Obsidian is amazing
patrickhogan1 commented on ArchiveTeam has finished archiving all goo.gl short links   tracker.archiveteam.org/g... · Posted by u/pentagrama
lyu07282 · 6 days ago
> This would mean there is an "official" source of all web data. LLM people can use snapshots of this

that already exists, it's called CommonCrawl:

https://commoncrawl.org/

patrickhogan1 · 6 days ago
Common Crawl, while a massive dataset of the web, does not represent the entirety of the web.

It’s smaller than Google’s index, and Google does not represent the entirety of the web either.

For LLM training purposes this may or may not matter, since it does cover a large share of the web. It’s hard to prove scientifically whether the additional data would train a better model, because no one (AFAIK), not Google, not Common Crawl, not Facebook, not the Internet Archive, has a copy of the entire currently accessible web (let alone dead links). Using Google-fu, I’m often surprised at how many pages I know exist, even from famous authors, that just don’t appear in Google’s index, Common Crawl, or IA.

patrickhogan1 commented on Model intelligence is no longer the constraint for automation   latentintent.substack.com... · Posted by u/drivian
themanmaran · 7 days ago
Sure, concrete example. We do conversational AI for banks, and spend a lot of time on the compliance side. Biggest thing is we don't want the LLM to ever give back an answer that could violate something like ECOA.

So every message that gets generated by the first LLM is then passed to a second series of LLM requests + a distilled version of the legislation. ex: "Does this message imply likelihood of credit approval (True/False)". Then we can score the original LLM response based on that rubric.

All of the compliance checks are very standardized, and have very little reasoning requirements, since they can mostly be distilled into a series of ~20 booleans.
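The two-pass pipeline described above can be sketched as a rubric of boolean questions scored against each candidate reply. Everything here is illustrative: the check wording and the `ask_llm` stub are mine, not the commenter's actual rubric or API.

```python
# Hedged sketch: a first LLM drafts a reply, then a second series of
# LLM calls answers True/False rubric questions about it (e.g. ECOA
# exposure). The reply only ships if no check flags it.

COMPLIANCE_CHECKS = [
    "Does this message imply likelihood of credit approval?",
    "Does this message reference a protected characteristic?",
    # ... roughly 20 such boolean questions in practice
]

def ask_llm(question, message):
    # Stand-in for a second-pass LLM call (question + distilled
    # legislation + message) that returns True/False.
    return False  # a compliant message trips no checks in this stub

def score_response(message):
    """Return (passes, per-check results) for a candidate reply."""
    results = {q: ask_llm(q, message) for q in COMPLIANCE_CHECKS}
    passes = not any(results.values())
    return passes, results

ok, detail = score_response("Thanks! An agent will review your application.")
print(ok)  # True when no check flags the message
```

Keeping each check a single boolean, as the comment notes, means the second pass needs little reasoning and the score is easy to audit.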

patrickhogan1 · 6 days ago
Thank you! Great example!

u/patrickhogan1

Karma: 941 · Cake day: October 3, 2018