Readit News
cle commented on Child's Play: Tech's new generation and the end of thinking   harpers.org/archive/2026/... · Posted by u/ramimac
smallmancontrov · 23 days ago
If you cured 100% of all cancer it would only reduce US deaths by 20%. Clearly we should conclude that cancer isn't a problem and isn't worth curing, and also that heart disease and unintentional injuries and so on are also not problems and also not worth trying to fix.
cle · 23 days ago
GP didn't say it's not a problem and not worth fixing. They're claiming this is not a good fix.


cle commented on I’m leaving Redis for SolidQueue   simplethread.com/redis-so... · Posted by u/amalinovic
dns_snek · 2 months ago
The issue is that "83 per second" is multiple orders of magnitude off the expected level of performance on any RDBMS running on anything resembling modern hardware.

I haven't worked with Graphile but this just doesn't pass the sniff test unless those 83 jobs per second are somehow translating into thousands of write transactions per second.

Their documentation has a performance section with a benchmark that claims to process 10k jobs per second on a pretty modest machine, as an indication.

cle · 2 months ago
> The issue is that "83 per second" is multiple orders of magnitude off the expected level of performance on any RDBMS running on anything resembling modern hardware.

This is just not true; there are many scenarios where 83/sec would be the limit. That number by itself is almost meaningless, similar to benchmarks, which also make a bunch of assumptions about workloads and runtime environments.

As a simple example: if your queue has a large backlog, you have a large worker fleet aggressively pulling work to minimize latency, your payloads are large, you have not optimized indexing, and/or you have many jobs scheduled for the future, every acquire can become an expensive table scan.

(This is a specific example because it is one of many failure scenarios I’ve encountered with Graphile that can cause your DB to melt down. The same workload in Redis barely causes a blip in CPU, without having to fiddle with indexes and autovacuuming and worker backoffs.)
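To make the failure mode above concrete, here is a minimal sketch of the job-acquire pattern, using SQLite in Python purely for illustration (Postgres/Graphile would use `SELECT ... FOR UPDATE SKIP LOCKED` instead; the table and column names here are made up, not Graphile's schema). The key point is the `WHERE locked = 0 AND run_at <= now` predicate: without an index covering it, every acquire scans the backlog.

```python
# Hypothetical sketch of a DB-backed job queue's acquire step.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE jobs (
        id INTEGER PRIMARY KEY,
        payload TEXT,
        run_at REAL NOT NULL,        -- epoch seconds; may be far in the future
        locked INTEGER NOT NULL DEFAULT 0
    )
""")
# Without an index covering (locked, run_at), the acquire query below is a
# full table scan; with a large backlog, many future-scheduled jobs, and an
# aggressive worker fleet polling it, that scan runs constantly.
conn.execute("CREATE INDEX jobs_ready ON jobs (locked, run_at)")

conn.executemany(
    "INSERT INTO jobs (payload, run_at) VALUES (?, ?)",
    [("job-%d" % i, 0 if i % 2 else 10**12) for i in range(1000)],
)

def acquire_one(now):
    """Claim the oldest runnable job, or return None if nothing is due."""
    row = conn.execute(
        "SELECT id, payload FROM jobs "
        "WHERE locked = 0 AND run_at <= ? "
        "ORDER BY run_at LIMIT 1",
        (now,),
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE jobs SET locked = 1 WHERE id = ?", (row[0],))
    return row

job = acquire_one(now=1.0)
```

In real Postgres deployments the scan cost also interacts with dead tuples from the constant lock/unlock churn, which is why autovacuum tuning comes up so often in these threads.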

cle commented on Americans Overwhelmingly Support Science, but Some Think the U.S. Is Lagging   scientificamerican.com/ar... · Posted by u/beardyw
ActorNightly · 2 months ago
>Americans Overwhelmingly Support Science

Considering that, on average, 7 in 10 people either voted for Trump or didn't vote (with Trump openly stating that he wants to neuter universities), Americans "think" they support science.

cle · 2 months ago
Americans think "supporting universities" is not necessarily the same as "supporting science."
cle commented on I’m leaving Redis for SolidQueue   simplethread.com/redis-so... · Posted by u/amalinovic
dns_snek · 2 months ago
Facing issues with 83 jobs per second (5k/min) sounds like an extreme misconfiguration. That's not high throughput at all and it shouldn't create any appreciable load on any database.
cle · 2 months ago
This comes up every time this conversation occurs.

Yes, PG can theoretically handle just about anything with the right configuration, schema, architecture, etc.

Finding that right configuration is not trivial. Even dedicated frameworks like Graphile struggle with it.

My startup had the exact same struggles with PG and did the same migration to BullMQ because we were sick of fiddling with it instead of solving business problems. We are very glad we migrated off of PG for our work queues.

cle commented on HTTPS by default   security.googleblog.com/2... · Posted by u/jhalderm
kevstev · 4 months ago
There are dozens of us, I guess, who care about this kind of thing. I have never really understood the obsession with HTTPS for static content that I don't care if anyone can see I am reading, like a blog post. HTTPS should be for things that matter; everything else can, and I think should, use HTTP when encryption is not necessary.

Depending on yet another third party to provide what is IMHO a luxury should not be required, and I have been continually confused as to why it is being forced down everyone's throat.

cle · 4 months ago
There are good arguments for it, but it's also not a coincidence that they happen to align with Google's business objectives. For example, it's hard to issue a publicly trusted TLS cert without notifying Google of it, since certs are recorded in Certificate Transparency logs.
cle commented on DeepSeek OCR   github.com/deepseek-ai/De... · Posted by u/pierre
pietz · 5 months ago
My impression is that OCR is basically solved at this point.

The OmniAI benchmark that's also referenced here hasn't been updated with new models since February 2025. I assume that's because general-purpose LLMs have gotten better at OCR than their own OCR product.

I've been able to solve a broad range of OCR tasks by simply sending each page as an image to Gemini 2.5 Flash Lite and asking it nicely to extract the content in Markdown under some additional formatting instructions. That will cost you around $0.20 for 1000 pages in batch mode and the results have been great.

I'd be interested to hear where OCR still struggles today.

cle · 5 months ago
That will not work with many of the world's most important documents because of information density. For example, dense tables or tables with lots of row/col spans, complex forms with checkboxes, complex real-world formatting and features like strikethroughs, etc.

To solve this generally you need to chunk not by page, but by semantic chunks that don't exceed the information density threshold of the model, given the task.

This is not a trivial problem at all. And sometimes there is no naive way to chunk documents so that every element can fit within the information density limit. A really simple example is a table that spans hundreds of pages. Solving that generally is an open problem.
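The chunking idea above can be sketched in a few lines. This is an illustrative toy, not anything from the comment: it splits a long table's rows into chunks under a "density" budget, repeating the header in every chunk so each piece is independently readable. A raw character count stands in for a real token/density limit, and it sidesteps the genuinely hard cases the comment mentions (row/col spans that can't be cut, a single row that exceeds the budget).

```python
# Toy density-aware chunker: header + as many rows as fit per chunk.
def chunk_table(header, rows, budget):
    """Split rows into chunks; each chunk repeats the header so that a
    model sees a self-contained table fragment."""
    chunks, current, used = [], [], len(header)
    for row in rows:
        if current and used + len(row) > budget:
            chunks.append("\n".join([header] + current))
            current, used = [], len(header)
        current.append(row)
        used += len(row)
    if current:
        chunks.append("\n".join([header] + current))
    return chunks

header = "id,name,amount"
rows = ["%d,item%d,%d" % (i, i, i * 10) for i in range(100)]
chunks = chunk_table(header, rows, budget=200)
```

The hard open problem is what this toy ignores: choosing split points that preserve semantics (merged cells, running totals, cross-page references) rather than just byte counts.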

cle commented on Launch HN: Extend (YC W23) – Turn your messiest documents into data   extend.ai/... · Posted by u/kbyatnal
serjester · 5 months ago
This is the most confusing pricing page I’ve ever seen - different options have different credit usage and different cost per credit? How many degrees of freedom do you really need to represent API cost?
cle · 5 months ago
> How many degrees of freedom do you really need to represent API cost?

The amount that your users care about.

At a large enough scale, users will care about the cost differences between extraction and classification (very different!) and finding the right spot on the accuracy-latency curve for their use case.


cle commented on MCP overlooks hard-won lessons from distributed systems   julsimon.medium.com/why-m... · Posted by u/yodon
whoknowsidont · 7 months ago
Can you please diagram out, using little text arrows ("->"), what you think is happening so I can just fill in the gap for you?
cle · 7 months ago
I write these as part of my job, I know how they work. I'm not going to spend more time explaining to you (and demonstrating!) what is in the spec. Read the spec and let the authors know that they don't understand what they wrote. I've run out of energy in this conversation.

u/cle

Karma: 5459 · Cake day: March 12, 2012
About
contact me: cle@clehn.com