Readit News
yxhuvud commented on I'm too dumb for Zig's new IO interface   openmymind.net/Im-Too-Dum... · Posted by u/begoon
HumanOstrich · 2 days ago
What's the benchmark for how long something can be pre-1.0? Seems like a nonsense argument.
yxhuvud · 2 days ago
Something can be pre-1.0 as long as there are no stability guarantees.
yxhuvud commented on How we exploited CodeRabbit: From simple PR to RCE and write access on 1M repos   research.kudelskisecurity... · Posted by u/spiridow
KingOfCoders · 5 days ago
Did I misread the article, or did they take the tool config from the PR not the repo?
yxhuvud · 5 days ago
Unfortunately that mostly has to be the case or else the developer experience configuring these would be too bad.
yxhuvud commented on MCP doesn't need tools, it needs code   lucumr.pocoo.org/2025/8/1... · Posted by u/the_mitsuhiko
jahsome · 7 days ago
Are you referring to MCP? If so, it's fully spelled out in the first sentence of the first paragraph, and links to a more thorough post on the subject. That meets 2 of the 3 criteria you've dictated.
yxhuvud · 7 days ago
That was not the case when I commented. It has obviously been updated since then.
yxhuvud commented on MCP doesn't need tools, it needs code   lucumr.pocoo.org/2025/8/1... · Posted by u/the_mitsuhiko
diggan · 7 days ago
> or at least a link to some other page that explains what is going on

There is a link to a previous post by the same author (within the first ten words even!), which contains the context you're looking for.

yxhuvud · 7 days ago
A link to a previous post is not enough, though of course appreciated. But it would be something I click on after I decide if I should spend time on the article or not. I'm not going on goose chases to figure out what the topic is.
yxhuvud commented on MCP doesn't need tools, it needs code   lucumr.pocoo.org/2025/8/1... · Posted by u/the_mitsuhiko
yxhuvud · 7 days ago
First rule of writing about something that can be abbreviated: First have some explanation so people have an idea of what you are talking about. Either type out what the abbreviation stands for, have an explanation, or at least a link to some other page that explains what is going on.

EDIT: This has since been fixed in the link, so this comment is outdated.

yxhuvud commented on PYX: The next step in Python packaging   astral.sh/blog/introducin... · Posted by u/the_mitsuhiko
x3n0ph3n3 · 11 days ago
You assume the OS package manager I happen to be using even has packages for some of the libraries I want to use.
yxhuvud · 11 days ago
Or for that matter, that the ones they do have are compatible with packages that come from other places. I've seen language libraries be restructured when OS packagers got hold of them. That wasn't pretty.
yxhuvud commented on Leonardo Chiariglione – Co-founder of MPEG   leonardo.chiariglione.org... · Posted by u/eggspurt
mike_hearn · 18 days ago
Universities love patent licensing. I don't think academia is the solution you're looking for.
yxhuvud · 18 days ago
The solution to that is to remove the ability to patent codecs.
yxhuvud commented on So you want to parse a PDF?   eliot-jones.com/2025/8/pd... · Posted by u/UglyToad
yxhuvud · 21 days ago
Well, perhaps you are exposed only to special snowflakes of pdfs that are from a single source and somewhat well formed and easy to extract from. Others, like me, are working at companies that also have lots of PDFs, from many, many different sources, and there are no easy ways to extract structured data or even text in a way that always works.
yxhuvud commented on So you want to parse a PDF?   eliot-jones.com/2025/8/pd... · Posted by u/UglyToad
throwaway4496 · 21 days ago
Yes, and don't for a second think this approach of rastering and OCR'ing is sane, let alone a reasonable choice. It is outright absurd.
yxhuvud · 21 days ago
No one has claimed that getting structured data out of pdfs is sane. What you seem to be missing is that there are no sane ways to get a decent output. The reasonable choice would be to not even try, but business needs invalidate that choice. So what remains are the absurd ways to solve the problem.
yxhuvud commented on So you want to parse a PDF?   eliot-jones.com/2025/8/pd... · Posted by u/UglyToad
throwaway4496 · 21 days ago
So you parse PDFs, but also OCR images, to somehow get better results?

Do you know you could just use the parsing engine that renders the PDF to get the output? I mean, why raster it, OCR it, and then use AI? Sounds like creating a problem to use AI to solve it.

yxhuvud · 21 days ago
Well, you clearly haven't parsed a wide variety of pdfs. Because if you had, you would have been exposed to pdfs that contain only images, or those that contain embedded text, but that embedded text is utter nonsense and doesn't match what is shown on the page when rendered.

And that is before we even get into text structure, because as everyone knows, reading text is easier if things like paragraphs, columns and tables are preserved in the output. And guess what, if you just use the parsing engine for that, then what you get out is a garbled mess.

u/yxhuvud

Karma: 3687 · Cake day: October 2, 2009