Readit News
soiltype commented on Doing the thing is doing the thing   softwaredesign.ing/blog/d... · Posted by u/prakhar897
soiltype · 12 days ago
A bit of a meta lesson for me here: Writing a short, pointed, opinionated blog post is blogging. If I care about blogging my thoughts, I need to just do it, not worry about rigor or depth ahead of time.
soiltype commented on GPTZero finds 100 new hallucinations in NeurIPS 2025 accepted papers   gptzero.me/news/neurips/... · Posted by u/segmenta
DSMan195276 · 18 days ago
> Presumably, there's a step in this process where money incentivizes the opposite of my suggestion, and I'm not familiar with the process to know which.

> Is it the university itself which will be starved of resources if it's not pumping out novel (yet unreproducible) research?

Researchers apply for grants to fund their research. The university is generally not paying for it; instead it receives a cut of the grant money if it is awarded (i.e., the grant covers the costs to the university for providing the facilities to do the research). If a researcher could get funding to reproduce a result then they could absolutely do it, but that's not what funds are usually handed out for.

soiltype · 12 days ago
Hmm I see. So the grant makers are more of a problem here. And what are their incentives to fund ~bad research?
soiltype commented on GPTZero finds 100 new hallucinations in NeurIPS 2025 accepted papers   gptzero.me/news/neurips/... · Posted by u/segmenta
worik · 18 days ago
> In software it's common to value independent verification - e.g. code review. Someone who is only focused on writing new code instead of careful testing, refactoring, or peer review is widely viewed as a shitty developer by their peers.

That is good practice

It is rare, not common. Managers and funders pay for features

Unreliable insecure software sells very well, so making reliable secure software is a "waste of money", generally

soiltype · 12 days ago
Actually yes you're 100% right, I phrased that badly
soiltype commented on Show HN: One Human + One Agent = One Browser From Scratch in 20K LOC   emsh.cat/one-human-one-ag... · Posted by u/embedding-shape
embedding-shape · 13 days ago
Codex, no idea about tokens, I'll upload the session data probably tomorrow so you could see exactly what was done. I pay ~200 EUR/month for the ChatGPT Pro plan, prorating days I guess it'll be ~19 EUR for three days. Model used for everything was gpt-5.2 with reasoning effort set to xhigh.
soiltype · 13 days ago
Thank you in advance for that! I barely use AI to generate code so I feel pretty lost looking at projects like this.
soiltype commented on GPTZero finds 100 new hallucinations in NeurIPS 2025 accepted papers   gptzero.me/news/neurips/... · Posted by u/segmenta
StableAlkyne · 18 days ago
> I'd love to see future reporting that instead of saying "Research finds amazing chemical x which does y" you see "Researcher reproduces amazing results for chemical x which does y. First discovered by z".

Most people (that I talk to, at least) in science agree that there's a reproducibility crisis. The challenge is there really isn't a good way to incentivize that work.

Fundamentally (unless you're independently wealthy and funding your own work), you have to measure productivity somehow, whether you're at a university, government lab, or the private sector. That turns out to be very hard to do.

If you measure raw number of papers (more common in developing countries and low-tier universities), you incentivize a flood of junk. Some of it is good, but there is such a tidal wave of shit that most people write off your work as a heuristic based on the other people in your cohort.

So, instead it's more common to try to incorporate how "good" a paper is, to reward people with a high quantity of "good" papers. That's quantifying something subjective though, so you might try to use something like citation count as a proxy: if a work is impactful, usually it gets cited a lot. Eventually you may arrive at something like the H-index, which is defined as the highest number H such that you have written H papers with at least H citations each. Now, the trouble with this method is people won't want to "waste" their time on incremental work.
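To make the H-index definition concrete, here is a minimal sketch in Python. The function name `h_index` and the sample citation counts are made up for illustration; they are not taken from any real bibliometric tool or dataset.

```python
def h_index(citations):
    """Return the largest H such that H papers have at least H citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            # The top `rank` papers all have at least `rank` citations.
            h = rank
        else:
            break
    return h


# Hypothetical researcher with five papers.
original_work = [10, 8, 5, 4, 3]

# Same researcher after adding one replication paper that was cited once.
with_replication = original_work + [1]
```

Under these made-up numbers, both lists give the same H-index, which is the incentive problem in miniature: a rarely cited replication paper does nothing for the metric.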

And that's the struggle here; even if we funded and rewarded people for reproducing results, they will always be bumping up the citation count of the original discoverer. But it's worse than that, because literally nobody is going to cite your work. In 10 years, they just see the original paper, a few citing works reproducing it, and to save time they'll just cite the original paper only.

There's clearly a problem with how we incentivize scientific work. And clearly we want to be in a world where people test reproducibility. However, it's very, very hard to get there when one's prestige and livelihood are directly tied to discovery rather than reproducibility.

soiltype · 18 days ago
That feels arbitrary as a measure of quality. Why isn't new research simply devalued and replication valued higher?

"Dr Alice failed to reproduce 20 would-be headline-grabbing papers, preventing them from sucking all the air out of the room in cancer research" is something laudable, but we're not lauding it.

soiltype commented on GPTZero finds 100 new hallucinations in NeurIPS 2025 accepted papers   gptzero.me/news/neurips/... · Posted by u/segmenta
goalieca · 18 days ago
Grad students don’t get to publish a thesis on reproduction. Everyone from the undergraduate research assistant to the tenured professor with research chairs is hyper-focused on “publishing” as much “positive result” on “novel” work as possible.
soiltype · 18 days ago
But that seems almost trivially solved. In software it's common to value independent verification - e.g. code review. Someone who is only focused on writing new code instead of careful testing, refactoring, or peer review is widely viewed as a shitty developer by their peers. Of course there's management to consider and that's where incentives are skewed, but we're talking about a different structure. Why wouldn't the following work?

A single university or even department could make this change - reproduction is the important work, reproduction is what earns a PhD. Or require some split, 20-50% novel work maybe is also expected. Now the incentives are changed. Potentially, this university develops a reputation for reliable research. Others may follow suit.

Presumably, there's a step in this process where money incentivizes the opposite of my suggestion, and I'm not familiar with the process to know which.

Is it the university itself which will be starved of resources if it's not pumping out novel (yet unreproducible) research?

soiltype commented on Waiting for dawn in search: Search index, Google rulings and impact on Kagi   blog.kagi.com/waiting-daw... · Posted by u/josephwegner
the_arun · 19 days ago
If Google is serving 90% of traffic and others are unable to enter, doesn't that mean Google is doing something right for the customer and others are unable to outcompete it? Isn't this how life works?
soiltype · 19 days ago
...No. Not at all. Not in the case of Google, and generally that's not "how life works". If it were true, why would Google spend so much money to be the default search engine on so many devices/browsers?
soiltype commented on Cowork: Claude Code for the rest of your work   claude.com/blog/cowork-re... · Posted by u/adocomplete
Wowfunhappy · a month ago
What has gotten worse without AI? I don't think writing or coding is inherently harder. Google search may be worse but I've heard Kagi is still pretty great. Apple Intelligence feels like it's easy to get rid of on their platforms, for better and worse. If you're using Windows that might get annoying, personally I just use LTSC.
soiltype · a month ago
The skills of writing and coding atrophy when replaced by generative AI. The more we use AI to do thinking in some domain, the less we will be able to do that thinking ourselves. It's not a perfect analogy for car infrastructure.

Yeah Kagi is good, but the web is increasingly dogshit, so if you're searching in a space where you don't already have trusted domains for high quality results, you may just end up being unable to find anything reliable even with a good engine.

soiltype commented on Cowork: Claude Code for the rest of your work   claude.com/blog/cowork-re... · Posted by u/adocomplete
lijok · a month ago
People love their cars, what are you talking about
soiltype · a month ago
No, people hate being trapped without a car in an environment built exclusively to serve cars. Our love of cars is largely just downstream of negative emotions like FOMO or indignation caused by the inability to imagine traveling by any other mode (because in most cases that's not even remotely feasible anymore).
soiltype commented on Cowork: Claude Code for the rest of your work   claude.com/blog/cowork-re... · Posted by u/adocomplete
Workaccount2 · a month ago
Frequency vs. convenience will determine how big of a deal this is in practice.

Cars have plenty of horror stories associated with them, but convenience keeps most people happily driving every day without a second thought.

Google can quarantine your life with an account ban, but plenty of people still use gmail for everything despite the stories.

So even if Claude cowork can go off the rails and turn your digital life upside down, as long as the stories are just online or "friend of a friend of a friend", people won't care much.

soiltype · a month ago
Considering that the ubiquity and necessity of driving cars are overwhelmingly a result of intentional policy choices, irrespective of what people wanted or what was good for the public interest... actually that's quite a decent analogy for integrated LLM assistants.

People will use AI because other options keep getting worse and because it keeps getting harder to avoid using it. I don't think it's fair to characterize that as convenience though, personally. Like with cars, many people will be well aware of the negative externalities, the risk of harm to themselves, and the lack of personal agency caused by this tool and still use it because avoiding it will become costly to their everyday life.

I think of convenience as something that is a "bonus" on top of normal life typically. Something that becomes mandatory to avoid being left out of society no longer counts.

u/soiltype

Karma: 192 · Cake day: July 16, 2025