stdbrouw commented on Over fifty new hallucinations in ICLR 2026 submissions   gptzero.me/news/iclr-2026... · Posted by u/puttycat
semi-extrinsic · 8 days ago
Yeah, there is an interesting question there (always has been). When do you stop citing the paper for a specific model?

Just to take some examples, is BiCGStab famous enough now that we can stop citing van der Vorst? Is the AdS/CFT correspondence well known enough that we can stop citing Maldacena? Are transformers so ubiquitous that we don't have to cite "Attention is all you need" anymore? I would be closer to yes than no on these, but it's not 100% clear-cut.

One obvious criterion has to be "if you leave out the citation, will it be obvious to the reader what you've done/used"? Another metric is approximately "did the original author get enough credit already"?

stdbrouw · 8 days ago
Yeah, I didn't want to be contrary just for the sake of it, the heuristics you mention seem like good ones, and if followed would probably already cut down on quite a few superfluous references in most papers.
stdbrouw commented on Over fifty new hallucinations in ICLR 2026 submissions   gptzero.me/news/iclr-2026... · Posted by u/puttycat
semi-extrinsic · 9 days ago
It's also a consequence of the sheer number of building blocks which are involved in modern science.

In the methods section, it's very common to say "We employ method barfoo [1] as implemented in library libbar [2], with the specific variant widget due to Smith et al. [3] and the gobbledygook renormalization [4,5]. The feoozbar is solved with geometric multigrid [6]. Data is analyzed using the froiznok method [7] from the boolbool library [8]." There goes 8, now you have 2 citations left for the introduction.

stdbrouw · 8 days ago
Do you still feel the same way if the froiznok method is an ANOVA table of a linear regression, with a log-transformed outcome? Should I reference Fisher, Galton, Newton, the first person to log transform an outcome in a regression analysis, the first person to log transform the particular outcome used in your paper, the R developers, and Gauss and Markov for showing that under certain conditions OLS is the best linear unbiased estimator? And then a couple of references about the importance of quantitative analysis in general? Because that is the level of detail I’m seeing :-)
stdbrouw commented on Over fifty new hallucinations in ICLR 2026 submissions   gptzero.me/news/iclr-2026... · Posted by u/puttycat
andy99 · 9 days ago
I’ve reviewed a lot of papers, and I don’t consider it the reviewer’s responsibility to manually verify that all citations are real. If there was an unusual citation that was relied on heavily for the basis of the work, one would expect it to be checked. For things like broad prior work, you’d just assume it’s part of the background.

The reviewer is not a proofreader, they are checking the rigour and relevance of the work, which does not rest heavily on all of the references in a document. They are also assuming good faith.

stdbrouw · 9 days ago
The idea that references in a scientific paper should be plentiful but aren't really that important is a consequence of a previous technological revolution: the internet.

You'll find a lot of papers from, say, the '70s, with a grand total of maybe 10 references, all of them to crucial prior work, and if those references don't say what the author claims they should say (e.g. that the particular method that is employed is valid), then chances are that the current paper is weaker than it seems, or even invalid, and so it is extremely important to check those references.

Then the internet came along, scientists started padding their work with easily found but barely relevant references and journal editors started requiring that even "the earth is round" should be well-referenced. The result is that peer reviewers feel that asking them to check the references is akin to asking them to do a spell check. Fair enough, I agree, I usually can't be bothered to do many or any citation checks when I am asked to do peer review, but it's good to remember that this in itself is an indication of a perverted system, which we just all ignored -- at our peril -- until LLM hallucinations upset the status quo.

stdbrouw commented on Python Data Science Handbook   jakevdp.github.io/PythonD... · Posted by u/cl3misch
this_user · 13 days ago
Pandas is widely adopted and deeply integrated into the Python ecosystem. Meanwhile, Polars remains a small niche, and it's one of those hype technologies that will likely be dead in 3 years once most of its users realise that it offers them no actual practical advantages over Pandas.

If you are dealing with huge data sets, you are probably using Spark or something like Dask already where jobs can run in the cloud. If you need speed and efficiency on your local machine, you use NumPy outright. And if you really, really need speed, you rewrite it in C/C++.

Polars is trying to solve an issue that just doesn't exist for the vast majority of users.

stdbrouw · 13 days ago
Arguably Spark solves a problem that does not exist anymore: single node performance with tools like DuckDB and Polars is so good that there’s no need for more complex orchestration anymore, and these tools are sufficiently user-friendly that there is little point to switching to Pandas for smaller datasets.
stdbrouw commented on What if hard work felt easier?   jeanhsu.substack.com/p/wh... · Posted by u/kiyanwang
thewebguyd · a month ago
> The rules are too complex to start a business.

Not to mention the taxes, depending on where you live. When I actually went to register my side gig wedding photography business it was the most confusing mess I've ever had to slog through.

It's not clear, at all, what licenses you need because of the nature of the business. There's state, then there's also local/city licenses, and it's not clear whether you need one for every city you photograph in, or just one for the city you reside in (your business address). Some cities require it, some don't.

Then the taxes here are also just as confusing. Different rates depending on the business activity. Session fee revenue is taxed differently than digital photo sales revenue, etc. Then sometimes the service itself is subject to sales tax, sometimes not, and how (and where) you deliver the photos matters too.

You can't be on the legal up and up without hiring an accountant and maybe an attorney to go through the process with you, which is definitely not something I wanted to do for a side gig.

It should not be this convoluted or difficult to legally open a business, especially under a certain amount of revenue per year. I'm not making millions here, we're talking less than 100k/year in gross revenue.

There's a reason most photographers here just....don't bother to register with the dept of revenue. Most don't get caught anyway so a lot of times it's worth the risk to just..not pay the taxes.

None of that bureaucracy should exist for businesses under a certain size/under a certain revenue.

stdbrouw · a month ago
In the US income from a hobby can just be added to your personal filing [1] and in Belgium, where I live, there is a similar arrangement for "diverse sources of income" [2]. If you do start a business, in the European Union you're exempt from filing VAT if your yearly revenue is below a certain amount [3]. Europe has also been pretty aggressive in getting rid of licensing requirements for various occupations and trades, certainly a photographer wouldn't need a license here.

I think the trouble you faced resulted from being at the edge of these kinds of simple systems that do exist -- big enough to need to set up a business, but small enough that hiring an accountant or spending time to familiarize yourself with the legal requirements was out of proportion to the expected revenue. That's unfortunate, of course, but doesn't necessarily reflect on the amount of red tape that exists in general in a country.

[1] https://www.irs.gov/newsroom/heres-how-to-tell-the-differenc...

[2] https://www.vlaanderen.be/economie-en-ondernemen/een-eigen-z...

[3] https://europa.eu/youreurope/business/taxation/vat/vat-exemp...

stdbrouw commented on Public Montessori programs strengthen learning outcomes at lower costs: study   phys.org/news/2025-10-nat... · Posted by u/strict9
rahimnathwani · 2 months ago
That's taken care of in the study design. The population was all kids who applied to the lottery. And the treatment group wasn't those who actually attended the Montessori school, but those who were offered a place due to the lottery.

So I don't see how special needs would bias the results. If the lottery excludes those with special needs (either by design or due to self-selection) then there's no bias between control group and treatment group. If the lottery doesn't exclude but the enrollment decision is biased by special needs, then it doesn't matter because they use ITT and not enrolment.
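
To make the intention-to-treat (ITT) point concrete, here is a minimal sketch. The applicant records and scores are invented purely for illustration and have nothing to do with the study's data; the function names `itt_effect` and `as_treated_effect` are hypothetical too. It compares groups by lottery offer (ITT) versus by actual enrollment, which is where a biased enrollment decision could creep in.

```python
from statistics import mean

# Hypothetical lottery applicants: (offered_place, enrolled, score).
# All values are made up for illustration only.
applicants = [
    (True, True, 80),
    (True, True, 74),
    (True, False, 60),   # offered a place but did not enroll
    (True, False, 62),   # offered a place but did not enroll
    (False, False, 70),
    (False, False, 66),
    (False, False, 64),
    (False, False, 68),
]


def itt_effect(rows):
    """Difference in mean score by lottery offer, ignoring enrollment."""
    offered = [score for was_offered, _, score in rows if was_offered]
    not_offered = [score for was_offered, _, score in rows if not was_offered]
    return mean(offered) - mean(not_offered)


def as_treated_effect(rows):
    """Difference in mean score by actual enrollment (open to selection bias)."""
    enrolled = [score for _, did_enroll, score in rows if did_enroll]
    not_enrolled = [score for _, did_enroll, score in rows if not did_enroll]
    return mean(enrolled) - mean(not_enrolled)


if __name__ == "__main__":
    print("ITT estimate:", itt_effect(applicants))
    print("As-treated estimate:", as_treated_effect(applicants))
```

In this toy data the families who decline the offer happen to have lower scores, so the as-treated comparison looks much larger than the ITT one. Because the lottery offer is random, the ITT comparison is not affected by who chooses to enroll.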

stdbrouw · 2 months ago
Yeah, the intention to treat design is a particularly nice touch, not so common outside of biostatistics. They also compare the full cost of Montessori vs. plain ol', not the cost to the state, which could otherwise have given the Montessori schools (which are in wealthier neighborhoods on average) an unfair advantage if they have a lot of parents chipping in with donations and help. I've skimmed through the methods section and it does seem like they've gone to great lengths to allow for a fair comparison.

That doesn't necessarily mean the result will extrapolate, though. It seems plausible that teachers in Montessori schools are more motivated and knowledgeable than the average teacher and have made a conscious decision to teach in such a school. If every public school were to become a Montessori school, you would still get the cost savings (student-to-teacher ratios are higher in Montessori!) but you might lose that above-average enthusiasm and expertise and so the learning gains might not carry over. It's just really hard to know whether something might generalize in the educational sciences.

stdbrouw commented on Take something you don’t like and try to like it   dynomight.net/liking/... · Posted by u/surprisetalk
netbioserror · 3 months ago
You've never discovered something and felt the indescribable, joy-inducing draw of its appeal? Listened to some music and immediately jived with it? Blaming familiarity bias and "old man yells at clouds" is a disappointingly small-minded critique. It's the opposite: I've lived long enough to thoroughly experience the joy of newly discovering something that feels like it fits me perfectly; conversely, I've tried to appreciate other things enough times to know never to waste my time trying again. Especially for the sake of others.

I've learned that liking things behaves a lot like attraction. It has no reasoning or logic, it happens organically, and when you know, you know. Thus, I would never deign to pretend to like something I've found I don't.

stdbrouw · 3 months ago
... but the key thing is that I am not saying that you are an old man who yells at the clouds, rather that a lot of people worry about themselves that they might be getting unduly close-minded and that this is what the blog post is trying to address.

Your mileage seems to vary, but I find that for food and drinks in particular it's the acquired tastes you get the most enjoyment from in the end -- I haven't met many people who enjoyed their first glass of peated whisky, for example. Heck, even my best friends are definitely an "acquired taste", as is obvious to me when I introduce them to other people I know.

stdbrouw commented on Take something you don’t like and try to like it   dynomight.net/liking/... · Posted by u/surprisetalk
netbioserror · 3 months ago
Actively trying to like something is already a sign that you don't like it intrinsically. Continuing to try strikes me as...some expression of over-socialization. It's okay to pursue things you actually like for your own sake.
stdbrouw · 3 months ago
If there is an ideal amount of some personality trait then for most people the advice would be "do more of this" even though for some others it'll have to be "do less of this" depending on where you're coming from. When I was young I definitely did a bit of over-socialization (everybody seems to like music festivals so I guess I must like them too if I don't want to be a weirdo?) but as you can see in the comments to this post, as we get older it's easy to get into a pattern where anything you're not familiar with is instantly met with suspicion or derision, and a lot of people don't like this about themselves, which is why this blogpost resonates with them.

Also, "liking something intrinsically", what does that even mean?

stdbrouw commented on UN report finds UN reports are not widely read   reuters.com/world/un-repo... · Posted by u/anjneymidha
roughly · 4 months ago
The point of a report is to provide a structured process and background for a set of technical or policy recommendations. It’d be perfectly normal for a report drafted by the efforts of 50 people to have an audience of 2-3 major decision makers - the point is the process for generating the recommendations. Further, it’d also be quite normal for a report on a specific topic to be used as an input to another process which generates its own outputs, meaning there’s little reason for people not involved in the latter process to read the original report unless they’re deeply interrogating the findings of the consolidated report.
stdbrouw · 4 months ago
Even so. I read a lot of reports about educational policy (and occasionally produce them) and even if there are only 2-3 major decision makers you'd expect the report to be read by various cabinet members of those decision makers, by committee members in parliament, by academics, by other teams or colleagues or institutions that would have liked to write the report in your stead or that produce "competing" reports, by folks at think tanks, and by journalists and politicians in general. Because the executive summary is almost always inlined in these kinds of reports, the intended audience is generally quite broad. I'm not saying that attaining only a couple hundred downloads of a report necessarily shows that money was wasted on superfluous research, but it definitely can be an indicator of waste.

I think this is one of those things where you can really overthink it and convince yourself that "the report was read only once, by the one person who had to read it" is an ideal outcome, but really it isn't.

stdbrouw commented on Self-taught engineers often outperform (2024)   michaelbastos.com/blog/wh... · Posted by u/mbastos
hirvi74 · 5 months ago
I have a degree in computer science, and I have been a professional full-stack developer for almost 9 years. I could probably not solve the problem in your first link in 25 minutes. I know exactly how I would solve it, conceptually and what data structures I would use, but I am not sure I'd get the adjacent mines part correct on the first attempt.

Then again, this is nothing like the type of problems I work on a daily basis.

stdbrouw · 5 months ago
> Then again, this is nothing like the type of problems I work on a daily basis.

I thought it'd be fun to take a stab at it in Python, which I haven't used in a while, but only barely still remembered that I could accept command-line input with the built-in `input` function, something I don't think I ever used after writing my first lines of code 15 years ago. Then I figured I'd use just lists instead of a numpy array but had forgotten that `[[0] * 4] * 4` would just create 4 references to the same list. And that pretty much derailed the whole thing for me, even though I was sure I'd get it done in 25 minutes or under :-)
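
For anyone who hasn't run into it, here is a minimal sketch of the list-aliasing pitfall described above. The 4x4 size comes from the comment; the function names `make_grid_aliased` and `make_grid` are made up for illustration.

```python
def make_grid_aliased(rows, cols):
    # The outer multiplication copies the *reference* to one inner list,
    # so every row is the same list object.
    return [[0] * cols] * rows


def make_grid(rows, cols):
    # The comprehension evaluates [0] * cols once per row,
    # so each row is a distinct list.
    return [[0] * cols for _ in range(rows)]


aliased = make_grid_aliased(4, 4)
aliased[0][0] = 1      # intended to change a single cell

independent = make_grid(4, 4)
independent[0][0] = 1  # changes only the first row

if __name__ == "__main__":
    print(aliased)
    print(independent)
```

Multiplying the inner list (`[0] * cols`) is fine because integers are immutable; it's only the outer multiplication of a mutable list that causes the shared-row surprise.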

u/stdbrouw

Karma: 4822 · Cake day: December 15, 2010
About
Developer and data scientist in the news industry. I blog over at http://debrouwere.org

[ my public key: https://keybase.io/debrouwere; my proof: https://keybase.io/debrouwere/sigs/Cnr9GaJGcojfjdLSHpSlDuKmoGDIspELZlN-7yq1pVY ]
