WWWWH · 7 days ago
Surely this is gross professional misconduct? If one of my postdocs did this they would be at risk of being fired. I would certainly never trust them again. If I let it get through, I should be at risk.

As a reviewer, if I see the authors lie in this way why should I trust anything else in the paper? The only ethical move is to reject immediately.

I acknowledge mistakes and so on are common, but this is bad behaviour in a different league.

mike_hearn · 6 days ago
What field are you in?

In many fields it's gross professional misconduct only in theory. This sort of thing is very common and there's never any consequence. LLM-generated citations specifically are a new problem but citations of documents that don't support the claim, contradict it, have nothing to do with it or were retracted years ago have been an issue for a long time.

Gwern wrote about this here:

https://gwern.net/leprechaun

"A major source of [false claim] transmission is the frequency with which researchers do not read the papers they cite: because they do not read them, they repeat misstatements or add their own errors, further transforming the leprechaun and adding another link in the chain to anyone seeking the original source. This can be quantified by checking statements against the original paper, and examining the spread of typos in citations: someone reading the original will fix a typo in the usual citation, or is unlikely to make the same typo, and so will not repeat it. Both methods indicate high rates of non-reading"

I first noticed this during COVID and did some blogging about it. In public health it is quite common to do things like present a number with a citation, and then the paper doesn't contain that number anywhere in it, or it does but the number was an arbitrary assumption pulled out of thin air rather than the empirical fact it was being presented as.

It was also very common for papers to open by saying something like, "Epidemiological models are a powerful tool for predicting the spread of disease" with eight different citations, and every single citation would be an unvalidated model - zero evidence that any of the cited models were actually good at prediction.

Bad citations are hardly the worst problem with these fields, but when you see how widespread the practice is, and that nobody within the institutions cares, it does lead to the reaction you're having, where you just throw your hands up and declare whole fields to be write-offs.

TomasBM · 6 days ago
The abuse of claims and citations is a legitimate and common problem.

However, I think hallucinated citations pose a bigger problem, because they're fundamentally a lie by commission instead of omission, misinterpretation or misrepresentation of facts.

At the same time, it may be an accidental lie, insofar as authors mistakenly used LLMs as search engines, just to support a claim that's commonly known, or that they remember well but can't find the origin of.

So, unless we reduce the pressure on publication speed, and increase the pressure for quality, we'll need to introduce more robust quality checks into peer review.

stainablesteel · 7 days ago
this brings us to a cultural divide, westerners would see this as a personal scar, as they consider the integrity of the publishing sphere at large to be held up by the integrity of individuals

i clicked on 4 of those papers, and the pattern i saw was middle-eastern, indian, and chinese names

these are cultures where they think this kind of behavior is actually acceptable, they would assume it's the fault of the journal for accepting the paper. they don't see the loss of reputation to be a personal scar because they instead attribute blame to the game.

some people would say it's racist to understand this, but in my opinion when i was working with people from these cultures there was just no other way to learn to cooperate with them than to understand them, it's an incredibly confusing experience to be working with them until you understand the various differences between your own culture and theirs

ssivark · 7 days ago
PSA: Please note that the names are the hallucinated author lists that are part of the hallucinated citations, not the names of the offending authors.

AFAIK the submissions are still blinded and we don't know who the authors are. We surely will soon, though, since ICLR maintains all submissions in the public record for posterity, even if "withdrawn"; they are unblinded after the review period finishes.

ribosometronome · 7 days ago
Where do you see the authors? All I'm seeing is:

>Anonymous authors

>Paper under double-blind review

zsdfgyu · 7 days ago
This sort of behavior is not limited to researchers from those cultures. One of the highest profile academic frauds to date was from a German. Look up the Schön scandal.
throw10920 · 7 days ago
> these are cultures where they think this kind of behavior is actually acceptable, they would assume it's the fault of the journal for accepting the paper. they don't see the loss of reputation to be a personal scar because they instead attribute blame to the game.

I have a relative who lived in a country in the East for several years, and he says that this is just factually true.

The vast majority of people who disagree with this statement have never actually lived in these cultures. They just hallucinate that they have because they want that statement to be false so badly.

...but, simultaneously, I'm also not seeing where you see the authors of the papers - I only see hallucitation authors. e.g. at the link for the first paper submission (https://openreview.net/forum?id=WPgaGP4sVS), there doesn't appear to be any authors listed. Are you confusing the hallucinated citation authors with the primary paper authors?

In that case, I would expect Eastern authors to be over-represented, because they just publish a lot more.

Aeglaecia · 7 days ago
im not sure if you are gonna get downvoted so im sticking a limb out to cop any potential collateral damage in the name of finding out whether the common inhabitant of this forum considers the idea of low trust vs high trust societies to be inherently racist
make3 · 6 days ago
Isn't this mostly a set of citation typos? To me this mostly calls for better BibTeX checking; writing and checking BibTeX is super annoying.
urspx · 6 days ago
Forgetting authors, misspelling them or the journals, getting a digit wrong, etc. could be citation typos. I don't see how you add 5 non-existent authors and put a different (but conceptually plausible) journal in the BibTeX.

Besides, I would think most people are using bibliographic managers like Zotero and co., which pull metadata through DOIs and the like.

The errors look a lot more like what happens when you ask an LLM for some sources on xyz.
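
For what it's worth, here is a minimal sketch of the kind of check a reference manager (or a reviewer script) could run: pull the registered metadata for a DOI from the public Crossref API and compare it against the entry's fields. The entry, field names, and matching rules below are illustrative assumptions, not anyone's actual pipeline.

```python
# Minimal sketch: compare a BibTeX-style entry against the metadata Crossref has
# registered for its DOI. The entry below is a made-up example, not a real record.
import requests

entry = {
    "doi": "10.1000/example.doi",           # hypothetical DOI
    "title": "An Example Paper Title",       # hypothetical title
    "authors": ["Alice Author", "Bob Author"],
}

def check_against_crossref(entry):
    resp = requests.get(f"https://api.crossref.org/works/{entry['doi']}", timeout=10)
    if resp.status_code != 200:
        return ["DOI does not resolve in Crossref"]
    meta = resp.json()["message"]
    problems = []
    registered_title = (meta.get("title") or [""])[0]
    if entry["title"].lower() not in registered_title.lower():
        problems.append(f"title mismatch: Crossref has {registered_title!r}")
    registered_surnames = {a.get("family", "").lower() for a in meta.get("author", [])}
    for name in entry["authors"]:
        if name.split()[-1].lower() not in registered_surnames:
            problems.append(f"author {name!r} not in registered author list")
    return problems

print(check_against_crossref(entry) or "entry matches registered metadata")
```

A hallucinated citation typically fails at the first step, because the DOI or the work simply doesn't resolve, whereas a typo usually still matches most of the registered metadata.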

ulrashida · 7 days ago
Unfortunately, while catching false citations is useful, in my experience that's not usually the problem affecting paper quality. Far more prevalent are authors who mis-cite materials, either drawing support from citations that don't actually say those things or stripping the nuance away with cherry-picked quotes, simply because that is what Google Scholar suggested as a top result.

The time it takes to find these errors is orders of magnitude higher than checking whether a citation exists, as you need to both read and understand the source material.

These bad actors should be subject to a three-strikes rule: the steady corrosion of knowledge by these individuals is no accident.

hippo22 · 7 days ago
It seems like this is the type of thing that LLMs would actually excel at though: find a list of citations and claims in this paper, do the cited works support the claims?
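
A minimal sketch of that kind of check, assuming the OpenAI Python client and a placeholder model name; note that the verdict is itself LLM output and still needs a human to spot-check it, as the reply below points out.

```python
# Minimal sketch: ask an LLM whether a cited work appears to support a claim.
# The model name is an assumption, and the verdict must still be verified by a
# human, since the judge can hallucinate support just as easily as the author did.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def citation_supports_claim(claim: str, cited_abstract: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name
        messages=[
            {"role": "system",
             "content": "You check whether a cited work supports a claim. "
                        "Answer SUPPORTS, CONTRADICTS, or UNRELATED, then give one sentence of justification."},
            {"role": "user",
             "content": f"Claim: {claim}\n\nAbstract of the cited work: {cited_abstract}"},
        ],
    )
    return response.choices[0].message.content

# Placeholder inputs, for illustration only.
print(citation_supports_claim(
    "Method X improves accuracy on benchmark Y by 10%.",
    "We introduce benchmark Y and report results for several baseline methods...",
))
```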
bryanrasmussen · 7 days ago
sure, except when they hallucinate that the cited works support the claims when they do not. At which point you're back at needing to read the cited works to see if they support the claims.
19f191ty · 7 days ago
Exactly. Abuse of citations is a much more prevalent and sinister issue, and has been for a long time. Fake citations are of course bad, but they're only the tip of the iceberg.
seventytwo · 7 days ago
Then punish all of it.
lijenjin · 7 days ago
The linked article at the end says: "First, using Hallucination Check together with GPTZero’s AI Detector allows users to check for AI-generated text and suspicious citations at the same time, and even use one result to verify the other. Second, Hallucination Check greatly reduces the time and labor necessary to verify a document’s sources by identifying flawed citations for a human to review."

On their site (https://gptzero.me/sources) it also says "GPTZero's Hallucination Detector automatically detects hallucinated sources and poorly supported claims in essays. Verify academic integrity with the most accurate hallucination detection tool for educators", so it does more than just identify invalid citations. Seems to do exactly what you're talking about.

potato3732842 · 7 days ago
>These bad actors should be subject to a three strikes rule: the steady corrosion of knowledge is not an accident by these individuals.

These people are working in labs funded by Exxon or Meta or Pfizer or whoever, and they know what results will make continued funding worthwhile in the eyes of their donors. If the lab doesn't produce, the donor will fund another one that will.

mike_hearn · 6 days ago
No, not really. I've read lots of research papers from commercial firms and academic labs. Bad citations are something I only ever saw in academic papers.

I think that's because a lot of bad citations come from reviewer demands to add more of them during the journal publishing process, so they're not critical to the argument and end up being low-effort citations that get copy/pasted between papers. Or someone is just spamming citations to make a weak claim look strong. And all this happens because academia uses citations as a kind of currency (it's a planned non-market economy, so funds have to be allocated using proxy signals).

Commercial labs are less likely to care about the journal process to begin with, and are much less likely to publish weak claims because publishing is just a recruiting tool, not the actual end goal of the R&D department.

theoldgreybeard · 7 days ago
If a carpenter builds a crappy shelf “because” his power tools are not calibrated correctly - that’s a crappy carpenter, not a crappy tool.

If a scientist uses an LLM to write a paper with fabricated citations - that’s a crappy scientist.

AI is not the problem, laziness and negligence is. There needs to be serious social consequences to this kind of thing, otherwise we are tacitly endorsing it.

CapitalistCartr · 7 days ago
I'm an industrial electrician. A lot of poor electrical work is visible only to a fellow electrician, and sometimes only another industrial electrician. Bad technical work requires technical inspectors to criticize. Sometimes highly skilled ones.
andy99 · 7 days ago
I’ve reviewed a lot of papers, and I don’t consider it the reviewer’s responsibility to manually verify that all citations are real. If there were an unusual citation that the work relied on heavily, one would expect it to be checked. Things like broad prior work, you’d just assume are part of the background.

The reviewer is not a proofreader, they are checking the rigour and relevance of the work, which does not rest heavily on all of the references in a document. They are also assuming good faith.

barfoure · 7 days ago
I’d love to hear some examples of poor electrical work that you’ve come across that’s often missed or not seen.
xnx · 7 days ago
No doubt the best electricians are currently better than the best AI, but the best AI is likely now better than the novice homeowner. The trajectory over the past 2 years has been very good. Another five years and AI may be better than all but the very best, or most specialized, electricians.
lencastre · 7 days ago
an old boss of mine used to say there are no stupid electricians found alive, as they self select darwin award style
bdangubic · 7 days ago
same (and much, much, much worse) for science
kklisura · 7 days ago
> AI is not the problem, laziness and negligence is

This reminds me of the discourse about the gun problem in the US, "guns don't kill people, people kill people", etc. It is a discourse used solely for the purpose of not doing anything and not addressing anything about the underlying problem.

So no, you're wrong - AI IS THE PROBLEM.

sneak · 7 days ago
> it is a discourse used solely for the purpose of not doing anything and not addressing anything about the underlying problem

Solely? Oh brother.

In reality it’s the complete opposite. It exists to highlight the actual source of the problem: there are both industries/practitioners using AI professionally and safely, and communities with very high rates of gun ownership and exceptionally low rates of gun violence.

It isn’t the tools. It’s the social circumstances of the people with access to the tools. That’s the point. The tools are inanimate. You can use them well or use them badly. The existence of the tools does not make humans act badly.

Yoofie · 7 days ago
No, the OP is right in this case. Did you read TFA? It was "peer reviewed".

> Worryingly, each of these submissions has already been reviewed by 3-5 peer experts, most of whom missed the fake citation(s). This failure suggests that some of these papers might have been accepted by ICLR without any intervention. Some had average ratings of 8/10, meaning they would almost certainly have been published.

If the peer reviewers can't be bothered to do the basics, then there is literally no point to peer review, which is fully independent of the author who uses or doesn't use AI tools.

TomatoCo · 7 days ago
To continue the carpenter analogy, the issue with LLMs is that the shelf looks great but is structurally unsound. That it looks good on surface inspection makes it harder to tell that the person making it had no idea what they were doing.
embedding-shape · 7 days ago
Regardless, if a carpenter is not validating their work before selling it, it's the same as if a researcher doesn't validate their citations before publishing. Neither of them have any excuses, and one isn't harder to detect than the other. It's just straight up laziness regardless.
k4rli · 7 days ago
Very good analogy I'd say.

Also similar to what Temu, Wish, and other similar sites offer. Picture and specs might look good but it will likely be disappointing in the end.

SubiculumCode · 7 days ago
Yeah, seriously. Using an LLM to help find papers is fine. Then you read them. Then you use a tool like Zotero, or add citations manually. I use Gemini Pro to identify useful papers that I might not have encountered before. But even when asked to restrict itself to PubMed resources, its citations are wonky, citing three different versions of the same paper as sources (citations that don't say what they were claimed to discuss).

That said, these tools have substantially reduced hallucinations over the last year, and will just get better. It also helps if you can restrict it to reference already screened papers.

Finally, I'd like to say that if we want scientists to engage in good science, we should stop forcing them to spend a third of their time in a rat race for funding... it is ridiculously time-consuming and wasteful of expertise.

bossyTeacher · 7 days ago
The problem isn't whether they have more or fewer hallucinations. The problem is that they have them at all. And as long as they hallucinate, you have to deal with that. It doesn't really matter how you prompt: you can't prevent hallucinations from happening, and without manual checking, hallucinations will eventually slip under the radar, because the only difference between a real pattern and a hallucinated one is that one exists in the world and the other doesn't. This is not something you can really counter with more LLMs either, as it is a problem intrinsic to LLMs.
bigstrat2003 · 7 days ago
> If a carpenter builds a crappy shelf “because” his power tools are not calibrated correctly - that’s a crappy carpenter, not a crappy tool.

It's both. The tool is crappy, and the carpenter is crappy for blindly trusting it.

> AI is not the problem, laziness and negligence is.

Similarly, both are a problem here. LLMs are a bad tool, and we should hold people responsible when they blindly trust this bad tool and get bad results.

jodleif · 7 days ago
I find this to be a bit “easy”. There is such a thing as bad tools. If it is difficult to determine whether the tool is good or bad, I’d say some of the blame has to be put on the tool.
nwallin · 7 days ago
"Anyone, from the most clueless amateur to the best cryptographer, can create an algorithm that he himself can’t break."--Bruce Schneier

There's a corollary here with LLMs, but I'm not pithy enough to phrase it well. Anyone can use LLMs to create something whose hallucinations they, themselves, aren't skilled enough to spot. Or something.

LLMs are incredibly good at exploiting people's confirmation biases. If it "thinks" it knows what you believe/want, it will tell you what you believe/want. There does not exist a way to interface with LLMs that will not ultimately end in the LLM telling you exactly what you want to hear. Using an LLM in your process necessarily results in being told that you're right, even when you're wrong. Using an LLM necessarily results in it reinforcing all of your prior beliefs, regardless of whether those prior beliefs are correct. To an LLM, all hypotheses are true; it's just a matter of hallucinating enough evidence to satisfy the user's skepticism.

I do not believe there exists a way to safely use LLMs in scientific processes. Period. If my belief is true, and ChatGPT has told me it's true, then yes, AI, the tool, is the problem, not the human using the tool.

czl · 6 days ago
> I do not believe there exists a way to safely use LLMs in scientific processes.

What about giving the LLM a narrowly scoped role as a hostile reviewer, while your job is to strengthen the write-up to address any valid objections it raises, plus any hallucinations or confusions it introduces? That’s similar to fuzz testing software to see what breaks or where the reasoning crashes.

Used this way, the model isn’t a source of truth or a decision-maker. It’s a stress test for your argument and your clarity. Obviously it shouldn’t be the only check you do, but it can still be a useful tool in the broader validation process.

rectang · 7 days ago
“X isn’t the problem, people are the problem.” — the age-old cry of industry resisting regulation.
kklisura · 7 days ago
It's not about resisting. It's about undermining any action whatsoever.
theoldgreybeard · 7 days ago
I am not against regulation.

Quite the opposite actually.

codywashere · 7 days ago
what regulation are you advocating for here?
only-one1701 · 7 days ago
Absolutely brutal case of engineering brain here. Real "guns don't kill people, people kill people" stuff.
somehnguy · 7 days ago
Your second statement is correct. What about it makes it “engineering brain”?
theoldgreybeard · 7 days ago
If you were to wager a guess, what do you think my views on gun rights are?
Forgeties79 · 7 days ago
If my calculator gives me the wrong number 20% of the time yeah I should’ve identified the problem, but ideally, that wouldn’t have been sold to me as a functioning calculator in the first place.
imiric · 7 days ago
Indeed. The narrative that this type of issue is entirely the responsibility of the user to fix is insulting, and blame deflection 101.

It's not like these are new issues. They're the same ones we've experienced since the introduction of these tools. And yet the focus has always been to throw more data and compute at the problem, and optimize for fancy benchmarks, instead of addressing these fundamental problems. Worse still, whenever they're brought up users are blamed for "holding it wrong", or for misunderstanding how the tools work. I don't care. An "artificial intelligence" shouldn't be plagued by these issues.

theoldgreybeard · 7 days ago
If it was a well understood property of calculators that they gave incorrect answers randomly then you need to adjust the way you use the tool accordingly.
Hammershaft · 7 days ago
AI dramatically changes the perceived cost/benefit of laziness and negligence, which is leading to much more of it.
grey-area · 7 days ago
Generative AI and the companies selling it with false promises and using it for real work absolutely are the problem.
acituan · 7 days ago
> AI is not the problem, laziness and negligence is.

As much as I agree with you that this is wrong, there is a danger in putting the onus just on the human. Whether due to competition or top-down expectations, humans are and will be pressured to use AI tools alongside their work and produce more. Whereas the original idea was for AI to assist the human, as the expected velocity and consumption pressure increase, humans are more and more turning into a mere accountability-laundering scheme for machine output. When we blame just the human, we are doing exactly what this scheme wants us to do.

Therefore we must also criticize all the systemic factors that pressure the reversal of AI's assistance into AI's domination of human activity.

So AI (not as a technology, but as a product shoved down people's throats) is the problem.

alexcdot · 7 days ago
Absolutely, expectations and tools given by management are a real problem.

If management fires you because they are wrong about how good AI is, and you're right - at the end of the day, you're fired and the manager is in lalaland.

People need to actually push the correct calibration of what these tools should be trusted to do, while also trying to work with what they have.

photochemsyn · 7 days ago
Yeah, I can't imagine not being familiar with every single reference in the bibliography of a technical publication with one's name on it. It's almost as bad as those PIs who rely on lab techs and postdocs to generate research data using equipment that they don't understand the workings of - but then, I've seen that kind of thing repeatedly in research academia, along with actual fabrication of data in the name of getting another paper out the door, another PhD granted, etc.

Unfortunately, a large fraction of academic fraud has historically been detected through sloppy data duplication, and with LLMs and similar image generation tools, data fabrication has never been easier to do or harder to detect.

b00ty4breakfast · 7 days ago
maybe the hammer factory should be held responsible for pumping out so many poorly calibrated hammers
SauntSolaire · 7 days ago
The obvious solution in this scenario is.. to just buy a different hammer.

And in the case of AI, either review its output, or simply don't use it. No one has a gun to your head forcing you to use this product (and poorly at that).

It's quite telling that, even in this basic hypothetical, your first instinct is to gesture vaguely in the direction of governmental action, rather than expect any agency at the level of the individual.

venturecruelty · 7 days ago
No, because this would cost tens of jobs and affect someone's profits, which are sacrosanct. Obviously the market wants exploding hammers, or else people wouldn't buy them. I am very smart.
venturecruelty · 7 days ago
"It's not a fentanyl problem, it's a people problem."

"It's not a car infrastructure problem, it's a people problem."

"It's not a food safety problem, it's a people problem."

"It's not a lead paint problem, it's a people problem."

"It's not an asbestos problem, it's a people problem."

"It's not a smoking problem, it's a people problem."

SauntSolaire · 7 days ago
What an absurd set of equivalences to make regarding a scientist's relationship to their own work.

If an engineer provided this line of excuse to me, I wouldn't let them anywhere near a product again - a complete abdication of personal and professional responsibility.

stocksinsmocks · 7 days ago
Trades also have self regulation. You can’t sell plumbing services or build houses without any experience or you get in legal trouble. If your workmanship is poor, you can be disciplined by the board even if the tool was at fault. I think fraudulent publications should be taken at least as seriously as badly installed toilets.

jval43 · 7 days ago
If a scientist just completely "made up" their references 10 years ago, that's a fraudster. Not just dishonesty but outright academic fraud.

If a scientist does it now, they just blame it on AI. But the consequences should remain the same. This is not an honest mistake.

People that do this - even once - should be banned for life. They put their name on the thing. But just like with plagiarism, falsifying data and academic cheating, somehow a large subset of people thinks it's okay to cheat and lie, and another subset gives them chance after chance to misbehave like they're some kind of children. But these are adults and anyone doing this simply lacks morals and will never improve.

And yes, I've published in academia and I've never cheated or plagiarized in my life. That should not be a drawback.

raincole · 7 days ago
Given that we tacitly accepted the replication crisis, we'll definitely tacitly accept this.
psychoslave · 6 days ago
I don't see many crappy power tool providers throwing billions into marketing and product placement to get their tools used everywhere.
calmworm · 7 days ago
I don’t understand. You’re saying even with crappy tools one should be able to do the job the same as with well made tools?
tedd4u · 7 days ago
Three and a half years ago nobody had ever used tools like this. It can't be a legitimate complaint for an author to say, "not my fault my citations are fake it's the fault of these tools" because until recently no such tools were available and the expectation was that all citations are real.

constantcrying · 7 days ago
Absolutely correct. The real issue is that these people can avoid punishment. If you do not care enough about your paper to even verify the existence of citations, then you obviously should not have a job as a scientist.

Taking an academic who does something like that seriously seems impossible. At best he is someone who is neglecting his most basic duties as an academic; at worst he is just a fraudster. In both cases he should be shunned and excluded.

DonHopkins · 7 days ago
Shouldn't there be a black list of people who get caught writing fraudulent papers?
theoldgreybeard · 7 days ago
Probably. Something like that is what I meant by “social consequences”. Perhaps there should be civil or criminal ones for more egregious cases.

nialv7 · 7 days ago
Ah, the "guns don't kill people, people kill people" argument.

I mean sure, but having a tool that made fabrication so much easier has made the problem a lot worse, don't you think?

theoldgreybeard · 7 days ago
Yes I do agree with you that having a tool that gives rocket fuel to a fraud engine should probably be regulated in some fashion.

Tiered licensing, mandatory safety training, and weapon classification by law enforcement works really well for Canada’s gun regime, for example.

left-struck · 7 days ago
It’s like the problem was there all along, all LLMs did was expose it more
theoldgreybeard · 7 days ago
Yes, LLMs didn't create the problem, they just accelerated it to a speed that beggars belief.
criley2 · 7 days ago
https://en.wikipedia.org/wiki/Replication_crisis

Modern science is designed from the top to the bottom to produce bad results. The incentives are all mucked up. It's absolutely not surprising that AI is quickly becoming yet-another factor lowering quality.

RossBencina · 7 days ago
No qualified carpenter expects to use a hammer to drill a hole.
foxfired · 7 days ago
I disagree. When the tool promises to do something, you end up trusting it to do the thing.

When Tesla says their car is self-driving, people trust it to self-drive. Yes, you can blame the user for believing it, but that's exactly what they were promised.

> Why didn't the lawyer who used ChatGPT to draft legal briefs verify the case citations before presenting them to a judge? Why are developers raising issues on projects like cURL using LLMs, but not verifying the generated code before pushing a Pull Request? Why are students using AI to write their essays, yet submitting the result without a single read-through? They are all using LLMs as their time-saving strategy. [0]

It's not laziness, it's the feature we were promised. We can't keep saying everyone is holding it wrong.

[0]: https://idiallo.com/blog/none-of-us-read-the-specs

rolandog · 7 days ago
Very well put. You're promised Artificial Super Intelligence and shown a super cherry-picked promo and instead get an agent that can't hold its drool and needs constant hand-holding... it can't be both things at the same time, so... which is it?
gdulli · 7 days ago
That's like saying guns aren't the problem, the desire to shoot is the problem. Okay, sure, but wanting something like a metal detector requires us to focus on the more tangible aspect that is the gun.
baxtr · 7 days ago
If I gave you a gun would you start shooting people just because you had one?
jgalt212 · 7 days ago
fair enough, but carpenters are not being beat over the head to use new-fangled probabilistic speed squares.

hansmayer · 7 days ago
Scientists who use LLMs to write a paper are crappy scientists indeed. They need to be held accountable, even ostracised by the scientific community. But something is missing from the picture. Why is it that they came up with this idea in the first place? Who could have been peddling the impression (not an outright lie - they are very careful) of LLMs being these almost sentient systems with emergent intelligence, alleviating all of your problems, blah blah blah? Where is the god damn cure for cancer the LLMs were supposed to invent? Who else is it that we need to keep accountable, scrutinised and ostracised for the ever-increasing mountains of AI crap that is flooding not just Internet content but now also penetrating into science, everyday work, daily lives, conversations, etc.? If someone released a tool that, in multiple instances we know of by now, enabled and encouraged people to commit suicide, and we have known since the infamous "plandemic" Facebook trend that the tech bros are more than happy to tolerate worsening societal conditions in the name of platform growth, then who else do we need to keep accountable, scrutinise and ostracise as a society, I wonder?
the8472 · 7 days ago
> Where is the god damn cure for cancer the LLMs were supposed to invent?

Assuming that cure is meant as hyperbole, how about https://www.biorxiv.org/content/10.1101/2025.04.14.648850v3 ? AI models being used for bad purposes doesn't preclude them being used for good purposes.

belter · 7 days ago
"...each of which were missed by 3-5 peer reviewers..."

Its sloppy work all the way down...

rdiddly · 7 days ago
Why not both?
mk89 · 7 days ago
> we are tacitly endorsing it.

We are, in fact, not tacitly but openly endorsing this, due to this AI-everywhere madness. I am so looking forward to when some genius in some bank starts using it to simplify code and suddenly I have 100000000 € in my bank account. :)

thaumasiotes · 7 days ago
> If a scientist uses an LLM to write a paper with fabricated citations - that’s a crappy scientist.

Really? Regardless of whether it's a good paper?

Aurornis · 7 days ago
Citations are a key part of the paper. If the paper isn’t supported by the citations, it’s not a good paper.
zwnow · 7 days ago
How is it a good paper if the info in it cant be trusted lmao
jameshart · 7 days ago
Is the baseline assumption of this work that an erroneous citation is LLM hallucinated?

Did they run the checker across a body of papers before LLMs were available and verify that there were no citations in peer reviewed papers that got authors or titles wrong?

miniwark · 7 days ago
They explain in the article what they consider a proper citation, an erroneous one and a hallucination, in the section "Defining Hallucitations". They also say that they have many false positives, mostly real papers that are not available online.

That said, I am also very curious what result their tool would give for papers from the 2010s and before.

sigmoid10 · 7 days ago
If you look at their examples in the "Defining Hallucitations" section, I'd say those could be 100% human errors. Shortening authors' names, leaving out authors, misattributing authors, misspelling or misremembering the paper title (or having an old preprint title, as titles do change) are all things that I would fully expect to happen to anyone in any field where things ever get published. Modern tools have made the citation process more comfortable, but if you go back to the old days, you'd probably find those kinds of errors everywhere. If you look at the full list of "hallucinations" they claim to have discovered, the only ones I'd not immediately blame on human screwups are the ones where the title and the authors got zero matches for existing papers/people. If you really want to do this kind of analysis correctly, you'd have to take each claim in the text and verify it against the cited article, because I think it would be even more dangerous if you could get claims accepted by simply quoting an existing paper correctly while completely ignoring its content (which would have worked here).
_alternator_ · 7 days ago
Let me second this: a baseline analysis should include papers that were published or reviewed at least 3-4 years ago.

When I was in grad school, I kept a fairly large .bib file that almost certainly had a mistake or two in it. I don’t think any of them ever made it to print, but it’s hard to be 100% sure.

For most journals, they actually partially check your citations as part of the final editing. The citation record is important for journals, and linking with DOIs is fairly common.

currymj · 7 days ago
the papers themselves are publicly available online too. Most of the ones I spot-checked give the extremely strong impression of AI generation.

Not just some hallucinated citations, and not just the writing: in many cases the actual purported research "ideas" seem to be plausible nonsense.

To get a feel for it, you can take some of the topics they write about and ask your favorite LLM to generate a paper. Maybe even throw "Deep Research" mode at it. Perhaps tell it to put it in ICLR latex format. It will look a lot like these.

tokai · 7 days ago
Yeah that is what their tool does.
llm_nerd · 7 days ago
People will commonly hold LLMs as unusable because they make mistakes. So do people. Books have errors. Papers have errors. People have flawed knowledge, often degraded through a conceptual game of telephone.

Exactly as you said, do precisely this to pre-LLM works. There will be an enormous number of errors with utter certainty.

People keep imperfect notes. People are lazy. People sometimes even fabricate. None of this needed LLMs to happen.

pmontra · 7 days ago
Fabricated citations are not errors.

A pre LLM paper with fabricated citations would demonstrate will to cheat by the author.

A post LLM paper with fabricated citations: same thing and if the authors attempt to defend themselves with something like, we trusted the AI, they are sloppy, probably cheaters and not very good at it.

the_af · 7 days ago
LLMs are a force multiplier for this kind of error, though. It's not easy to hallucinate papers out of whole cloth, but LLMs can easily and confidently do it, quote paragraphs that don't exist, and do it tirelessly and at a pace unmatched by humans.

Humans can do all of the above but it costs them more, and they do it more slowly. LLMs generate spam at a much faster rate.

add-sub-mul-div · 7 days ago
Quoting myself from just last night because this comes up every time and doesn't always need a new write-up.

> You also don't need gunpowder to kill someone with projectiles, but gunpowder changed things in important ways. All I ever see are the most specious knee-jerk defenses of AI that immediately fall apart.

nkrisc · 7 days ago
Under what circumstances would a human mistakenly cite a paper which does not exist? I’m having difficulty imagining how someone could mistakenly do that.
chistev · 7 days ago
Last month, I was listening to the Joe Rogan Experience episode with guest Avi Loeb, a theoretical physicist and professor at Harvard University. He complained about the disturbingly increasing rate at which his students are submitting academic papers with references to non-existent scientific literature, so clearly hallucinated by Large Language Models (LLMs). They never even bothered to confirm their references and took the AI's output as gospel.

https://www.rxjourney.net/how-artificial-intelligence-ai-is-...

teddyh · 7 days ago
> Avi Loeb, who is a theoretical physicist and professor at Harvard University

Also a frequent proponent of UFO claims about approaching meteors.

chistev · 7 days ago
Yea, he harped on that a lot during the podcast
mannanj · 7 days ago
Isn't this an underlying symptom of lack of accountability of our greater leadership? They do these things, they act like criminals and thieves, and so the people who follow them get shown examples that it's OK while being told to do otherwise.

"Show bad examples then hit you on the wrist for following my behavior" is like bad parenting.

dandanua · 7 days ago
I don't think they want you to follow their behavior. They do want accountability, but for everyone below them, not for themselves.
venturecruelty · 7 days ago
Talk about a buried lead... Avi Loeb is, first and foremost, a discredited crank.
sen · 7 days ago
That’s implied by the fact he was on the Joe Rogan show.
TaupeRanger · 7 days ago
It's going to be even worse than 50:

> Given that we've only scanned 300 out of 20,000 submissions, we estimate that we will find 100s of hallucinated papers in the coming days.

shusaku · 7 days ago
20,000 submissions to a single conference? That is nuts
ghaff · 7 days ago
Doesn't seem especially out of the norm for a large conference. Call it 10,000 attendees, which is large but not huge. Sure, not everyone attending puts in a session proposal, but others put in multiple. And many submit but, if not accepted, don't attend.

Can't quote exact numbers but when I was on the conference committee for a maybe high four figures attendance conference, we certainly had many thousands of submissions.

zipy124 · 7 days ago
When academics are graded based on the number of papers they publish, this is the result.
analog31 · 7 days ago
This is an interesting article along those lines...

https://www.theguardian.com/technology/2025/dec/06/ai-resear...

thruifgguh585 · 7 days ago
> crushed by an avalanche of submissions fueled by generative AI, paper mills, and publication pressure.

Run of the mill ML jobs these days ask for "papers in NeurIPS ICLR or other Tier-1 conferences".

We're well past Goodhart's law when it comes to publications.

It was already insane in CS - now it's reached asylum levels.

disqard · 7 days ago
You said the quiet part out loud.

Academia has been ripe for disruption for a while now.

The "Rooter" paper came out 20 years ago:

https://www.csail.mit.edu/news/how-fake-paper-generator-tric...

Isamu · 7 days ago
Someone commented here that hallucination is what LLMs do: it's the designed mode of selecting statistically relevant model data built from the training set and then mashing it up into an output. The outcome is something that statistically resembles a real citation.

Creating a real citation is totally doable by a machine, though: it is just a matter of selecting relevant text, looking up the title, authors, pages, etc., and putting that in canonical form. It's just that LLMs are not currently doing the work we ask for, but instead something similar in form that may be good enough.
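
As a rough sketch of that deterministic path, assuming the public Crossref search API and deliberately simplistic matching and formatting (the query title and BibTeX key are just placeholders):

```python
# Minimal sketch: look up a paper by title via Crossref's public search API and
# emit a canonical BibTeX-style entry from the registered metadata, or nothing
# at all if no registered work matches (rather than inventing one).
import requests

def lookup_citation(title: str):
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.title": title, "rows": 1},
        timeout=10,
    )
    items = resp.json()["message"]["items"]
    if not items:
        return None  # no registered work found; do not fabricate a citation
    work = items[0]
    authors = " and ".join(
        f"{a.get('family', '')}, {a.get('given', '')}" for a in work.get("author", [])
    )
    year = work.get("issued", {}).get("date-parts", [[None]])[0][0]
    return (
        "@article{placeholder_key,\n"
        f"  title  = {{{(work.get('title') or [''])[0]}}},\n"
        f"  author = {{{authors}}},\n"
        f"  year   = {{{year}}},\n"
        f"  doi    = {{{work.get('DOI', '')}}},\n"
        "}"
    )

print(lookup_citation("Attention Is All You Need") or "no match found")
```

The point is that every field in the output comes from a registry lookup; if the lookup fails, the honest answer is "no citation", not a plausible-looking guess.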

make3 · 6 days ago
This interpretation would have been OK for older-generation models without search tools enabled and without reliable tool use and reasoning. Modern LLMs can actually look up the existence of papers with web search, and with reasoning, one can definitely get reasonable results by requiring the model to double-check that everything actually exists.
