This is a bit surprising, but hardly the biggest problem. I've posted about this before, but academia has to get away from the infantilism of using KPIs to judge creative intellectual work.
Part of the problem is that people seem to want an objective evaluation of a piece of research, and to measure the value for taxpayer money. Well, you can't get an objective evaluation. It's all subjective. And "impact" as a quantifiable entity is, in general, nonsense; for one thing, the timescales prohibit measuring it at the point where the measurement might be useful.
The solution is to use management. Lots of people object here and say "but nepotism, favouritism", and yep, that's a problem, but it is less of a problem than the decline of western universities. You can circumvent it somewhat by rotation, by involving external figures, by a hierarchy that ensures people are not colluding, but ultimately you just have to trust people and accept some wastage.
People aren’t in academia for the money. It’s a vocation. You’re not going to have many people milking the system. Things went pretty well before the metric culture invaded the academy. They can go well again.
> People aren’t in academia for the money. It’s a vocation. You’re not going to have many people milking the system.
Before the KPIs were introduced, a majority of Polish science was basically people milking the system and doing barely any (valuable) research. It was seen as an easy, safe, and OK-paying job where the only major hassle is having to teach the students. You often needed connections to get in. It was partially like that because of the communist legacy, where playing ball with the communist party was the most important merit for promotion, which, over the course of 45 years (the span of communism in Poland), filled academia's management ranks with conformist mediocrities.
Now, after a series of major reforms, there's a ton of KPIs, and people are doing plenty of makework research to collect the required points, but still little valuable work gets done. Also, people interested in doing genuine science, who would have done it under the old system, are now discouraged from joining academia, because in the new system they're expected to game the points system rather than do real work.
What is the lesson from this? Creating institutionalized science is hard? It requires a long tradition and scientific cultural standards and can't just be wished into place by bureaucrats? Also, perhaps it's good to be doing the science for some purpose, which in the US case is often DoD grants, where the military expects some practical application. This application may be extremely distant, vague, and uncertain (they fund pure math research!), but still, they're the client and they expect results. Whereas the (unstated) goal of science in Poland seems to be just to increase the prestige of Polish science and its universities by getting papers into prestigious journals, while the actual science being done doesn't matter at all - basically state-level navel gazing.
This is a good cautionary lesson. Problems like collusion rings in computer science are more serious than they appear. If left unchecked for a decade or two, the cheaters will rise to the top and take the positions of power in the field, and then it's going to be almost impossible to fix the system.
Relying on the altruistic tendencies of people in academia is not adequate. Everyone starts out in academics as a school child, and gets filtered out or filters themselves out pursuing other things. Those who remain will be the ones who love to learn and teach, those who just cannot accept loss/failure, and, sadly, those who are afraid of change. The more competitive the field becomes and the harder it is to succeed, the more we select for the hyper-competitive or fearful over the altruistic.
I've long wondered if the answer for this and other issues like tax evasion wouldn't be to change the metrics often.
If you start cheating the metrics, or optimizing a lot towards them, it becomes counter-productive when they change. As such, the most efficient way forward would be to work without trying to optimize for a temporary metric.
On the flip side, it would be troublesome to get people to adapt to different, complex forms of evaluation each time.
What first gave me the idea was the concept of "greasing" headers (submitting some with random values) for future HTTP protocols, to combat "ossification", where middleboxes start to meddle with the headers and become obsolete when they don't recognize new fields, instead of just transmitting them.
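To make that concrete: the same trick could in principle be applied at the application layer by routinely attaching a meaningless, randomly named header, so anything in the path that chokes on unknown fields gets flushed out immediately rather than years later. A minimal Python sketch, using only the standard library and a made-up `X-Grease-*` naming scheme (the real TLS/QUIC GREASE mechanism reserves specific codepoints rather than inventing header names):

```python
import secrets
import urllib.request

def greased_request(url: str) -> urllib.request.Request:
    """Build a request carrying one random, meaningless header.

    The header name and value are throwaway; the point is that servers and
    middleboxes must tolerate fields they don't recognize, which keeps the
    extension point from ossifying.
    """
    req = urllib.request.Request(url)
    grease_name = f"X-Grease-{secrets.token_hex(2)}"   # e.g. X-Grease-a1b2 (made-up scheme)
    req.add_header(grease_name, secrets.token_hex(8))  # random opaque value
    return req

# Any intermediary that drops or rejects this request is already meddling
# with headers it doesn't understand.
resp = urllib.request.urlopen(greased_request("https://example.org/"))
print(resp.status)
```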
I recognize the symptoms you describe, but I find the picture you painted a bit too pessimistic, given that in the field that I'm familiar with (theoretical CS), groups in Poland have been among the strongest and most visible in Europe in the past 10 years.
Wow, I couldn't disagree more. I can assure you that if you mean by "management" subjective evaluation by local administrative staff or "higher ups" like full professors, then it's the worst system you could possibly have. It invariably leads to corruption, favouritism, and brown nosing. Funding authorities have been fighting these rotten structures for the past 20 years where I work (Southern Europe), but they cannot do too much because the university system is not under their control, only the funding for national research projects.
There is only one reasonable way to evaluate research and researchers: evaluate the content of their work and publications via external evaluation panels, and tell these panels explicitly that they should not base their assessments solely on indicator counting, but on the overall merits, originality, and prospects of the research according to their subjective opinion. Metrics shouldn't even be used as a tie-breaker; they should only ever be used as weak indicators, and this must be explicitly stated in the guidelines.
In addition, you need a few other guidelines and laws. For example, it must be prohibited for someone to become a postdoc at the same place where they obtained their Ph.D. We have people who study at university X, do their exams at university X, do their Ph.D. at university X under the same professors they always knew, then become postdocs working for their former supervisors and being exploited by them (teaching the same bad seminar their former supervisor has taught for the past 20 years), and then get tenure in a rigged call. And the worst thing about it is that they feel entitled to all of this.
You've got to break this vicious cycle, but with your suggestion of using a methodology that worked in the 1950s (with an order of magnitude fewer candidates), this could never be achieved.
I couldn't agree more with the former and less with the latter.
I think external evaluation panels (in particular, from a third country) are the way to go. We already have good examples of this, such as the ERC grant panels. The ERC uses exactly the strategy you mention and it has an impeccable reputation: I know many people who applied, with or without success, but I know no one who felt treated unfairly.
But I'm against blanket rules prohibiting postdocs or positions at the same place of the PhD, at least from my Southern European (Spanish) point of view. This is often touted in Southern European countries because in the US no one does that so it must be evil, so clearly we should ban it to be more like Americans and see if this makes our scientific productivity closer to theirs. But European (especially Southern European) culture is not US culture. People want to settle down, be near their loved ones, and there's nothing intrinsically bad about that that should be punished. Plus, the job market is much less dynamic so even for those who don't mind bumping around, it can be hard to reconcile with a possible spouse who has a career too. And finally, if you push people in this region to move, most of the time the result will be that they end up in a Northern European country (or the US, Canada, etc.) where they make 3x or 4x more, and never come back - once you have experienced a much better salary it's hard to justify returning, I have seen it plenty of times.
Bring on the external evaluation panels, and then there will be no need for any measure forcing people to move, which would reduce inclusivity and thus the talent pool.
But optimizing for originality is one of the reasons for the current mess, so why is this an appropriate criterion? Reproducing research is very valuable and a way to combat fake results.
But many less well-off EU countries have only one good uni, and many people don't like moving away. That would mean forcing people to move out of their homeland just for tenure. I'd rather we stop having lifetime professors and limit tenure to 10-15 years.
> We have people who study at university X, do their exams at university X, do their Ph.D. at university X under the same professors they always knew, then become postdocs working for their former supervisors and being exploited by them
Independent of every other point, this is, in my view, a real problem. It happens so often, and I don't understand why there's often no law requiring that at least a postdoc at another institution has to be done.
I'd say the problem is that there are simply too many people wanting a position compared to how many open positions there are. In my college, PhDs are practically handed out in return for working on projects which generate revenue for the uni plus doing the teaching assistant work. After that, everyone goes back to industry because it's practically impossible to get a professor title. And no one ever loses their job, so there are more professors barely doing anything than professors investing their lives into it. They should make it so that you can only be a professor for a maximum of 15 years, then you have to go: use that expertise in industry and let someone else try being a professor.
Not sure what you mean by that, but KPIs are generally put in place by high-level management. Or do you want more micromanagement?
Either way, I think the solution is not more control, quite the opposite. I think the solution is just to remove the extrinsic incentives.
Some people say UBI will cause people to do nothing, and that is probably true, but the flip side is that the per-person output of the remaining people will likely be many times higher in both volume and quality (with the total volume much lower but of higher quality), because their energy won't be completely destroyed by all the busy work needed to show they are working.
(Speaking about Spain.) The KPIs are there because OFFICIALLY (and this is strictly so) you are not even allowed to get a tenured position without an absurd number of (in Maths) JCR papers IN THE FIRST QUARTILE.
This is so stupid it is not even funny but how can you fight that when your PhD students depend on those metrics?
With respect, you're saying metrics are used because they are used. Officially or semi-officially, doesn't matter. There needs to be a collective and individual process of disowning the metrics.
Personally I think we need to do away with tenure. Academic institutions should just hire good researchers as employees and do good research.
There was a time in history when tenure made sense, but today the tenure track process forces a lot of people to go after low-hanging fruit that has a high probability of being accepted for publication, instead of trying things that are meaningful to try but may fail.
> Things went pretty well before the metric culture invaded the academy. They can go well again.
That won't happen, mostly because employers have outsourced education and vetting (in the form of requiring bachelor's/master's degrees) to universities, and the associated costs to governments and/or students who pay tuition, instead of the old-style vocational training/apprenticeship system where the employers had to pay.
Want to restore academia to only those actually interested in science? Make employers pay a substantial tax on jobs requiring academic degrees.
The issue is that everyone's subjective measure varies a lot. The NIPS experiment showed that which papers get selected for publication by peer review is fairly random. Now imagine that much randomness applied to your career progression. And it would be even worse, because peer reviewers at least work on the same niche topics, whereas a university panel judging the work would not, so its review would be even more subjective and would change with time.
In short, I think it is clear that neither citations nor the journal brand is the best proxy for worthiness, but the system you are proposing is worse while still relying on subjective judgement.
The problem is not about judging creative intellectual work. We already know how to do that, and academia usually manages to do it just fine.
It's not really about judging people either. When it comes to choosing which people to reward with jobs, promotions, grants, and prizes, we already know the solution: expert panels that spend nontrivial time with each application. Sometimes there are political or administrative reasons that override academic excellence, but in general academia has figured out how to evaluate shortlisted people.
The real problem is shortlisting people. For example, when a reputable university has an open faculty position, it typically gets tens of great applications. Because the people evaluating the applications are busy professors, they need a way of throwing away most of the applications with minimal effort. From the applicant's perspective, this means you only get a chance if the first impression makes you stand out among other great applicants. And that's why it matters that you publish in prestigious conferences/journals, come from a prestigious university, and have prestigious PhD/postdoc supervisors.
> academia has to get away from the infantilism of using KPIs
Well, science doesn't care that much for KPIs, per se. It's more that the managers want numbers to steer by.
In academia, getting promoted means more management tasks. So higher-up academics have been indoctrinated to want numbers. It is the scourge of management.
I agree overall with what you wrote, but have to comment here, because I think this is already not the case in many settings. I can only speak for the US, but in my experience with some other places overseas similar issues are developing.
There are many legitimate hypotheses for why this is the case, but in general at many universities, as far as climbing the academic ladder is concerned, publication metrics are no longer relevant. That is, some baseline is required, but beyond that, most of the focus is on money and grant sizes. I've been in promotion meetings discussing junior faculty that are not publishing and this is brushed aside because they have large grants. I've also repeatedly heard sentiments to the effect of "papers are a dime a dozen, grant dollars distinguish good from bad research."
Again, there are lots of reasonable opinions about this, but I've come to a place where I've decided this is incentivizing corruption. Good research is only weakly correlated with its grant reimbursement, and regardless, it's led to a focus on something only weakly associated with research quality. Discussions with university staff where you're openly pressured to artificially inflate costs to bring in more indirect funds should raise questions. Just as it's apparent that incentivizing (relatively) superficial bibliometric indices like publication count or h-factors leads to superficial science, incentivizing research through grant money has the same effect, just in a different way.
So yes, going into academics is not the way to make money if that's what you want. However, I think nowadays in the US, it's very much all about the money for large segments, who are milking the system right now at this moment.
Also, in theory, yes, management is the solution, but really management is how we've gotten into this mess. Good management, yes; bad management, no. But how do you ensure the former?
Fixing the mess academics has slid into (in my perception, maybe everything really is fine) will require a lot of changes that will be controversial and painful to many, and I don't think there's a single magic-bullet cure. Eliminating indirect funds is probably one thing, funding research through different mechanisms is another, maybe lotteries, probably opening up grant review processes to the general public. Maybe dissociating scientific research from the university system even more than has been the case is also necessary. Maybe incentivizing a change in university administration structures. Probably all of the above, plus a lot else.
Getting things to go well again is achievable in theory, but how to get there is less clear given the amount of change involved.
One way forward could be to lower the bar for publications.
Once it's no longer about being in the esteemed and scarce "10%", they won't bother because they don't need to. Imagine a process where the only criteria are technical soundness and novelty, and as long as minimal standards are met, it's a "go". Call it the "ArXiv + quality check" model.
Neither formal acceptance for publication nor citation numbers truly mark scientific excellence; perhaps winning a "test of time" award does, or appearing in a textbook 10 years later.
I've been reviewing occasionally since ~1995, regularly since ~2004, and I've never heard of collusion rings happening in my sub-area of CS (ML, IR, NLP). I have caught people submitting to multiple conferences without disclosing it. Ignoring relevant past work is common, more often out of blissful ignorance, and occasionally likely with full intent. I'm not saying I doubt the report, but I suspect the bigger problem that CS has is a large percentage of poor-quality work that couldn't be replicated.
BTW, the most blatant thing I've heard of (from a contact complaining about it on LinkedIn) is someone having their very own core paper from their PhD thesis plagiarised - submitted to another conference (again) but with different author names on it... and they even cited the real author's PhD thesis!
You can't help but be entertained by the amount of plagiarism that exists. A while back I found out that one of my papers [1] was being sold as a final-year thesis via... a video ad on YouTube [2].
What you propose is actually how it used to function until around the mid-20th century. Journals used to be very permissive when accepting papers with the editor only doing a cursory check to make sure the paper wasn't total garbage.
That’s how it used to be. The reason behind the shift is the rapid increase in the amount of research output in the past decades. The reason that “arXiv + quality check” won’t work is that the amount of research funding and permanent positions (related) did not increase accordingly. (Think of all the people producing science for low wages as Phds and postdocs and then quitting.) Right now the burden of the (unfair?) selection is mostly on the publishers and their prestigious journals/conferences, which then funding agencies/institutions take into account when funding research/hiring people. If we switch “arXiv + quality check”, that burden will just move to the funding agencies/institutions, but it won’t solve the underlying fundamental problem.
In the US, both academic hiring and proposal reviews are already laborious jobs where committee members are supposed to investigate the candidate/proposal deeply on their merits and potential. Being unable to take shortcuts shouldn't be a new burden for them.
I think the recent wave of low-impact submissions and co-authorship rings is the result of developing countries trying to simplify that process and tying hiring/promotion/pay directly to publication count and related easy metrics.
> Imagine a process where the only criteria are technical soundness and novelty, and as long as minimal standards are met, it's a "go". Call it the "ArXiv + quality check" model.
One possible issue is that researchers usually need to justify their research to somebody who's not in their field. Conferences are one way to do this. So are citation counts. Both are highly imperfect, but outsiders typically want some signal that doesn't require being an expert in a person's chosen field. The "Arxiv + quality check" model doesn't seem to provide this.
> I suspect the bigger problem that CS has is a large percentage of poor-quality work that couldn't be replicated.
As a sort of ML researcher for several years, I agree.
As a fellow ML researcher, I want to add that the lack of code accompanying the publication makes the problem worse. $BIGGROUP gets a paper whose core contribution is a library published, and yet they haven't released the code 6 months after the conference, effectively claiming credit for something unverifiable.
Even novelty is a dubious standard that is often used to discard replications and meta-analyses (not novel), empirical work (no novel theory), reformulations and simplifications of existing research (not novel) and null findings (nothing was found so not novel).
Well, then you need something else to use as a basis for tenure decisions.
Also, a "best 10%" conference/journal is valuable -- I have only so much time in the day. There are a few conferences for which it is always a good use of my time to read all the abstracts, plus one or two of the papers that seem most interesting. I can't do that for every conference, or even most conferences, in my area.
So the "best 10%" conference/journal is valuable to the consumers, and the prestige is valuable to the producers. Therefore I think such a thing would simply re-emerge if you somehow killed it.
I mean, there are several journals that publish based only on "technical soundness"; Nature's Scientific Reports is probably the most well known one. I don't think it helps; in fact, IMO it's detrimental, because journals would be even more flooded with publications.
Also, in the article they talk about the conference getting 10,000 submissions; that quickly becomes unmanageable for a conference to handle.
In deep learning a lot of papers end up only on arXiv, and the authors don't bother (or need) to send them anywhere else. Or, even more, some seminal papers end up only on the author's page (e.g. the one introducing GPT-2: https://openai.com/blog/better-language-models/). I am not a big fan of the latter, as it gives problems with long-term availability.
Of course, only corporate researchers can rely on not publishing in established journals - as their salary and position does not come from "publish or perish" metrics. Ironically, it means that there is more academic freedom in private companies than in academia.
Bell Labs in its heyday was still reputed to be publish-or-perish, or patent-or-perish, without any tenure system either. At my university, we had a former Bell Labs researcher who said he had been pushed out. As a corporate researcher, there is still high pressure to output something.
At its best, that's what Scientific Reports is (in a number of subfields) - a place to put research that is reviewed for technical correctness. There may be debate about how well it succeeds, but I think it's useful. I would prefer to have more papers out where people worked on something, found it wasn't necessarily exciting, but it was solid work and it saves other people time.
It has long seemed like there was soft collusion in academics anyway. There is plenty of complaining and suspicion about how bad and rigged reviews can sometimes seem to be. There are plenty of rumors of PIs who influence what gets published through back channels. But even in the absence of outright nefarious activity, there's a reinforcement cycle where bigger labs influence what papers get in, receive more grant money, and get better at influencing what papers get in. Even gently guiding what topics are acceptable, in the long term, shuts out newcomers. Just about any graduate student, by the time they graduate, has reviewed papers in a "double blind" system where they could reliably identify the authors; there are always tell-tale markers and styles. It's really hard to find true anonymity.
I've been thinking on and off about a review system that might improve on things. I'm imagining perhaps: reviewers and authors both get to see who they are, and conflicts of interest can be called out by other people after review and before publication; reviewers are chosen at random, not allowed to bid; reviewers are sent a series of pairs of papers and asked to choose which one they'd rather see, with scores and ultimately publication decided by rank-choice vote rather than reviewer assignment; comments on paper improvement would be completely optional. Would this be better or worse than what we have? Would it deter explicit collusion?
You can’t be as candid in your criticisms if the identity of the reviewers is made available to the authors. This is the main reason they’re made anonymous. If you think there can be a conflict of interest, the main mechanism in most fields is to point who shouldn’t be a reviewer for your paper. If you forget someone and they are asked, it’s also their responsibility to recuse themselves. Of course it’s based on trust. Trust is still the foundation of academia.
I’d agree with all those thoughts with respect to the system we have today. Yes, the goal of de-anonymizing reviews would be to eliminate trust in reviewers as a requirement for achieving impartial and fair reviewing (given the many ways we’ve already seen trust failure), and hopefully the side effect would be to increase overall trust in the system. Making reviewers part of the public record would also allow exposure of bad behavior and explicit collusion, without having to go to great lengths to prove it like they did in the article. I’m not entirely sure that the level of candidness that anonymity invites is necessary or helpful for a healthy and robust review system. But, note that my suggestion above doesn’t depend on critical feedback at all. One of the things I’ve found problematic in today’s review system is that reviewers assign absolute scores to a paper, but every reviewer has their own notion of what is good and bad. It’s common for an entire group of reviewers in one sub-topic to give average lower scores than a group in another sub-topic, meaning that different sections of a given journal or conference have different acceptance rates. My suggestion for rank choice voting is partly to ensure that all reviewers are working in the same units, and are all weighted equally. (But I assume there is a possibility of unintended consequences and/or opportunities to game the system with what I suggested, so I’m curious if people see ways that might happen.)
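To make the rank-choice part concrete, here is a minimal sketch of one way the pairwise votes could be aggregated - a simple Copeland-style count of pairwise wins. The paper IDs and acceptance quota are invented; this is only an illustration of the voting idea, not a worked-out conference system:

```python
from collections import Counter
from typing import Iterable

def rank_by_pairwise_wins(votes: Iterable[tuple[str, str]]) -> list[tuple[str, int]]:
    """Aggregate reviewer choices into a ranking.

    Each vote is (winner, loser): the paper the reviewer would rather see,
    paired against the one they passed on. Papers are ranked by how many
    pairwise comparisons they won, so every reviewer's judgement is expressed
    in the same "units" regardless of how harsh or lenient they are on an
    absolute scale.
    """
    wins = Counter()
    for winner, loser in votes:
        wins[winner] += 1
        wins[loser] += 0   # ensure losers appear in the tally too
    return wins.most_common()

# Toy example with made-up paper IDs; accept the top 2.
votes = [("P1", "P3"), ("P2", "P3"), ("P1", "P2"), ("P4", "P3"), ("P1", "P4")]
ranking = rank_by_pairwise_wins(votes)
accepted = [paper for paper, _ in ranking[:2]]
print(ranking)   # [('P1', 3), ('P2', 1), ('P4', 1), ('P3', 0)]
print(accepted)  # ['P1', 'P2'] (a tie at the cutoff would need an explicit tie-break rule)
```

A real system would still need tie-breaking and a way to assign enough overlapping pairs per paper, but relative judgements at least sidestep the score-calibration problem described above.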
The part I found interesting was that the author notes that it is well-known that the quality of the work has little to do with its odds of acceptance even when the system is 'functioning' as designed. Is it cheating to game a system that is already fundamentally broken? If the quality of your work is inadequate to secure publication and your career depends on it, it wouldn't be very hard to convince yourself that you can't cheat a rigged game. What 'integrity' is actually being threatened? Perhaps some of the energy devoted to identifying these collusion rings would be better spent developing a review process that is at least somewhat biased in favour of good research rather than the density of the authors' professional network. Or at least in mitigating the poisonous 'publish or perish' rules that lead to this sort of phenomenon.
> the author notes that it is well-known that the quality of the work has little to do with its odds of acceptance even when the system is 'functioning' as designed
I think this is a little more pessimistic than what the piece says. The NeurIPS (then-NIPS) experiment said that about 60% of papers accepted by one PC got rejected by the other. That doesn't actually mean "the quality of the work has little to do with its odds of acceptance". It may just be that a paper has to cross a quality threshold, and once it's past that, the outcome has a lot more variation.
My personal take on NeurIPS specifically is that there's a fraction of bad papers, maybe 40%, that probably shouldn't and won't get in. Then there's a minority of very nice papers that probably should and will get in, maybe 5-10%. And then there are a bunch of middling papers where a lot of it is luck and drawing friendly reviewers. But these aren't bad papers, and you can't really just churn them out, they're just not very good papers.
Seems it's also rational for an opportunistic player of the publications game to split their work into the maximal number of marginally acceptable publications, which would result in a high rejection %.
I agree with one of the other posters: the result of the study does not show that acceptance of the work is unrelated to its quality. It's more a consequence of the distribution of the "quality".
Essentially, if you look at review scores for a conference with, say, a 40% accept rate (and this is quite similar across fields, I'd imagine), you find there are 10-20% (depending on the conference) of papers that are clear accepts for all reviewers, and probably around 10-15% of papers which are very clear rejects; the rest of the papers have very similar scores, so the cut-off becomes quite arbitrary (and depends on luck as well). This is actually well known for grant applications and a sign that there is likely not enough money in the system.
The bigger issue here, and what is threatening the integrity of the research, is blind reliance on conference publication count as a proxy for research quality.
Maybe it's time to move on from some of these conferences, and focus on interactions that maximize sharing research findings. I know that is unrealistic, but like every other metric, conference acceptance ceases to have value once the metric itself is what people care about.
I often go back to this talk from Michael Stonebraker about, in his terms, the diarrhea of papers [1]. It's difficult to justify time towards anything that doesn't lead to a near-term publication or another line on your CV.
A really awesome talk - it's nowhere close to my field of research, but it is oh so relevant anyways, thank you for linking it and I'd recommend everyone interested in this larger topic to listen to what Stonebraker had to say on this.
And one of the suggested solutions (at ~22:00) seems like it would work if it can be adopted - essentially, have top university administrators evaluate the top x papers for hiring/tenure decisions and ignore everything beyond that number. What you measure is what you get; if you measure count, you get a deluge of 'least publishable units', and if you measure your top 3 papers, then everyone will focus on quality instead of quantity. A counterargument probably is that it's easier for administrators to measure quantity in a way that seems objective and resistant to arguments or appeals, and it's far harder to objectively compare quality, especially if the candidates are from different subfields of research.
I made a similar comment in another thread here, but the problem is that there will always be non-experts who want to assess the quality of research being done in different fields. These might be deans, department heads, VPs of R+D labs, and so on. Ideally, all research would have clear measurable impact, but a lot of good research doesn't, at least not immediately, so there needs to be some way for non-experts to measure quality. Conference/journal publications and citation counts are highly imperfect solutions, but I'm not sure what the better candidates are.
Assessment can never be done by non-experts. They do not understand the subject matter and cannot even judge which journals are good. It just makes no sense.
When someone is hired (whether for tenure or time-limited), their research as a whole has to be evaluated by external, independent committees who take into account the content of the research and do not base their judgment on indicators only. There is no shortcut around that.
The biggest annoyance nowadays is the decision-makers' insistence on "excellence", though. You cannot have only excellent people everywhere, as per the definition of "excellent", yet this demand is in every fucking guideline for postdoc and tenure-track positions. It's absolutely ridiculous.
The better candidate is spending more on the evaluation.
E.g., for my company's twice-yearly evaluation, everybody writes up a short report on the most impactful stuff they've done, including evidence; this is evaluated by their manager to give a score, and then there's a series of group meetings between managers to make sure that the scores are calibrated, including looking at all the metrics that can be dug up and comparing against our written role descriptions for different levels. It takes a lot of time but creates fair scores.
This is extremely labor intensive, but that's the thing: to create anything resembling a fair evaluation of a large group of people who do a large set of different things, you need to do things that are labor intensive. Using a simple set of metrics doesn't cut it.
The department head at least should be something of an expert, and be able to consult with more specialized experts. Then that department head can pass on an evaluation to management.
The playing field where one can share research findings is far more uneven, more tilted in favor of the prestigious labs and individuals, than the current field of conferences and the blind review process.
Findings and papers from well-known individuals (read: twitter accounts) do get far more attention and more citations. Of course, one can argue that, broadly, well-known labs and individuals are well known because of their tendency to do great work and write better papers. And that's true. However, the above still holds, in my experience as a PhD student in ML. Anecdotally, I have seen instances where a less interesting paper from a renowned lab got more attention and eventually more citations than a better paper accepted at the same venue by a less renowned lab on the same topic.
I completely agree with you and this is a significant issue, because it makes it hard for someone not from one of the established places to get research attention.
I would also argue that with the increased importance of the (mostly) commercial "high-impact" journals this has become worse. I know that some of the professional (non-expert) editors of these journals specifically look at the citation counts of authors before accepting to send them out to review, because their main aim is to get people reading the articles, not necessarily good science.
- Conferences and participants have increased exponentially over the last few years.
- Students in AI/ML graduate programs have similarly increased.
- Huge numbers of companies are hiring AI/ML graduates.
- What constitutes a true advance in AI/ML is difficult to determine. Deep learning is a fairly ad hoc method - a tweak to an existing method that lets you exceed SOTA (state of the art) is the simplest way to get attention. But that's not really a method, and so you're up against many others pushing similar tweaks too.
This environment in particular seems like it would exacerbate all the ordinary pressures to cheat found in the academic environment. It has something of the quality of the last blow-out of a bubble. And the thing is that even with deep learning being real, the dynamics seem fated to push things to the point where expectations are sufficiently far past reality that the whole thing collapses, for a bit.
I'm not in academia, but in the grand tradition of "why don't you just..." solutions crossed with "technical solutions to people problems":
Would it help at all if rather than participants reviewing 3 papers, each reviewed 2 papers and validated the review of 3 more papers?
This is computer science here, with things like the set NP whose defining characteristic is that it's easier to check a solution than generate it.
I'm imagining having some standard that reviews are held to in order to make them validatable. When validating a review, you are just confirming that the issues brought up are reasonable. Same for the compliments.
Sure, it's not perfect because the validators wouldn't dive in as deep or have as much context as the reviewers, but sitting here in my obsidian tower of industry, it seems like it would at least make collusion attacks more difficult. Hopefully without increasing the already heavy load on reviewers.
(It very much seems like an incomplete solution -- we only have to look at politics and regulatory capture to see how far wrong things can go, in ways immune to straightforward interventions. Really, you need to tear down as many of the obstacles to a culture of trust as you can. Taping over the holes in a leaking bucket doesn't work for long.)
That would only work if the review decisions could be expected to be reasonably consistent from person to person. But, from the article:
> In a well-publicized case in 2014, organizers of the Neural Information Processing Systems Conference formed two independent program committees and had 10% of submissions reviewed by both. The result was that almost 60% of papers accepted by one program committee were rejected by the other, suggesting that the fate of many papers is determined by the specifics of the reviewers selected and not just the inherent value of the work itself.
With this much demonstrated discrepancy between two sets of reviewers, it’s hard to believe that adding a validation step would produce a consistent improvement. How can people be expected to find improperly accepted papers when they have less than 50% agreement on the acceptance of good-faith submissions?
Honestly I think this seeming randomness in acceptance is at the heart of why people might think cheating is acceptable. If the process is not reliable, why bother submitting to it?
The described result is not in conflict with decisions being reasonably consistent.
Suppose that two reviewers independently rank papers 80% on quality and 20% on chance factors. With good odds, the two reviewers will agree with each other on the relative rankings of any given pair of papers. But their lists of the top 10% of papers will largely not be in agreement with each other.
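A quick Monte Carlo sketch of that intuition. The parameters are my own reading of the comment: I take "80% on quality" to mean quality explains 80% of the variance of each committee's score, and use a 10% acceptance cut; none of this is calibrated to NeurIPS:

```python
import math
import random

random.seed(0)
N, TOP = 2000, 200   # 2000 submissions, each committee "accepts" its top 10%

# One reading of "80% quality, 20% chance": quality explains 80% of the
# variance of each committee's score, independent noise the remaining 20%.
quality = [random.gauss(0, 1) for _ in range(N)]

def committee_scores():
    return [math.sqrt(0.8) * q + math.sqrt(0.2) * random.gauss(0, 1) for q in quality]

score_a, score_b = committee_scores(), committee_scores()

def accepted(scores):
    return set(sorted(range(N), key=scores.__getitem__, reverse=True)[:TOP])

overlap = len(accepted(score_a) & accepted(score_b)) / TOP

# Pairwise agreement: do the committees order a random pair of papers the same way?
pairs = [random.sample(range(N), 2) for _ in range(20_000)]
agreement = sum((score_a[i] > score_a[j]) == (score_b[i] > score_b[j])
                for i, j in pairs) / len(pairs)

print(f"pairwise agreement: {agreement:.0%}   overlap of accepted sets: {overlap:.0%}")
# With these made-up parameters the committees agree on most individual pairs,
# yet a sizeable fraction of each committee's "top 10%" is missing from the other's.
```

How large that gap is depends entirely on the assumed noise share, which is the point: high pair-level consistency and a sharp acceptance cutoff can coexist with noticeably different accepted sets.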
> The result was that almost 60% of papers accepted by one program committee were rejected by the other, suggesting that the fate of many papers is determined by the specifics of the reviewers selected and not just the inherent value of the work itself.
I referenced Kahneman's latest book, Noise, above, but this is exactly the problem he focuses on. There are solutions.
This is in aggregate. If you look at great papers, they are almost surely accepted. Bad papers are almost surely rejected. But those in the middle are a coin toss.
The problem with Academic collusion rings is that eventually the ring can hold such influence that all "major" research comes from the ring. As ring members benefit there is no reason for them to switch to an alternate system. If one wants to progress in their chosen field then there is no benefit to publishing in an unread source.
I think this is a really interesting idea. It does seem more effective for validating that reviewers are not torching the work of others — is there a nice way to validate that reviewers are not giving their friends a pass, short of reviewing the manuscript and the review?
Back in grad school a colleague of mine spent nine months on an experiment in a new field and submitted it as a paper to a quality journal. Six months later, the paper was rejected for lack of novelty: One of the reviewers had found a paper with a figure-by-figure duplication of the same experiment -- published on arXiv a week before the rejection decision. Both the managing editor and the author on the arXiv paper were from Chinese universities.
We wrote a rebuttal and submitted a complaint to the journal editor, but no justice was forthcoming. My colleague switched research directions to avoid the collusion and now takes pains not to submit papers without a coauthor who has enough clout in the field to deter blatant research theft. He also avoids dealing with editors from institutions in China.
He ended up graduating two years later than planned.
Reading about that makes me very angry. I long to see some justice for your colleague. There are lots of good suggestions here about using arXiv as a guard; I hope many young academics coming through read this post and learn from it.
I wish I had something more substantive to add, but all I can say is I hope your colleague knows that even if the swindlers and plagiarizers take our research, years of our lives, etc., be they in China or wherever else in the whole wide world, they ain't able to take away the heart of a real one. They know the research ain't theirs, and they know they depend on real producers to be able to commit their crimes, or do anything actually useful, and not the other way around. These plagiarists are parasites, and one day we'll be free of them.
As long as metrics like impact factor and citation count determine the trajectories of academic careers, plagiarism, collusion, and other forms of academic dishonesty will not go away.
The fact of the matter is that most academics do not have time to dig into whether someone's work is intellectually dishonest. Being dishonest has a huge payoff as long as you don't cause a scandal.
Anyone who plans to stay in academia should understand that the metagame has changed over the last 50 years, largely because human attention has not scaled with the rate at which academics are exposed to and expected to assimilate new information. There is a larger payoff to exploiting the lack of attention than to earnestly carrying out some meaningful research program. At the very least, do both. By no means do only the latter.
The other way to mitigate this is to put your work on ArXiv before submitting it to conferences/journals, which is becoming more and more acceptable in at least some CS sub-fields.
This is becoming more popular in another field (non-CS) I follow, but for different reasons:
Some have been posting their papers to pre-print servers and then skipping straight to commercialization attempts. This is especially concerning in the health and fitness world, where some supplement makers and fitness gurus are uploading documents to pre-print servers to give the illusion of being published authors. Casual observers may not be able to tell the difference between published, peer-reviewed papers and some random document uploaded that has a DOI on a pre-print server.
This doesn’t carry much weight in academia, but it can fool non-academic observers. I’m not sure if or how it will translate to CS papers, but I wouldn’t be surprised if skipping peer review becomes more common as the pace of publishing increases.
Is it possible to upload to arXiv but keep it private, while still having the upload date recorded? Then if something like this happens, you can make it public and show that yours was uploaded first. Obviously arXiv itself would need to be a trusted middleman and guarantee that the system isn't being cheated.
The journal already had documented evidence that the legitimate paper was submitted to the journal before the plagiarism was submitted to arXiv. I don't know why they would trust an arXiv timestamp more than their own.
Can you give some more details? What field was that in? Can you maybe point to the arxiv paper? What was the journal your colleague submitted to?
I'm trying to understand your story because many details are very different from the things I know about scientific publishing.
Your colleague was a grad student who researched and conducted an experiment in a new field without a supervisor (you say there wasn't a coauthor with enough research clout)? This never happens in my field and pretty much any technical field I'm aware of.
It then went to the review process and was rejected six months later because a reviewer found (a "copy" of) the paper on arXiv? And the rejection reason was lack of novelty? In many fields arXiv does not count as "already published", in particular if the submission date of the article is before the article showing up on arXiv.
Also, you say the article on arXiv was very obviously copied. So did this result in plagiarism investigations? While there are many things wrong with the current review process, accusations of plagiarism are typically dealt with very quickly and, in my experience, pretty much always result in the editors-in-chief getting involved. So did this happen? Also, in my experience there is generally much more scepticism towards Chinese authors than western authors when this happens. The quality of research and publications from China has dramatically increased in the last 5-10 years, though.
The comment you're replying to was quite detailed. Any additional data would identify the paper and therefore the person, transforming OP's comment from an anecdote into a use of HN for griefing purposes. Let's not do that.
While I've never had an experience that bad, I've certainly peer reviewed utter garbage from one Chinese professor, who was publishing something like 200 papers a year.
I don't care how many graduate students you have, there is no way all of that is original research. And in my case, it was an obfuscated version of a very well known theory that any freshman would know.
This is the fundamental flaw with open research and open source. You can't have both open access and working intellectual property protections, especially when there are jurisdictions that don't care about your laws, or are even actively encouraging theft.
It's easy to complain about China, but that's what the Mainland Chinese system is designed to do. If they're able to break ours, then that's just survival of the fittest. Adapt or perish. Who says you have to roll over and simply let them get away with it? Why are our universities collaborating with these "researchers"? Name, shame, and blacklist them.
The only appropriate response to that is to go scorched earth and try to get fired not just those who published the copied paper but also the reviewer who linked it. And then make it very clear to others in the field what the journal let happen.
It’s unclear that stealing someone’s scientific ideas is a copyright violation. Though it may depend on exactly what was copied and how.
> In no case does copyright protection for an original work of authorship extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery, regardless of the form in which it is described, explained, illustrated, or embodied in such work
The EU has had the eIDAS law on qualified signatures and timestamps for years, with even an original open-source implementation from the EU. You can add such a timestamp to the PDF and keep it on your machine. It's verifiable by any court in the EU because it's "unbreakable". We use it for every document when communicating with officials, and I think more people should do that. You don't need cool blockchains and stuff when you have cryptography.
> Back in grad school a colleague of mine spent nine months on an experiment in a new field and submitted it as a paper to a quality journal. Six months later, the paper was rejected for lack of novelty: One of the reviewers had found a paper with a figure-by-figure duplication of the same experiment -- published on arXiv a week before the rejection decision. Both the managing editor and the author on the arXiv paper were from Chinese universities.
Could you clarify whether the duplicate paper was also submitted to the same journal or elsewhere, and if submitted whether it was accepted?
I'm trying to figure out whether the point was to steal credit, or to spike your colleague's submission.
> We wrote a rebuttal and submitted a complaint to the journal editor, but no justice was forthcoming.
Yeah this is the sort of thing that seems to only ever get resolved if a public stink is made, often on social media these days, because the integrity of both the editor and the journal is being implicated, which ends up with the can of worms[0] being swept under the rug if at all possible.
Of course people are going to keep tripping over the bump in the rug, and your colleague might not even have been the first victim.
[0] Having to check everything else the editor has ever done for similar misconduct, potentially retractions galore, making victims whole, process improvements, etc.
But these things can affect real credibility if not superficial credibility.
I wonder if it's worth starting a website where friendly scientists put papers side by side and ask "Was this a rip-off?", whereby at least the scammers get some form of public shaming for as long as the paper exists.
I can just imagine that site coming up in web searches ... how could people not click it?
I wonder if there's already a tool that clusters papers by similarity - otherwise, how do the reviewers reject a paper on such grounds? How do they find it? Do they know them all by heart? Maybe we (the community) can help by creating an open platform that scrapes openly available papers and automatically finds and sorts similar ones.
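There are off-the-shelf building blocks for a rough first pass at this. A minimal sketch using TF-IDF cosine similarity over abstracts (scikit-learn assumed available; the abstracts are placeholders, and real duplicate detection would need much more than bag-of-words):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder abstracts; a real platform would scrape these from open repositories.
abstracts = {
    "paper_a": "We train a deep convolutional network for large scale image "
               "classification and report state of the art accuracy.",
    "paper_b": "We train a deep convolutional network for large scale image "
               "classification and report improved accuracy on benchmarks.",
    "paper_c": "We prove new approximation bounds for online bipartite matching.",
}

ids = list(abstracts)
tfidf = TfidfVectorizer(stop_words="english").fit_transform(abstracts.values())
sim = cosine_similarity(tfidf)

THRESHOLD = 0.5  # arbitrary; would need tuning on real data
for i in range(len(ids)):
    for j in range(i + 1, len(ids)):
        if sim[i, j] > THRESHOLD:
            # Only flag pairs for a human to inspect; similarity alone proves nothing.
            print(f"{ids[i]} vs {ids[j]}: cosine similarity {sim[i, j]:.2f}")
```

Bag-of-words similarity would only catch fairly blatant textual copies, though; a figure-by-figure duplication like the one described above would also need image-level comparison.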
Why can’t academics insert some sort of cryptographic proof of ownership or publication date? The ownership could be done with something like Keybase, and as far as dates, that could be as simple as private Git repo that you make public or some fancy blockchain solution. But I suppose even then, people could look the other way.
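The simplest version of that is a hash commitment: publish only a digest of the manuscript somewhere independently timestamped (a public Git commit, a notary, whatever), and reveal the file later to prove you had it on that date. A minimal sketch with a hypothetical file path; the scheme proves possession at the commitment time, not authorship, and still relies on trusting wherever the digest was posted:

```python
import hashlib
from pathlib import Path

def commitment(path: str) -> str:
    """Return a SHA-256 digest of the manuscript.

    Publishing this digest (not the manuscript) in any independently
    timestamped place commits you to the exact file contents without
    revealing them; re-hashing the revealed file later must reproduce
    the same digest.
    """
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

# Hypothetical usage: the path and the place you post the digest are up to you.
digest = commitment("draft-v1.pdf")
print(f"commit this digest publicly before submission: {digest}")

# Later, anyone can check priority by re-hashing the released file:
assert commitment("draft-v1.pdf") == digest
```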
> Back in grad school a colleague of mine spent nine months on an experiment in a new field and submitted it as a paper to a quality journal. Six months later, the paper was rejected for lack of novelty: One of the reviewers had found a paper with a figure-by-figure duplication of the same experiment -- published on arXiv a week before the rejection decision. Both the managing editor and the author on the arXiv paper were from Chinese universities.
I have no experience in this area, and I am probably going to ask a dumb question: why not always submit the paper to arXiv before submitting it to a journal to protect against this kind of theft? Wouldn't that clearly and indisputably establish priority?
What happens if people start colluding behind the scenes of arXiv to change their records? Even if the people running it now are totally trustworthy, they won't be in control forever.
It wasn't Nature Materials, but it was one of the top journals for that branch of materials science. I think we were shooting for IF 5 to 10, but I wasn't involved in the work beyond editing the rebuttal.
Part of the problem is that people seem to want an objective evaluation of a piece of research, and measure the value for taxpayer money. Well, you can’t get an objective evaluation. It’s all subjective. And “impact” as a quantifiable entity in general is nonsense, for one thing the timescales prohibit its measurement at the point where it may be useful.
The solution is to use management. Lots of people object here and say “but nepotism, favouritism” and yep that’s a problem, but it is less of a problem that the decline of western universities. You can circumvent it somewhat by rotation, by involving external figures, by a hierarchy that ensures people are not colluding, but ultimately you just have to trust people and accept some wastage.
People aren’t in academia for the money. It’s a vocation. You’re not going to have many people milking the system. Things went pretty well before the metric culture invaded the academy. They can go well again.
Before introducing the KPIs, a majority of polish science was basically people milking the system and doing barely any (valueable) research. It was seen as an easy, safe and ok paying job where the only major hassle is having to teach the students. You often needed connections to get in. It was partially like that because of the communist legacy, where playing ball with the communist party was the most important merit for promotion, which, over the course of 45 years (the span of communism in Poland), filled the academia management ranks with conformist mediocrities.
Now, after a series of major reforms, there's a ton of KPIs, and people are now doing plenty of makework research to collect the required points, but still little valueable work gets done. Also, people interested in doing genuine science who would be doing it under the old system are now discouraged from joining academia, because in the system they're expected to game the points system and not to do real work.
What is the lesson from this is? Creating institutionalized science is hard? It requires a long tradition and scientific cultural standards and can't be just wished into place by bureaucrats? Also, perhaps it's good to be doing the science for some purpose, which in the US case are often DoD grants, where the military expects some practical application. This application may be extremely distant, vague and uncertain (they fund pure math research!), but still, they're the client and they expect results. Whereas the (unstated) goal of science in Poland seems to be just to increase the prestige of Polish science and its Universities by getting papers into prestigious journals, whereas the actual science being done doesn't matter at all - basically state-level navel gazing.
Relying on altruistic tendencies for people in academia is not adequate. Everyone starts out in academics as school children, and get filtered out or filter themselves out pursuing other things. Those who remain will be the ones who love to learn and teach, those who just cannot accept loss/failure, and, sadly, those who are afraid of change. The more competitive the field becomes, the harder it is to succeed, the more we select for the hyper-competitive or fearful over the altruistic.
If you start cheating the metrics, or optimizing a lot towards them, it becomes counter-productive when they change. As such, the most efficient way forward would be to work without trying to optimize for a temporary metric. On the flip side, it would be troublesome to convince people to different, complex forms each time.
What first gave me the idea was the concept of "lubricating" headers (submitting some with random values) for future http protocols, to combat "ossification", where middle boxes start to meddle with them and become obsolete when they don't recognize the new fields, instead of transmitting them.
There is only one reasonable way to evaluate research and researchers, that's to evaluate the content of their work and publications by external evaluation panels and tell these panels explicitly that they should not base their assessments solely on indicator counting, but on the overall merits, originality, and prospects of the research according to their subjective opinion. Metrics shouldn't even be used as a tie-breaker, they should only ever be used as weak indicators, and this must be explicitly mentioned in the guidelines.
In addition, you need a few other guidelines and laws. For example, it must be prohibited that someone becomes a postdoc at the same place where they obtained their Ph.D. We have people who study at university X, do their exams at university X, do their Ph.D. at university X under the same professors they always knew, then become postdocs working for their former supervisors and being exploited by them (teaching the same bad seminar their former supervisor teaches the past 20 years), and then get tenure in a rigged call. And the worst thing about it is that they feel entitled to all of this.
You've got to break this vicious cycle, but with your suggestion of using a methodology that worked in 1950s (with an order of magnitude less candidates) this could never be achieved.
I think external evaluation panels (and in particular, from a third country) are the way to go. We already have good examples, for example, in ERC grant panels. The ERC uses exactly the strategy you mention and it has an impeccable reputation, I know many people who applied with or without success but I know no one who felt treated unfairly.
But I'm against blanket rules prohibiting postdocs or positions at the same place of the PhD, at least from my Southern European (Spanish) point of view. This is often touted in Southern European countries because in the US no one does that so it must be evil, so clearly we should ban it to be more like Americans and see if this makes our scientific productivity closer to theirs. But European (especially Southern European) culture is not US culture. People want to settle down, be near their loved ones, and there's nothing intrinsically bad about that that should be punished. Plus, the job market is much less dynamic so even for those who don't mind bumping around, it can be hard to reconcile with a possible spouse who has a career too. And finally, if you push people in this region to move, most of the time the result will be that they end up in a Northern European country (or the US, Canada, etc.) where they make 3x or 4x more, and never come back - once you have experienced a much better salary it's hard to justify returning, I have seen it plenty of times.
Bring on the external evaluation panels, and then there will be no need for any measure forcing people to move, which would reduce inclusivity and thus the talent pool.
Independent of every other point, this is in my view a real problem. It happens so often, and I don't understand why there is often no rule requiring at least one postdoc at another institution.
Not sure what you mean by that, but KPIs are generally put in place by high-level management. Or do you want more micromanagement?
Either way, I think the solution is not more control, quite the opposite. I think the solution is just to remove the extrinsic incentives.
Some people say UBI will cause people to do nothing, and for many that is probably true, but the flip side is that the per-person output of those who remain will likely be many times higher in both volume and quality (total volume much lower, but of higher quality), because their energy would no longer be destroyed by all the busywork needed to show they are working.
(Speaking about Spain).
The KPIs are there because OFFICIALLY (and this is strictly so) you are not even allowed to get a tenured position without an absurd number of (in Maths) JCR papers IN THE FIRST QUARTILE.
This is so stupid it is not even funny but how can you fight that when your PhD students depend on those metrics?
There was a time in history when tenure made sense, but today the tenure track process forces a lot of people to go after low-hanging fruit that has a high probability of being accepted for publication, instead of trying things that are meaningful to try but may fail.
That won't happen, mostly because employers have outsourced education and vetting (in the form of requiring bachelor's/master's degrees) to universities (and the associated costs to governments and/or students who pay tuition), instead of the old-style vocational training/apprenticeship system where the employers had to pay.
Want to restore academia to only those actually interested in science? Make employers pay a substantial tax on jobs requiring academic degrees.
Your comment does not seem to contain any explanation of why 'using management' would solve the problem you allude to. Can you elaborate?
In short, I think it is clear that neither citations nor journal brand is the best proxy for worth, but the system you are proposing is worse, while still relying on subjective judgement.
It's not really about judging people either. When it comes to choosing which people to reward with jobs, promotions, grants, and prizes, we already know the solution: expert panels that spend nontrivial time with each application. Sometimes there are political or administrative reasons that override academic excellence, but in general academia has figured out how to evaluate shortlisted people.
The real problem is shortlisting people. For example, when a reputable university has an open faculty position, it typically gets tens of great applications. Because the people evaluating the applications are busy professors, they need a way of throwing away most of the applications with minimal effort. From the applicant's perspective, this means you only get a chance if the first impression makes you stand out among other great applicants. And that's why it matters that you publish in prestigious conferences/journals, come from a prestigious university, and have prestigious PhD/postdoc supervisors.
Well, science doesn't care that much for KPIs, per se. It's more that the managers want numbers to steer by.
In academia, getting promoted means more management tasks, so higher-up academics have been indoctrinated to want numbers. It is the scourge of management.
As such: not a big fan of your solution.
I agree overall with what you wrote, but have to comment here, because I think this is already not the case in many settings. I can only speak for the US, but in my experience with some other places overseas similar issues are developing.
There are many legitimate hypotheses for why this is the case, but in general at many universities, as far as climbing the academic ladder is concerned, publication metrics are no longer relevant. That is, some baseline is required, but beyond that, most of the focus is on money and grant sizes. I've been in promotion meetings discussing junior faculty who are not publishing, and this is brushed aside because they have large grants. I've also repeatedly heard sentiments to the effect of "papers are a dime a dozen, grant dollars distinguish good from bad research."
Again, there are lots of reasonable opinions about this, but I've come to a place where I've decided this is incentivizing corruption. Good research is only weakly correlated with its grant reimbursement, and regardless, it has led to a focus on something only weakly associated with research quality. Discussions with university staff where you're openly pressured to artificially inflate costs to bring in more indirect funds should raise questions. Just as incentivizing (relatively) superficial bibliometric indices like publication counts or h-indices leads to superficial science, incentivizing research through grant money has the same effect, only by a different route.
So yes, going into academics is not the way to make money, if money is what you want. However, I think nowadays in the US it's very much all about the money for large segments, who are milking the system right now, at this moment.
Also, in theory, yes, management is the solution, but really management is how we got into this mess. Good management, yes; bad management, no. But how do you ensure the former?
Fixing the mess academia has slid into (in my perception; maybe everything really is fine) will require a lot of changes that will be controversial and painful to many, and I don't think there's a single magic-bullet cure. Eliminating indirect funds is probably one thing, funding research through different mechanisms is another, maybe lotteries, probably opening up grant review processes to the general public. Maybe dissociating scientific research from the university system even more than has been the case is also necessary. Maybe incentivizing a change in university administration structures. Probably all of the above, plus a lot else.
Getting things to go well again is achievable in theory, but how to get there is less clear given the amount of change involved.
Once it's no longer about being in the esteemed and scarce "10%", they won't bother because they don't need to. Imagine a process where the only criteria are technical soundness and novelty, and as long as minimal standards are met, it's a "go". Call it the "ArXiv + quality check" model.
Neither formal acceptance for publication nor citation numbers truly mark scientific excellence; perhaps winning a "test of time" award does, or appearing in a textbook 10 years later.
I've been reviewing occasionally since ~1995 and regularly since ~2004, and I've never heard of collusion rings happening in my sub-area of CS (ML, IR, NLP). I have caught people submitting to multiple conferences without disclosing it. Ignoring relevant past work is common, more often out of blissful ignorance, and occasionally likely with full intent. I'm not saying I doubt the report, but I suspect the bigger problem that CS has is a large percentage of poor-quality work that couldn't be replicated.
BTW, the most blatant thing I've heard of (from a contact complaining about it on LinkedIn) is someone having their very own core paper from their PhD thesis plagiarised - submitted to another conference (again) but with different author names on it... and they even cited the real author's PhD thesis!
I mean I could buy your paper but I would have to know it by heart and understand it in order to defend it.
At least that's how it went when I was studying applied physics (1974-1977).
If I had just bought a paper I would have had a really hard time in the viva voce.
More info here: https://michaelnielsen.org/blog/three-myths-about-scientific...
I think the recent wave of low-impact submissions and co-authorship rings is the result of developing countries trying to simplify that process and tying hiring/promotion/pay directly to publication count and related easy metrics.
One possible issue is that researchers usually need to justify their research to somebody who's not in their field. Conferences are one way to do this. So are citation counts. Both are highly imperfect, but outsiders typically want some signal that doesn't require being an expert in a person's chosen field. The "Arxiv + quality check" model doesn't seem to provide this.
> I suspect the bigger problem that CS has is a large percentage of poor-quality work that couldn't be replicated.
As a sort of ML researcher for several years, I agree.
Also, a "best 10%" conference/journal is valuable -- I have only so much time in the day. There are a few conferences for which it is always a good use of my time to read all the abstracts, plus one or two of the papers that seem most interesting. I can't do that for every conference, or even most conferences, in my area.
So the "best 10%" conference/journal is valuable to the consumers, and the prestiege is valuable to the producers. Therefore I think such a thing would simply re-emerge if you somehow killed it.
Moreover, the correlation between acceptance and impact exists, but it is not that high: https://medium.com/ai2-blog/what-open-data-tells-us-about-re...
Of course, only corporate researchers can afford not to publish in established journals, as their salary and position do not depend on "publish or perish" metrics. Ironically, this means there is more academic freedom in private companies than in academia.
I've been thinking on and off about a review system that might improve on things. I'm imagining perhaps: reviewers and authors both get to see who the other is, and conflicts of interest can be called out by other people after review and before publication; reviewers are chosen at random and not allowed to bid; reviewers are sent a series of pairs of papers and asked to choose which one they'd rather see; scores, and ultimately publication, are decided by ranked-choice vote rather than reviewer assignment; comments on paper improvement would be completely optional. Would this be better or worse than what we have? Would it deter explicit collusion?
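For what it's worth, here is a toy sketch of the pairwise part of that idea - my reading of the proposal, not an actual implementation; the random choice below stands in for a reviewer's "which would you rather see?" preference, and all names and numbers are illustrative:

    import random
    from collections import defaultdict

    def pairwise_review(papers, n_comparisons=2000, accept_fraction=0.25, seed=0):
        """Rank papers by how often they win random head-to-head comparisons."""
        rng = random.Random(seed)
        wins = defaultdict(int)
        shown = defaultdict(int)
        for _ in range(n_comparisons):
            a, b = rng.sample(papers, 2)   # reviewers get random pairs, no bidding
            winner = rng.choice((a, b))    # stand-in for a reviewer's preference
            wins[winner] += 1
            shown[a] += 1
            shown[b] += 1
        ranked = sorted(papers, key=lambda p: wins[p] / max(shown[p], 1), reverse=True)
        return ranked[: int(len(papers) * accept_fraction)]

    print(pairwise_review([f"paper-{i:02d}" for i in range(20)]))

The appeal of ranking by pairwise wins is that no single reviewer's score assignment decides a paper's fate; collusion would require influencing many randomly chosen comparisons rather than one assigned review.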
I think this is a little more pessimistic than what the piece says. The NeurIPS (then-NIPS) experiment found that about 60% of papers accepted by one PC were rejected by the other. That doesn't actually mean "the quality of the work has little to do with its odds of acceptance". It may just be that a paper has to cross a quality threshold, and once it's past that, the outcome has a lot more variation.
My personal take on NeurIPS specifically is that there's a fraction of bad papers, maybe 40%, that probably shouldn't and won't get in. Then there's a minority of very nice papers that probably should and will get in, maybe 5-10%. And then there are a bunch of middling papers where a lot of it is luck and drawing friendly reviewers. But these aren't bad papers, and you can't really just churn them out, they're just not very good papers.
From memory, it was 25% rejected by both PCs, 15% accepted by both PCs, and the middle 60% random.
Essentially, if you look at review scores for a conference with, say, a 40% accept rate (and this is quite similar across fields, I'd imagine), you find there are 10-20% of papers (depending on the conference) that are clear rejects for all reviewers, and probably around 10-15% of papers that are very clear accepts; the rest of the papers have very similar scores, so the cut-off becomes quite arbitrary (and depends on luck as well). This is well known for grant applications too, and a sign that there is likely not enough money in the system.
Maybe it's time to move on from some of these conferences, and focus on interactions that maximize sharing research findings. I know that is unrealistic, but like every other metric, conference acceptance ceases to have value once the metric itself is what people care about.
[1] https://youtu.be/DJFKl_5JTnA?t=853
And one of the suggested solutions (at ~22:00) seems like it would work if it can be adopted - essentially, have top university administrators evaluate each candidate's top x papers for hiring/tenure decisions and ignore everything beyond that number. What you measure is what you get: if you measure count, you get a deluge of 'least publishable units'; if you measure the top 3 papers, then everyone will focus on quality instead of quantity. A counterargument is probably that it's easier for administrators to measure quantity in a way that seems objective and resistant to arguments or appeals, and far harder to objectively compare quality, especially if the candidates are from different subfields of research.
When someone is hired (whether for tenure or time-limited), their research as a whole has to be evaluated by external, independent committees who take into account the content of the research and do not base their judgment on indicators only. There is no shortcut around that.
The biggest annoyance nowadays is the decision-makers' insistence on "excellence", though. You cannot have only excellent people everywhere, per the definition of "excellent", yet this demand is in every fucking guideline for postdoc and tenure-track positions. It's absolutely ridiculous.
E.g., for my company's twice-yearly evaluation, everybody writes up a short report on the most impactful things they've done, including evidence; this is evaluated by their manager to give a score, and then there's a series of group meetings between managers to make sure the scores are calibrated, including looking at all the metrics that can be dug up and comparing against our written role descriptions for different levels. It takes a lot of time but produces fair scores.
This is extremely labor-intensive, but that's the thing: to create anything resembling fair evaluation of a large group of people doing a large set of different things, you need to do things that are labor-intensive. A simple set of metrics doesn't cut it.
Findings and papers from well-known individuals (read: Twitter accounts) do get far more attention, and more citations. Of course, one can argue that, broadly, well-known labs and individuals are well known because of their tendency to do great work and write better papers. And that's true. However, the above still holds, in my experience as a PhD student in ML. Anecdotally, I have seen instances where a less interesting paper from a renowned lab got more attention and eventually more citations than a better paper accepted at the same venue by a less renowned lab on the same topic.
I would also argue that with the increased importance of the (mostly) commercial "high-impact" journals this has become worse. I know that some of the professional (non-expert) editors of these journals specifically look at the citation counts of authors before accepting to send them out to review, because their main aim is to get people reading the articles, not necessarily good science.
- Conferences and participants have increased exponentially over the last few years.
- Students in AI/ML graduate programs have similarly increased.
- Huge numbers of companies are hiring AI/ML graduates.
- What constitutes a true advance in AI/ML is difficult to determine. Deep learning is a fairly ad-hoc method - a tweak to an existing method that allows you to exceed SOTA (state of the art) is the simplest way to get attention. But that's not really a method, and so you're one of many others pushing similar tweaks as well.
This environment in particular seems like it would exacerbate all the ordinary pressures to cheat found in academia. It has something of the quality of the last blow-out of a bubble. And the thing is, even with deep learning being real, the dynamics seem fated to push things to the point where expectations are sufficiently far past reality that the whole thing collapses, for a bit.
Would it help at all if, rather than each participant reviewing 3 papers, each reviewed 2 papers and validated the reviews of 3 more?
This is computer science here, with things like the set NP whose defining characteristic is that it's easier to check a solution than generate it.
I'm imagining having some standard that reviews are held to in order to make them validatable. When validating a review, you are just confirming that the issues brought up are reasonable. Same for the compliments.
Sure, it's not perfect because the validators wouldn't dive in as deep or have as much context as the reviewers, but sitting here in my obsidian tower of industry, it seems like it would at least make collusion attacks more difficult. Hopefully without increasing the already heavy load on reviewers.
(It very much seems like an incomplete solution -- we only have to look at politics and regulatory capture to see how far wrong things can go, in ways immune to straightforward interventions. Really, you need to tear down as many of the obstacles to a culture of trust as you can. Taping over the holes in a leaking bucket doesn't work for long.)
> In a well-publicized case in 2014, organizers of the Neural Information Processing Systems Conference formed two independent program committees and had 10% of submissions reviewed by both. The result was that almost 60% of papers accepted by one program committee were rejected by the other, suggesting that the fate of many papers is determined by the specifics of the reviewers selected and not just the inherent value of the work itself.
With this much demonstrated discrepancy between two sets of reviewers, it’s hard to believe that adding a validation step would produce a consistent improvement. How can people be expected to find improperly accepted papers when they have less than 50% agreement on the acceptance of good-faith submissions?
Honestly I think this seeming randomness in acceptance is at the heart of why people might think cheating is acceptable. If the process is not reliable, why bother submitting to it?
Suppose that two reviewers independently rank papers 80% on quality and 20% on chance factors. With good odds, the two reviewers will agree with each other on the relative rankings of any given pair of papers. But their lists of the top 10% of papers will largely not be in agreement with each other.
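A quick way to check that intuition numerically is the sketch below. It assumes "80% on quality and 20% on chance" means a score of 0.8·quality + 0.2·noise (other readings of that split give different figures), and it simply prints the pairwise agreement rate alongside the top-10% overlap so the two can be compared:

    import random

    def simulate(n_papers=1000, top_frac=0.10, trials=20000, seed=1):
        rng = random.Random(seed)
        quality = [rng.gauss(0, 1) for _ in range(n_papers)]
        score_a = [0.8 * q + 0.2 * rng.gauss(0, 1) for q in quality]  # reviewer A
        score_b = [0.8 * q + 0.2 * rng.gauss(0, 1) for q in quality]  # reviewer B

        # How often do A and B agree on which of two random papers is better?
        agree = 0
        for _ in range(trials):
            i, j = rng.sample(range(n_papers), 2)
            agree += (score_a[i] > score_a[j]) == (score_b[i] > score_b[j])

        # How much do their top-10% lists overlap?
        k = int(n_papers * top_frac)
        top_a = set(sorted(range(n_papers), key=lambda i: -score_a[i])[:k])
        top_b = set(sorted(range(n_papers), key=lambda i: -score_b[i])[:k])

        print(f"pairwise agreement: {agree / trials:.2f}")
        print(f"top-{int(top_frac * 100)}% overlap: {len(top_a & top_b) / k:.2f}")

    simulate()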
I referenced Kahneman's latest book, Noise, above, but this is exactly the problem he focuses on. There are solutions.
Back in grad school a colleague of mine spent nine months on an experiment in a new field and submitted it as a paper to a quality journal. Six months later, the paper was rejected for lack of novelty: One of the reviewers had found a paper with a figure-by-figure duplication of the same experiment -- published on arXiv a week before the rejection decision. Both the managing editor and the author on the arXiv paper were from Chinese universities.
We wrote a rebuttal and submitted a complaint to the journal editor, but no justice was forthcoming. My colleague switched research directions to avoid the collusion and now takes pains not to submit papers without a coauthor who has enough clout in the field to deter blatant research theft. He also avoids dealing with editors from institutions in China.
He ended up graduating two years later than planned.
I wish I had something more substantive to add, but all I can say is I hope your colleague knows that even if the swindlers and plagiarizers take our research, years of our lives, etc., be they in China or wherever else in the whole wide world, they ain't able to take away the heart of a real one. They know the research ain't theirs, and they know they depend on real producers to be able to commit their crimes, or do anything actually useful - not the other way around. These plagiarists are parasites, and one day we'll be free of them.
As long as metrics like impact factor and citation count determine the trajectories of academic careers, plagiarism, collusion, and other forms of academic dishonesty will not go away.
The fact of the matter is that most academics do not have time to dig into whether someone's work is intellectually dishonest. Being dishonest has a huge payoff as long as you don't cause a scandal.
Anyone who plans to stay in academia should understand that the metagame has changed over the last 50 years, largely because human attention has not scaled with the rate at which academics are exposed to, and expected to assimilate, new information. There is a larger payoff to exploiting the lack of attention than to earnestly carrying out some meaningful research program. At the very least, do both. By no means do only the latter.
Some have been posting their papers to pre-print servers and then skipping straight to commercialization attempts. This is especially concerning in the health and fitness world, where some supplement makers and fitness gurus upload documents to pre-print servers to give the illusion of being published authors. Casual observers may not be able to tell the difference between published, peer-reviewed papers and some random document with a DOI uploaded to a pre-print server.
This doesn’t carry much weight in academia, but it can fool non-academic observers. I’m not sure if or how it will translate to CS papers, but I wouldn’t be surprised if skipping peer review becomes more common as the pace of publishing increases.
I'm trying to understand your story because many details are very different from the things I know about scientific publishing.
Your colleague was a grad student who researched and conducted an experiment in a new field without a supervisor (you say there wasn't a coauthor with enough research clout)? This never happens in my field, or in pretty much any technical field I'm aware of.
It then went through the review process and was rejected six months later because a reviewer found (a "copy" of) the paper on arXiv? And the rejection reason was lack of novelty? In many fields, arXiv does not count as "already published", in particular if the submission date of the article is before the article showed up on arXiv.
Also, you say the article on arXiv was very obviously copied. So did this result in a plagiarism investigation? While there are many things wrong with the current review process, accusations of plagiarism are typically dealt with very quickly, and in my experience pretty much always result in the editors-in-chief getting involved. So did this happen? Also, in my experience there is generally much more scepticism towards Chinese authors than Western authors when this happens. The quality of research and publications from China has dramatically increased in the last 5-10 years, though.
I don't care how many graduate students you have, there is no way all of that is original research. And in my case, it was an obfuscated version of a very well known theory that any freshman would know.
It's easy to complain about China, but that's what the mainland Chinese system is designed to do. If they're able to break ours, then that's just survival of the fittest. Adapt or perish. Who says you have to roll over and simply let them get away with it? Why are our universities collaborating with these "researchers"? Name, shame, and blacklist them.
Also, a DMCA request to arXiv would likely work.
> In no case does copyright protection for an original work of authorship extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery, regardless of the form in which it is described, explained, illustrated, or embodied in such work
Could you clarify whether the duplicate paper was also submitted to the same journal or elsewhere, and if submitted whether it was accepted?
I'm trying to figure out whether the point was to steal credit, or to spike your colleague's submission.
> We wrote a rebuttal and submitted a complaint to the journal editor, but no justice was forthcoming.
Yeah this is the sort of thing that seems to only ever get resolved if a public stink is made, often on social media these days, because the integrity of both the editor and the journal is being implicated, which ends up with the can of worms[0] being swept under the rug if at all possible.
Of course people are going to keep tripping over the bump in the rug, and your colleague might not even have been the first victim.
[0] Having to check everything else the editor has ever done for similar misconduct, potentially retractions galore, making victims whole, process improvements, etc.
I wonder if it's worth starting a website where friendly scientists put papers side by side and ask 'Was this a rip-off?', whereby at least the scammers get some form of public shaming for as long as the paper exists.
I can just imagine that site coming up in web searches ... how could people not click it?
I have no experience in this area, and I am probably going to ask a dumb question: why not always submit the paper to arXiv before submitting it to a journal to protect against this kind of theft? Wouldn't that clearly and indisputably establish priority?
This is so unfair: some cheaters from China got free research papers to their name, and the legitimate author had to waste two years of their life.