Readit News
muds commented on On the paper “Exploring the MIT Mathematics and EECS Curriculum Using LLMs” [pdf]   people.csail.mit.edu/asol... · Posted by u/jlaneve
ttpphd · 2 years ago
I think you missed the point that data needs to be collected and presented ethically. It's not about it being a work in progress and not peer reviewed.
muds · 2 years ago
I agree that the data collection process wasn't ethical, and the professor should definitely be reprimanded for that. It's extremely sad that the coauthors weren't aware of this either. And I feel terrible for the undergrads: their first research experience was publicly rebuked through no fault of their own.

However, there is no shortage of projects with sketchy data collection methodologies on arXiv that haven't received this amount of attention. The point of putting stuff on arXiv _is_ that the paper will not pass / has not passed peer review in its current form! I might even call arXiv a safe space to publish ideas. We all benefit from this: a lot of interesting papers are only available on arXiv, as opposed to being shared privately between specific labs.

I'm concerned that this fiasco was enabled by this new paradigm in AI social media reporting, where a project's findings are amplified and all the degrees of uncertainty are suppressed. And I'm honestly not sure how best to deal with this other than either amplifying the uncertainty and jankiness in the paper itself to an annoyingly noticeable level, or just going back to the old way of privately sharing ideas.

Maybe this is the best case scenario for these sorts of papers? They pushed a paper to a public venue, and got a public "peer review" of the paper. Turns out the community voted "strong reject;" and it also turns out that the stakes for public rejection are (uncomfortably, IMO) higher than for a normal rejection. Maybe this causes the researchers to only publicly release better research, or (more likely) this causes the researchers to privately release all future papers.

muds commented on On the paper “Exploring the MIT Mathematics and EECS Curriculum Using LLMs” [pdf]   people.csail.mit.edu/asol... · Posted by u/jlaneve
muds · 2 years ago
Putting papers and code on arXiv shouldn't be punished. The incentive to do this is to protect your idea from getting scooped, and also to inform your close community about interesting problems that you're working on and get feedback. arXiv is meant for work-in-progress ideas that won't necessarily stand up to the peer review process, but this isn't really acknowledged properly on social media. I highly doubt the Twitter storm would have been this intense if the Twitter posts had explicitly acknowledged this as a "draft publication which hints at X." But I admit that pointing fingers at nobody in general and social media specifically is a pretty lazy solution.

The takeaway IMO seems to be to prepend the abstract with a clear disclaimer sentence conveying the uncertainty of the research in question. For instance, adding a clear "WORKING DRAFT: ..." in the abstract section.

muds commented on No, GPT4 Can’t Ace MIT   flower-nutria-41d.notion.... · Posted by u/YeGoblynQueenne
iudqnolq · 2 years ago
ImageNet has five orders of magnitude more answers, which I would assume makes QA a completely different category of problem.

The authors could probably have carefully reviewed all ~300 of their questions. If they couldn't, they could have just reduced their sample size to, say, 50.

muds · 2 years ago
I admit that ImageNet isn't the best analogy here. But I'm pretty confident that this data cleaning issue would have been caught in peer review. The biggest issue, which I still don't understand, was the removal of the test set. That was bad practice on the authors' part.
muds commented on No, GPT4 Can’t Ace MIT   flower-nutria-41d.notion.... · Posted by u/YeGoblynQueenne
muds · 2 years ago
I'm not sure what to make of this post. There is always a degree of uncertainty in the experimental design, and it's not surprising that there are a couple of buggy questions. ImageNet (one of the most famous CV datasets) is at this point known to have many such buggy answers. What is surprising is the hearsay that plays out on social media, blowing the results out of proportion and leading to opinion pieces like these that target the authors instead.

Most of the damning claims in the conclusion section (obligatory: I haven't read the paper entirely, just skimmed it) usually get ironed out in the final deadline run by the advisors anyway. I'm assuming this is a draft paper, posted on arXiv ahead of the EMNLP deadline this coming Friday. So this paper hasn't even gone through the peer review process yet.

muds commented on Sile: A Modern Rewrite of TeX   sile-typesetter.org/... · Posted by u/signa11
daly · 3 years ago
TeX and Literate Programming (and Lisp) are my fundamental, day-to-day tools.

Code it, explain it, generate a Literate PDF containing the code.

The programming cycle is simple. Write the code in a LaTeX block. Run make. The makefile extracts the code from the LaTeX, compiles it, runs the test cases (also in the LaTeX), and regenerates the PDF. Code and explanations are always up to date and in sync.
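
A stripped-down sketch of such a makefile might look like the following (the file names, the noweb-style notangle extraction, and the Lisp test runner are illustrative assumptions on my part, not the exact setup):

    # Sketch only: assumes noweb-style <<chunk>>= blocks inside the LaTeX source,
    # and that recipe lines are indented with tabs.
    all: doc.pdf

    # Tangle: pull the program text out of the literate document.
    src.lisp: doc.tex
    	notangle -Rsrc.lisp doc.tex > src.lisp

    # Run the test cases, which are also kept in the document.
    check: src.lisp
    	notangle -Rtests.lisp doc.tex > tests.lisp
    	sbcl --non-interactive --load src.lisp --load tests.lisp

    # Weave: regenerate the PDF so prose and code stay in sync.
    doc.pdf: doc.tex check
    	pdflatex doc.tex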

I have found no better toolset.

muds · 3 years ago
I've been struggling with keeping track of research experiments and code at the same time. This seems pretty cool! I like how this method is language agnostic and uses "matured" tools. Question: I'd love to give this a try; do you have any public code snippets?
muds commented on Test scores are not irrelevant   dynomight.net/are-tests-i... · Posted by u/colinprince
ramraj07 · 3 years ago
Uhm, what exactly are you trying to get at? I said subject GRE is a very good measure of eventual success in academia, do you have a solid response to that or just a rambling tirade?

Paper authorship, if the student is the first author, shows grit and “gumption” I suppose? As if that’s what’s needed in academia at this moment (it’s important but not the main requirement). But almost no undergrad gets a first-author paper. They get mentioned in the middle because they ran a bunch of SDS gels. I wasn’t even interested in trying to become a professor and I got 10 papers before I finished my PhD; do you know how many I (or any of the folks I actually know who are now professors) had during our undergrad? Zero. And not for lack of trying. You know who actually got papers? The son of the department head.

muds · 3 years ago
The original comment, to me, reads more like "subject GRE is a definitive measure of eventual success in academia." I was arguing against the definitive part. Thanks for the clarification. It might be a good measure for your cohort, you, and people in similar situations.

> Almost no undergrad gets a first author paper

Maybe this is different in different fields, but we have a lot of undergraduate first-author papers in programming languages and machine learning. I mean -- through and through -- undergraduate students bringing up a topic, getting guidance from professors and senior PhD students, getting results by the end of the semester, and publishing the results the next year. Even the people who end up "running the SDS gels" either drop off by the next year or end up working towards their own first-author publications. I've always chalked this up to the experimental setup cost being very cheap in CS compared to the "hard sciences," so most undergraduate students are already comfortable with all the tools they need to do research.

> I wasn't interested in trying to become a professor

I think this is precisely the variable that a standardized test cannot account for! I feel an "authentic" undergraduate research experience is successful if it helps students realize if research is right for them or not.

> ... papers ... the son of the department head...

I see where your frustration is stemming from. Sorry this was your first experience with undergraduate research.

muds commented on Test scores are not irrelevant   dynomight.net/are-tests-i... · Posted by u/colinprince
ramraj07 · 3 years ago
So one of the best standardized tests I’ve ever taken is the Subject GRE. When I was applying for PhD programmes, a few of my colleagues and I took it. It was perfect - it tested knowledge, critical thinking, and experimental design. It was obscure, so no material was available for specific training.

In the end the scores were so accurate in predicting eventual success - a decade later, the folks who got 96% plus are all either professors or deliberately chose not to, while the rest just took industry jobs (and in my opinion because they realized that’s best for them).

Here’s the kicker - every institution explicitly said they will NOT consider these scores as part of the admission process. None of the people who got in the top percentiles (4 I know) made it to a top 10 institution in the US, while a bunch of others did because they had paper authorship in their undergrad. But paper authorship in undergrad has NO correlation with your actual scientific skills! It just meant you were connected and/or a hustler. It’s sad that such an awesome test is deliberately ignored by these institutions.

The irony was that these tests were also not elitist. All you needed to do was thoroughly read Lodish and Lehninger and you’re good. We did study in a semi-premium institution in India, but by no means were we privileged by any special facilities or help (at least in the context of this test). The only barrier might be the test fee itself if anything.

muds · 3 years ago
Respectfully, I call bullshit. You can't quantify success in a PhD with a single variable. There are a billion ways research can go right or wrong -- irrespective of your personal pedigree. Your ideas might be too early or late for your community to grasp, maybe you appeal to the wrong audience, maybe you're unaware of an application of your research, maybe you don't have the right set of collaborators or need a perspective that often emerges out of a lucky encounter with someone. I respect your experience but I don't want it to give people the wrong idea about research success...

> Paper authorship had NO correlation with research success

I think paper authorship demonstrates that you're willing to put in a non-trivial amount of work to pursue a problem. That seems to be at least one attractive skill in a PhD, wouldn't you agree?

u/muds

Karma: 448 · Cake day: May 22, 2019
About
Working on program synthesis + machine learning. Hot takes are my own.