Agreed. This is useless. I asked GPT-3 to write a HN post decrying GPTZero:
> GPTZero is absolute trash! It can't even detect plagiarism correctly. It's a complete joke and waste of time! Don't bother with it! You'd be better off just manually checking for plagiarism. It's so unreliable it's not even worth mentioning. Just save yourself the time and money and don't bother with GPTZero. Worst AI tool ever!
GPTZero classified it as fully human written. This is a cat and mouse game where the mouse is always going to lose.
The sad part is that real people will likely suffer when people in power take salespeople seriously and use tools like this to "detect" and punish fraud.
It's a toy project built by a CS undergrad over their winter break that went viral because journalists desperately want something that can expose AI writing. It's built on Streamlit and a few glued-together libraries, so it's not really surprising the results aren't good. It's not some state-of-the-art custom tool, yet they are marketing it as the solution to academic plagiarism.
Same, I ran a few samples and it was wrong every time. My hope is that teachers will run some tests before just willy-nilly trusting the scores it gives and accusing students, but sadly I know a lot of teachers, and I'd bet few in that group will run their own tests; most will just accept it as an authority.
The problem with such things is that you would need 100% accuracy in most cases for it to be useful.
For example, many schools and universities fear that students use ChatGPT for homework. If such a plagiarism checker produces false positives in even a small percentage of cases, the consequences for honest students would be too severe to actually act on the results of the check. But perfect accuracy can never be reached for text classification, so those tools will never be that useful, despite being interesting to AI people.
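To make the scale of that problem concrete, here is a back-of-the-envelope sketch in Python (the submission count and false positive rate are purely hypothetical, just to illustrate why even a small error rate bites at scale):

```python
# Hypothetical numbers only: even a detector that is "rarely" wrong will flag
# a meaningful number of honest students once it is run at scale.
honest_submissions = 1000     # honest essays checked per term (assumed)
false_positive_rate = 0.02    # detector wrongly flags 2% of human text (assumed)

wrongly_flagged = honest_submissions * false_positive_rate
print(f"Honest students wrongly flagged: {wrongly_flagged:.0f}")  # -> 20
```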
At the university level, the problem of cheating is more a problem with our idea of a university. In an ideal world, you go to a university to learn. If someone wants to cheat, ultimately they're just cheating themselves (and, if at a private university or even most public universities, throwing a lot of money away). "Cheating" isn't really the university's problem in that view.
Ofc, in reality universities aren't (or not only, and in most cases not primarily) about learning, but about credentialism. Employers and various social systems outsource to universities the role of verifying that people actually know something, or are the "right" sort of person, or other signals.
I don't know how one goes about fixing that, or if it's even possible, but I'd like to see more acknowledgement of it. Fixing "cheating" feels like the equivalent of looking to clever programming to fix product bugs.
Grades are also needed in university to check for prerequisite knowledge for courses. You waste everybody's time and resources if you let people into courses who think or pretend that they have the prerequisite knowledge. People who are cheating aren't only cheating themselves.
You don't really need any radical changes; you just have to get rid of the idea of graded homework, which I always hated from elementary school onwards.
Larger projects have a role to play in education, but ultimately if you can't pass the proctored exams, you don't pass the course.
This seems like a reductive response to a confidence-based policy approach. There are many systems out there today that provide a confidence score of bots vs. humans. I happen to work with a product that uses this exact approach, and it's highly effective. So in education scenarios, I think the teacher/professor/administration will use these systems to inspect the entirety of the submissions. A baseline value will be derived from that, and outliers will appear that may require deeper analysis or an interview (a rough sketch of that workflow follows below). Schools are in a position where they can't get it wrong more often than not (they need tuition dollars, after all), but a severe deterrent will need to hang over students' heads to discourage cheating with these tools.
I think the other thing we'll see is a nanny-state approach by some educational institutions: falling into the trap of being sold software to "block" or "monitor" students using these tools. It would be easy to implement on a campus network to a certain extent (correlating student network logins to URLs accessed, and potentially MitM), but the reality is that smart students will know better and will use phone hotspots and VPNs. The other dark side to consider is that the owners of ChatGPT could provide logs of user accounts and queries to higher education as a service.
At the end of the day my guess is all approaches are going to be tested at some level. But the cat is out of the bag and this is going to generate some very interesting countermeasure solutions/approaches along the way.
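As a rough illustration of the baseline-and-outliers workflow mentioned above (the per-student scores, the two-standard-deviation cutoff, and the whole scoring scheme are assumptions made up for this example, not how any particular product works):

```python
# Sketch: derive a class-wide baseline from per-submission "AI-likelihood"
# scores, then surface outliers for a follow-up conversation rather than an
# outright accusation. All scores below are invented.
from statistics import mean, stdev

scores = {  # hypothetical detector scores in [0, 1], one per submission
    "s01": 0.12, "s02": 0.18, "s03": 0.15, "s04": 0.22, "s05": 0.10,
    "s06": 0.17, "s07": 0.14, "s08": 0.20, "s09": 0.16, "s10": 0.95,
}

baseline = mean(scores.values())
spread = stdev(scores.values())

# Flag submissions more than two standard deviations above the class baseline.
flagged = [sid for sid, v in scores.items() if v > baseline + 2 * spread]
print(f"baseline={baseline:.2f} spread={spread:.2f} review={flagged}")
```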
Not sure how a student could prove the negative, apart from videotaping themselves writing the essay, or only being allowed to type it on an airgapped machine owned by the university.
> The problem with such things is that you would need 100% accuracy in most cases for it to be useful.
Why? If there is 75% confidence that a student's report was generated using ChatGPT, then that's enough to sit down with the student and discuss the content in person to see if they actually know it. A tool like this could save the teacher from having to do that with every student, and also reinforce to students that if they don't actually know the material, there's still a chance they get caught.
> the consequences for honest students would be too severe to actually act
Only if acting is immediately accusing them of plagiarism rather than working with the student to ensure it’s really their work.
This is such a short-sighted view. Students who wish to cheat using ChatGPT will immediately start running their text through these tools preemptively, changing some phrases here and there and adding spelling errors to ensure they don't get caught. You're left with innocent students being accused of cheating on a regular basis.
Maybe the education system will finally learn that asking students to merely recite information that can be found by anyone, anywhere, in less than 10 seconds isn't helpful in judging their understanding of the subject, except in very limited scenarios.
This is not what is going to happen, though. Flagged content will not be accepted and the student will fail the class, or worse, the year. Expulsion is on the cards as the war against generated content becomes bitter.
This doesn’t work. Stack Exchange tried it, and it turns out there is a lot of misleading, superficial content that easily gets a high number of upvotes from newbies who don’t know better.
What about false positives? It's irresponsible to market this to laypersons as perfect, with absolutely no word of caution about false positives. There's been some chatter on Reddit that some content writers were falsely accused of using ChatGPT and lost their clients.
This is an everlasting battle. Schools are already using these kinds of tools to detect cheating, often with false positives that may have devastating outcomes.
The accuracy just isn’t there, which isn’t surprising. I’m feeding it Star Wars reviews from both ChatGPT and IMDb, and for sure the answers are correlated. The false negative rate is OK, although I’ve hit plenty of them. But man, it sure does think a lot of people were using ChatGPT to write reviews back in the 2000s.
Again, it’s definitely correlated, even strongly correlated, but that’s not good enough for plagiarism detection. You can’t go accusing students of academic dishonesty based on a tool that gets it wrong multiple times on a couple dozen samples.
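For anyone who wants to repeat this kind of spot check, the bookkeeping is simple; here is a sketch (the labels and verdicts below are placeholders, not real detector output on real reviews):

```python
# Sketch: false positive / false negative rates from hand-labeled samples.
# "human" = e.g. an IMDb review, "ai" = e.g. a ChatGPT-written review.
samples = [  # (true label, detector's verdict) -- placeholder data
    ("human", "human"), ("human", "ai"), ("human", "ai"), ("human", "human"),
    ("ai", "ai"), ("ai", "ai"), ("ai", "human"), ("ai", "ai"),
]

human_verdicts = [p for t, p in samples if t == "human"]
ai_verdicts = [p for t, p in samples if t == "ai"]

false_positive_rate = human_verdicts.count("ai") / len(human_verdicts)  # humans flagged as AI
false_negative_rate = ai_verdicts.count("human") / len(ai_verdicts)     # AI text that slips through
print(f"FPR={false_positive_rate:.0%}  FNR={false_negative_rate:.0%}")
```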
I copy-pasted GPT-3's reply to "what are Jim Crow Laws?": written by AI, it said. Then I copy-pasted the first paragraph from Wikipedia: written by AI. Then I wrote an answer myself: written by AI. Then I copy-pasted one word over and over, 100 or so times: it said "likely written by human but some by AI", then highlighted the whole text and noted it was likely written by AI. Hahaha.
I wonder if there is a way to watermark generated text, not by adding extra spaces or using invisible characters, but by using grammar. An example of a naive rule: "every 13th word MUST be a preposition". A more complex example: "every 11th word is a checksum of the previous 10; you encode a few bits in every word, e.g. noun=1, verb=2, article=3, etc."
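To make the checksum variant concrete, here is a minimal detector sketch (the toy part-of-speech lookup stands in for a real tagger, and the value mapping, block size of 11, and modulus are all made-up parameters). Unwatermarked text should pass the check only about 1/mod of the time, so a score near 1.0 over many blocks would be strong evidence of the watermark; the hard parts are making the generator satisfy the constraint without hurting fluency, and the fact that light paraphrasing breaks the checksum.

```python
# Sketch of the "every 11th word is a checksum of the previous 10" idea.
# Each word's part of speech maps to a small integer; a block passes if the
# 11th word's value equals the sum of the previous 10 values mod 5.
# The toy lexicon below stands in for a real POS tagger (e.g. nltk/spacy).

POS_VALUE = {"NOUN": 1, "VERB": 2, "ARTICLE": 3, "ADJ": 4, "OTHER": 0}

TOY_TAGS = {  # hypothetical mini-lexicon, purely for illustration
    "the": "ARTICLE", "a": "ARTICLE", "cat": "NOUN", "dog": "NOUN",
    "runs": "VERB", "sleeps": "VERB", "quick": "ADJ", "lazy": "ADJ",
}

def word_values(text: str) -> list[int]:
    """Map each word to the small integer used by the watermark scheme."""
    return [POS_VALUE[TOY_TAGS.get(w.lower(), "OTHER")] for w in text.split()]

def watermark_score(text: str, block: int = 11, mod: int = 5) -> float:
    """Fraction of consecutive 11-word blocks whose last word matches the
    checksum of the previous ten: ~1/mod for plain text, ~1.0 if watermarked."""
    values = word_values(text)
    blocks = [values[i:i + block] for i in range(0, len(values) - block + 1, block)]
    if not blocks:
        return 0.0
    hits = sum(1 for b in blocks if b[-1] == sum(b[:-1]) % mod)
    return hits / len(blocks)

if __name__ == "__main__":
    sample = "the quick cat runs and the lazy dog sleeps near a fence today"
    print(f"watermark score: {watermark_score(sample):.2f}")  # unwatermarked -> low
```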
- it thinks AI wrote parts of my handwritten texts
- King James Genesis 29 is apparently fully AI-written
- a wall of text copied straight from ChatGPT? Only partially AI-written.
- the second chapter of Harry Potter? AI-written parts
In fact, I could not yet find a sample that was not at least partially AI-written according to this.
> apart from videotaping themselves writing the essay, or only being allowed to type it on an airgapped machine owned by the university
Doesn't feel very feasible to implement.
There may also be escalating social and perhaps legal penalties too.
And that is without even considering bots.