Catching ChatGPT cheaters is tough: Is it bursty?
OpenAI's program to catch chatbot cheaters is very unreliable, write Armin Alimardani and Emma A. Jane on The Conversation.
The company admits that its classifier for indicating AI-written text "accurately identifies only 26% of AI-generated text (true positive) while incorrectly labelling human prose as AI-generated 9% of the time (false positive)," they write.
Edward Tian, a Princeton computer science major, came up with a more promising option called GPTZero on his winter break. His app analyzes "perplexity" (complexity) and "burstiness" (the variation between sentences) to identify AI authorship. Bots are lower on both than humans.
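GPTZero's exact scoring isn't described here, but burstiness can be roughly illustrated: human prose mixes short and long sentences, while chatbot prose tends toward a steady rhythm. A minimal sketch, using sentence-length variation as a stand-in for the per-sentence perplexity a real detector would compute:

```python
import statistics

def burstiness(text):
    """Crude proxy for 'burstiness': how much sentence lengths vary,
    measured as the coefficient of variation. A real detector like
    GPTZero scores per-sentence perplexity with a language model;
    this toy version only counts words per sentence."""
    sentences = [s.strip() for s in
                 text.replace("!", ".").replace("?", ".").split(".")
                 if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths) / statistics.mean(lengths)

human = ("I went a little crazy. Berserk, even. Then, after a long week "
         "of grading papers and second-guessing myself, I calmed down.")
bot = ("Justice is a cornerstone of the rule of law. "
       "Justice is fundamental to social order. "
       "Justice protects individual rights.")

print(burstiness(human) > burstiness(bot))  # varied human prose scores higher
```

The sample sentences are made up for illustration; the point is only that uniform, evenly paced sentences score low on this measure.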
A would-be cheater can go online to find tools that try to mislead AI classifiers by replacing words with synonyms, Alimardani and Jane write. But the synonyms can be “tortured.” For example, it's a red flag when "big data" becomes "colossal information."
The two professors asked ChatGPT to write an essay on justice, then copied it into GPT-Minus1, which offered to “scramble” ChatGPT text with synonyms. It changed 14 percent of the words, turning the essay into gibberish.
ChatGPT: "Justice is a cornerstone of the rule of law and is fundamental to the preservation of social order and the protection of individual rights . . . Criminal justice involves the fair and impartial enforcement of laws . . . "
GPT-Minus1: "Justness is a cornerstone of the rule of practice of law and is first harmonic to the saving of social order and the tribute of soul rights . . . Outlaw justice involves the funfair and colorblind undefined of Torah . . . "
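A scrambler like this is easy to caricature: look each word up in a thesaurus and swap in a synonym, blind to context. A minimal sketch, using a hypothetical four-entry thesaurus drawn from the passage above (the real tool's word list and swap rate are not public):

```python
def scramble(text):
    # Hypothetical mini-thesaurus; real scramblers draw on a full
    # one, which is how "tortured" picks like "funfair" appear.
    synonyms = {"justice": "justness", "fair": "funfair",
                "impartial": "colorblind", "laws": "Torah"}
    out = []
    for word in text.split():
        key = word.lower().rstrip(".,")
        if key in synonyms:
            swap = synonyms[key] + word[len(key):]  # keep punctuation
            if word[0].isupper():
                swap = swap.capitalize()
            out.append(swap)
        else:
            out.append(word)
    return " ".join(out)

print(scramble("Justice involves the fair and impartial enforcement of laws."))
# → Justness involves the funfair and colorblind enforcement of Torah.
```

Because the swap ignores context entirely, the output fools a statistical detector while reading as gibberish to any human grader.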
I remember when I got my first copy of Roget's Thesaurus. I went a little crazy (berserk, bonkers, demented, delirious) too, but then I calmed down.
When they copied the GPT-Minus1 version of the justice essay back into GPTZero, it concluded the "text is most likely human written but there are some sentences with low perplexities."
Another proposal is for AI-written text to carry a “watermark” that software can detect. This works by limiting the words the AI can use; human writers are likely to use some of the off-limits words, showing that the text wasn't generated by AI. This is imperfect too, write Alimardani and Jane.
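The limited-vocabulary idea can be sketched as a toy: hash each word to split the vocabulary into an allowed ("green") half and an off-limits half. A watermarked generator would stick to green words, so its text scores near 100 percent green, while a human, choosing freely, lands near 50 percent. The hash rule and the 50/50 split are illustrative assumptions; actual watermarking proposals vary the split based on surrounding tokens.

```python
import hashlib

def is_green(word):
    # Toy rule: hash the word and call half the vocabulary "green"
    # (allowed to the watermarked generator). Illustrative only.
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    return digest[0] % 2 == 0

def green_fraction(text):
    """Fraction of a text's words on the allowed list. A generator
    restricted to green words scores near 1.0; a human writer, who
    uses off-limits words too, hovers near 0.5."""
    words = text.split()
    return sum(is_green(w) for w in words) / len(words)
```

A detector would flag any passage whose green fraction sits far above chance, since that pattern is vanishingly unlikely in unconstrained human writing.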
"AI-generated text detectors will become increasingly sophisticated," they write. "Anti-plagiarism service TurnItIn recently announced a forthcoming AI writing detector with a claimed 97% accuracy."
But the text generators are improving too. "As this arms race continues, we may see the rise of 'contract paraphrasing.' Rather than paying someone to write your assignment, you pay someone to rework your AI-generated assignment to get it past the detectors."
Or learn to write? I guess not.
Chatbot cheating is soaring, writes Vishwam Sankaran of the Independent. In a Study.com survey of 203 teachers, 26 percent had caught a student cheating using ChatGPT. It's only been out for a few months.