AI "Detection" Tools - No Silver Bullets

Most of our work is with public libraries and consortia, but we do occasionally consult with academic and K-12 library systems on AI-related stuff. (For the record, I'll never claim to be a pedagogical expert, and I know LLMs can make an educator's job tougher in a lot of ways. So I try not to get involved in any "should schools use AI" discussions!)

As part of this, I'm frequently asked about AI detection tools. My short answer is that they are not reliable enough to be trusted. There is no silver bullet that can detect AI-written text 100% of the time without false positives ("a human really did write this!") or false negatives (LLM-generated content passing as human-written).
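
To make the false-positive problem concrete, here's a quick back-of-the-envelope sketch. All of the numbers are hypothetical, but they show why even a seemingly accurate detector can't be trusted when most of the writing it sees is human:

```python
# Hypothetical rates, for illustration only -- no real detector
# publishes numbers this clean.
base_rate = 0.10            # assume 10% of submissions are actually LLM-written
sensitivity = 0.95          # detector flags 95% of true LLM text
false_positive_rate = 0.05  # detector wrongly flags 5% of human text

submissions = 1000
llm_texts = submissions * base_rate          # 100 LLM-written
human_texts = submissions * (1 - base_rate)  # 900 human-written

true_flags = llm_texts * sensitivity             # 95 correct flags
false_flags = human_texts * false_positive_rate  # 45 humans wrongly accused

precision = true_flags / (true_flags + false_flags)
print(f"Flagged texts that are actually LLM-written: {precision:.0%}")
# -> 68%: roughly one in three accusations lands on a real human writer
```

The exact figures are invented, but the shape of the problem isn't: when most of the writing being checked is human, even a small false-positive rate means a meaningful share of accusations hit innocent writers.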

Now, obviously we can tell sometimes, right? People often leave in obvious tells, like including the typical LLM "follow-up" offer as part of the text. A creative writing sample that ends with "Let me know if you'd like me to make any changes to this piece!" is hard to explain as anything other than obvious LLM usage. And of course sometimes you can detect LLM usage just from the overall content and context: if a 5th-grader who struggles to read Captain Underpants books starts turning in "Young Sheldon"-level science projects, it's worth a closer look.
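
Spotting those leftover artifacts is about the only part of this that a simple script can handle with confidence. As a toy illustration (the phrase list and helper function below are my own invention, and the absence of a match proves nothing), a check like this is roughly as far as reliable automated detection goes:

```python
import re

# Boilerplate phrases LLMs tend to append when the writer forgets to
# trim them. Illustrative, not exhaustive.
TELL_PHRASES = [
    r"let me know if you'd like",
    r"i hope this helps",
    r"as an ai language model",
    r"feel free to ask",
]

def has_obvious_tell(text: str) -> bool:
    """Return True only for blatant leftovers; a False result proves nothing."""
    lowered = text.lower()
    return any(re.search(phrase, lowered) for phrase in TELL_PHRASES)

sample = "The storm passed. Let me know if you'd like me to make any changes to this piece!"
print(has_obvious_tell(sample))  # True
```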

For any gray areas, though, where the writing is plausibly human, I'd hesitate to believe any AI detector over the supposed writer.

One of the many guides that Wikipedia provides for its volunteer editors is this super comprehensive list of tips and tactics for detecting LLM-generated text.

Wikipedia:Signs of AI writing

There is a bit in here I find fascinating:

💡
Research shows that people who use LLMs heavily themselves can correctly determine whether an article was generated by AI about 90% of the time, which means that if you are an expert user of LLMs and you tag 10 pages as being AI-generated, you've probably falsely accused one editor. People who don't personally use LLMs much do only slightly better than random chance (in both directions) for identifying AI-generated articles.

That statistic comes from this paper (preprint here):

People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text

So I guess the best way to reliably detect LLM usage in writing is to regularly use LLMs to write? Hmmm...