False Positives, Real Harm - Stop Using AI Detectors

Pablo De Lucca

20 Jun 2025 - 6 min read

Key takeaways
  • AI detectors are unreliable, producing unacceptably high rates of false positives and negatives.
  • Detectors harm students, creating distrust and exhibiting bias against vulnerable groups.
  • Embrace transparency and teach AI literacy instead of policing students.

Introduction

The emergence of generative AI has fundamentally disrupted academic assessment. Tools like ChatGPT can now produce essays and solve problems with unprecedented speed and quality, forcing educators to confront a critical question: How can we foster academic honesty and critical thinking when every student has easy access to generative AI?

In the rush to police academic integrity, many institutions turned to AI detection software to try to identify machine-generated text. This reaction, however, was based on a flawed premise - as experts at MIT have bluntly put it: AI detectors don't work. The continued reliance on this faulty technology, often driven by a misunderstanding of its inner workings, creates significant risks for students and undermines the very goals of education. This article will therefore provide a focused analysis: we will break down how AI detectors function, how they fail, and detail the negative impact on students and the broader academic community. Moving beyond critique, we will also outline a forward-thinking approach based on transparency, reframing the challenge of AI as an opportunity for pedagogical innovation.

What are AI detectors?

AI detectors are tools designed to differentiate between human and machine-generated content by analyzing writing style, structure, and linguistic patterns. However, their core function is not to provide proof, but to estimate the likelihood that a text was written by a Large Language Model (LLM) like ChatGPT.

The key output of any AI detector is a probability score—a guess based on statistical patterns. This is a crucial point of frequent misunderstanding. If a tool flags a paper as "80% likely AI-generated," it is not claiming that 80% of the content was written by an LLM. Rather, it is estimating an 80% chance that the entire piece of text came from a machine. It's a verdict on the whole, not a measure of its parts, and it remains an educated guess at best.
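
To make the point concrete, here is a purely hypothetical sketch of what such an output looks like; the field names are illustrative and do not correspond to any real product's API:

```python
# Hypothetical detector output; the names here are illustrative, not a real API.
report = {
    "document": "essay.txt",
    "p_ai_generated": 0.80,  # confidence that the WHOLE document is machine-written
}

# 0.80 does NOT mean "80% of the essay was written by an LLM".
# It means the classifier is 80% confident the entire document came from one.
```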

How AI detection works

At their core, most AI detectors are a type of AI model themselves. Specifically, they are machine learning classification models. They are trained on vast datasets of human-written and AI-written text and learn to spot the statistical differences in writing style, structure, and linguistic patterns.
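
A minimal sketch of what such a classifier looks like, assuming a tiny labeled corpus and scikit-learn; real detectors are trained on far larger datasets and models, and this reflects no specific product:

```python
# Toy "human vs AI" text classifier; illustrative only, not any vendor's method.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: texts labeled 0 (human-written) or 1 (AI-generated).
texts = ["an essay a student actually wrote ...", "an essay ChatGPT produced ..."]
labels = [0, 1]

detector = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),  # turn text into word/phrase frequency features
    LogisticRegression(),                 # learn which patterns separate the two classes
)
detector.fit(texts, labels)

# The output is a single probability for the whole document - a statistical guess, not proof.
print(detector.predict_proba(["a new essay to check"])[0, 1])
```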

Companies that build AI detectors, such as GPTZero, Grammarly, or Scribbr, often say that their models look at features like perplexity and burstiness - measures of how predictable and how varied the text is - to determine the likelihood that a text is AI-generated. These hand-crafted measures were traditionally used and did offer some visibility into the tools' decision-making process, but they are now largely obsolete. Today, the text is simply fed to the model, which builds internal representations only it understands and makes its decision from those.
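
For readers curious what those two measures actually are, here is a rough sketch: perplexity as how predictable a text is to a language model, and burstiness as how much that predictability varies from sentence to sentence. The use of GPT-2 via Hugging Face transformers and the crude sentence splitting are assumptions for illustration, not what any commercial detector does:

```python
# Rough illustration of perplexity and burstiness; GPT-2 is a stand-in scoring model.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

def perplexity(text: str) -> float:
    """How "surprised" the language model is by the text (lower = more predictable)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean negative log-likelihood per token
    return float(torch.exp(loss))

def burstiness(text: str) -> float:
    """Spread of sentence-level perplexity (higher = more varied, often read as more human)."""
    sentences = [s.strip() for s in text.split(".") if len(s.split()) > 3]  # crude splitter
    scores = torch.tensor([perplexity(s) for s in sentences])
    return float(scores.std()) if len(scores) > 1 else 0.0
```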

This design makes AI detectors inherently a "black box": their internal decision-making is entirely opaque. They don't explain why they think a piece of text is AI-generated; they just output a probability score.

How AI detection fails

There are two ways in which AI detectors can fail:

  • False negatives: The detector fails to flag a document that was AI-generated. This failure mode highlights the tools' ineffectiveness, because users can easily bypass detection. A comprehensive 2024 study from the University of Pennsylvania found that detectors are "easily fooled" by simple techniques like minor manual edits, paraphrasing, or using other AI tools to "humanize" the text. People who understand how detectors work can circumvent them "80% to 90% of the time". This vulnerability renders the tools useless for reliably catching academic misconduct.

  • False positives: This is the more dangerous failure, where a student's original work is wrongly flagged as AI-generated. The consequences can be devastating, leading to accusations of dishonesty, failing grades, disciplinary action, and significant emotional distress. False positive rates vary from tool to tool. Turnitin, for example, initially stated that its detector's false positive rate was under 1%, but later walked that claim back without disclosing the actual rate. Even a rate that sounds low adds up quickly at institutional scale, as the back-of-the-envelope calculation below shows.
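
The numbers in the sketch below (a 1% false positive rate and 5,000 human-written submissions per term) are illustrative assumptions, not figures from any vendor, but they show why even a seemingly small error rate is unacceptable when applied at scale:

```python
# Illustrative assumptions only - not vendor figures.
false_positive_rate = 0.01   # 1% of genuinely human-written texts get flagged as AI
human_submissions = 5_000    # human-written submissions at one institution per term

falsely_accused = false_positive_rate * human_submissions
print(f"Expected wrongful accusations per term: {falsely_accused:.0f}")
# -> Expected wrongful accusations per term: 50
```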

The impact of AI detection on students

To see that I was being accused of using AI when I knew in my heart I didn’t, it was really, really stressful, because I had no idea how to even prove my innocence.

Maggie Seabolt, student at Liberty University

Relying on detectors creates an atmosphere of suspicion and distrust. As one professor wrote, it "kills the trust that any teaching relationship depends on." Students are forced to defend their own work against the verdict of a black-box algorithm, a stressful and often demoralizing experience. Visit any Reddit thread about AI detection to see how students feel about it. Research has highlighted the significant psychological and material impacts of false accusations on students, undermining their confidence and their relationship with the institution.

Furthermore, AI detection has a disproportionate impact on vulnerable students. The most damning indictment of AI detectors is their inherent bias: they disproportionately penalize those whose writing deviates from what the model considers "normal". Concretely, AI detectors are more likely to flag content written by:

  • Non-native English speakers: A Stanford study found detectors flagged over half of essays written by non-native speakers as AI-generated. As one Berkeley article argues, this traps students: they are policed for writing differently, but also for using tools that could help them bridge the language gap.
  • Neurodivergent students: As Bloomberg reported, they are often accused of using AI to cheat because their writing styles may not conform to the patterns the detectors expect.
  • Black students: They are more than twice as likely as their white peers to be falsely accused of using AI, according to a recent report from Common Sense Media.

Widespread institutional retreat

The unreliability and ethical issues of AI detectors have led to a widespread institutional retreat in the US. A growing list of universities, including MIT, Yale, Northwestern, and the University of Pittsburgh, have either banned or strongly recommended against the use of these tools.

The University of Pittsburgh's Teaching Center stated it could not endorse the tools due to the "substantial risk of false positives and the consequential issues such accusations imply." This sentiment is echoed across academia. The message from leading institutions is clear: the risk of falsely accusing an innocent student is too high, and the tools are too unreliable to be a basis for academic judgment.

Across the Atlantic, it is also understood that the output of these tools cannot be trusted, although no decisions to ban or restrict their use have been made publicly.

How do I deal with AI then?

AI is not going away: banning it is unsustainable, detecting it is a failed strategy, and it is being abused by students in ways that cut against the entire academic project. At the same time, AI is set to become the most important technology of our generation and is likely to be an integral part of any future job. The challenges for education are more pressing than ever. So how should academic institutions deal with AI?

Rethinking assessment for the age of AI

The first step is to reframe AI from a threat to a tool. Used correctly, AI can be a powerful partner for learning—a Socratic tutor that helps students brainstorm, a patient assistant that explains complex concepts, and a tool that improves accessibility. The goal shouldn’t be to prevent students from using AI, but to teach them how to use it responsibly, ethically, and effectively.

This requires a fundamental shift in how we assess learning. For decades, the final essay served as a clear window into a student's thinking. However, when a machine can generate a polished essay in seconds, the essay itself ceases to be a reliable measure of a student's effort or understanding. Our focus must therefore pivot from the final, polished product to the messy, insightful process of its creation. We need to see the journey, not just the destination.

This is the philosophy that drives us at DidactLabs. We believe the solution to the AI dilemma isn’t detection, but transparency.

Instead of trying to estimate if students are using AI, our platform shows you how and why they are using it. We provide a transparent writing environment where educators can see the entire process: what a student writes themselves, what they paste from outside sources, and how they interact with their integrated AI assistant. This visibility restores integrity to the assessment process and, more importantly, transforms AI from a cheating device into a teachable tool.

By focusing on the process, we empower educators to guide students toward the ethical and productive uses of AI, fostering the critical thinking and AI literacy skills they will need for the future. It’s time to move beyond the flawed promise of detection and embrace a new social contract for AI in the classroom - one built on trust, transparency, and a shared commitment to genuine learning.

Take control of AI in your classroom today

Join the growing community of educators preparing students for an AI-powered future while maintaining academic integrity.
Start in minutes, for free.