How We Test AI Detectors

We regularly check the tools in our directory to make sure they work well. For each type of tool, we follow a set plan so that every tool is tested fairly.

Disclaimer: These tools are constantly improving, so your results may not exactly match ours. We do not share every detail of our tests because that information could be used to manipulate how tools perform in them.

We use the latest AI models available to the public for all our tests. We test each tool three times and score it based on its performance in each test.

Scores are available under the "How Accurate Is This Tool?" tab on each tool's page.

Here are the scores:
★☆☆☆☆ - Bad: The tool failed to identify AI content or mistakenly flagged human content as AI.
★★☆☆☆ - Fair: The tool's result was uncertain (confidence below 60%). It might still work adequately for easier tasks.
★★★☆☆ - Good: The tool worked well, but its confidence was not reassuring (below 75%).
★★★★☆ - Very Good: The tool performed very well (confidence below 90%), or excelled without reporting a numerical score.
★★★★★ - Excellent: The tool excelled and gave a very confident result (above 90%).

Note: The top score is given only to tools that also provide a numerical value indicating their confidence.
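The rubric above can be read as a simple mapping from a single test result to a star count. The following is a minimal sketch of that reading, not the scoring code actually used: the function name `star_rating` and its parameters are hypothetical, the percentages are assumed to be the confidence reported by the tool, and a result of exactly 90% (which the rubric leaves open) is assumed to fall under Very Good.

```python
from typing import Optional


def star_rating(correct: bool, confidence: Optional[float]) -> int:
    """Map one test result to a 1-5 star score (illustrative only).

    correct:    True if the tool's verdict matched the real origin of the sample.
    confidence: confidence reported by the tool, in percent, or None if the
                tool gives no numerical value.
    """
    if not correct:
        # Missed AI content or flagged human content as AI.
        return 1
    if confidence is None:
        # Assumption: a correct verdict without a number is capped at four
        # stars, since five stars require a numerical confidence value.
        return 4
    if confidence < 60:
        return 2
    if confidence < 75:
        return 3
    if confidence <= 90:
        # Assumption: exactly 90% is treated as Very Good.
        return 4
    return 5
```

For example, under these assumptions `star_rating(True, 82.0)` gives four stars, while `star_rating(True, None)` also gives four stars because no numerical confidence was reported.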

Testing Specific Types of Tools

AI Content Detectors:
We test AI content detectors with three types of text: human-written, AI-generated with some changes (mostly made through prompts), and fully AI-generated. Each text is about 300 words long, a length that makes detection harder for the tool.

Plagiarism Detectors:
We test plagiarism detectors using three texts about one page long. One is a human-written academic text with proper citations, another is similar but missing half of its citations, and the last one is completely AI-generated.

AI Image Detectors:
We test AI image detectors with three images: a photo taken by a human, a photo slightly altered by AI, and a completely AI-generated photo. Each photo includes a person.

AI Video Detectors:
We test AI video detectors with three short video clips (about 20 seconds each) with sound. The first is a raw human-filmed video, the second is a deepfake, and the third is a fully synthetic AI-generated video.

AI Voice Detectors:
For AI voice detectors, we use three audio clips (about 20 seconds each): an original voice recording of a well-known person, a modified version of this person's voice, and a completely synthetic voice. There's some background noise to mimic a real-life phone call.

AI Code Detectors:
We test AI code detectors with three versions of a simple application code. The first is written by a human, the second is a mix of human and AI input, and the third is fully AI-generated.
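The six test sets described above share the same shape: three samples per detector type, ranging from human-made to fully AI-generated. As a rough summary, they could be written down as a lookup table like the one below. The names `TEST_PLAN` and `samples_for` and the dictionary keys are hypothetical and chosen only for illustration; the sample descriptions are taken from the text above.

```python
from typing import Dict, List

# Hypothetical summary of the three samples used for each detector type.
TEST_PLAN: Dict[str, List[str]] = {
    "content": [
        "human-written text (about 300 words)",
        "AI-generated text with some changes (about 300 words)",
        "fully AI-generated text (about 300 words)",
    ],
    "plagiarism": [
        "human-written academic text with proper citations",
        "similar text missing half of its citations",
        "completely AI-generated text",
    ],
    "image": [
        "photo taken by a human",
        "photo slightly altered by AI",
        "completely AI-generated image",
    ],
    "video": [
        "raw human-filmed video (about 20 seconds)",
        "deepfake video (about 20 seconds)",
        "fully synthetic AI-generated video (about 20 seconds)",
    ],
    "voice": [
        "original recording of a well-known person",
        "modified version of that person's voice",
        "completely synthetic voice",
    ],
    "code": [
        "application code written by a human",
        "application code mixing human and AI input",
        "fully AI-generated application code",
    ],
}


def samples_for(tool_type: str) -> List[str]:
    """Return the three sample descriptions for a detector type."""
    return TEST_PLAN[tool_type]
```

Calling `samples_for("voice")` returns the three audio sample descriptions in the order they are listed above.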

If you have valuable feedback regarding our testing protocol, feel free to send us an email at [email protected].