We regularly test the tools in our directory to verify that they work as advertised. For each type of tool, we follow a fixed protocol so that every tool is tested fairly.
Disclaimer: These tools are constantly improving, so the results you see may not match ours exactly. We also do not share every detail of our tests, since that information could be used to manipulate how the tools perform in them.
We use the latest publicly available AI models for all our tests. Each tool is tested three times and scored on its performance in each test.
Scores are available under the "How Accurate Is This Tool?" tab on each tool's page.
Here are the scores:
★☆☆☆☆ - Bad: The tool misidentified AI content or mistakenly flagged human content as AI.
★★☆☆☆ - Fair: The tool's result was uncertain (20 - 60%). It may still work for easier tasks.
★★★☆☆ - Good: The tool worked well, but with only moderate confidence (61 - 75%).
★★★★☆ - Very Good: The tool performed very well (76 - 90% accuracy), or excelled without providing a numerical score.
★★★★★ - Excellent: The tool excelled and gave a very confident result (>90%).
For example: a tool achieving 82% accuracy would be considered very good (★★★★☆).
* Top scores are given only to tools that also provide a numerical value indicating their confidence.
* If a tool gets one star in any category, the overall score is rounded down.
* If a tool receives two one-star ratings, the overall score is automatically one star. (The sketch below shows how these rules combine.)
* Plagiarism detectors are rated on a different basis, as covered in their own section below.
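To make these rules concrete, here is a minimal Python sketch of how the thresholds and rounding rules above combine. The function names are our own illustration, and the half-up rounding in the no-one-star case is an assumption; the rules above only pin down the two one-star cases.

```python
import math

def accuracy_to_stars(percentage: float, has_numeric_score: bool = True) -> int:
    """Map a single test's reported percentage to a star rating."""
    if percentage > 90 and has_numeric_score:
        return 5   # Excellent: very confident result (>90%)
    if percentage > 75:
        return 4   # Very Good: 76 - 90%, or excellent without a numeric score
    if percentage > 60:
        return 3   # Good: 61 - 75%
    if percentage >= 20:
        return 2   # Fair: unsure result (20 - 60%)
    return 1       # Bad: the tool misidentified the content

def overall_stars(test_scores: list[int]) -> int:
    """Combine the three per-test ratings into one overall rating."""
    if test_scores.count(1) >= 2:
        return 1                      # two one-star ratings: one star overall
    average = sum(test_scores) / len(test_scores)
    if 1 in test_scores:
        return math.floor(average)    # any one-star rating: round down
    return math.floor(average + 0.5)  # assumed: ordinary half-up rounding

# Example from above: 82% accuracy in every test -> Very Good (4 stars)
print(overall_stars([accuracy_to_stars(82.0)] * 3))
```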
AI Content Detectors:
We test AI content detectors with three types of text: human-written, AI-generated and then modified (mostly via prompting), and fully AI-generated. Each text is about 300 words long; short texts like these are harder for detectors to classify.
Plagiarism Detectors:
We test plagiarism detectors using three texts, each about one A4 page long. The first is a human-written academic text with proper citations (i.e., original text). The second is copied entirely from the web (from different sources), partially rephrased by AI, and stripped of its citations. The third is copied verbatim from various sources, unmodified and uncited.
Here are the scores:
★☆☆☆☆ - Bad: Plagiarized text: 0 - 20% of plagiarism detected. Original text: <85% originality.
★★☆☆☆ - Fair: Plagiarized text: 21 - 40% of plagiarism detected. Original text: >85% originality.
★★★☆☆ - Good: Plagiarized text: 41 - 60% of plagiarism detected. Original text: >90% originality.
★★★★☆ - Very Good: Plagiarized text: 61 - 80% of plagiarism detected. Original text: >95% originality.
★★★★★ - Excellent: Plagiarized text: 81 - 100% of plagiarism detected. Original text: 100% originality.
For example: a tool detecting 52% plagiarism would be considered good (★★★☆☆).
* Top scores are given only to tools that also provide a numerical value. (The sketch below illustrates this scale.)
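As with the AI-detection scale, a short Python sketch shows how the two measurements map to stars. The function name is our own illustration, and we assume both percentages are read directly from the tool's report.

```python
def plagiarism_stars(plagiarism_detected: float, originality_reported: float) -> int:
    """Map results on the plagiarized and the original test texts to a star rating."""
    if plagiarism_detected > 80 and originality_reported == 100:
        return 5   # Excellent: 81 - 100% detected, 100% originality
    if plagiarism_detected > 60 and originality_reported > 95:
        return 4   # Very Good: 61 - 80% detected, >95% originality
    if plagiarism_detected > 40 and originality_reported > 90:
        return 3   # Good: 41 - 60% detected, >90% originality
    if plagiarism_detected > 20 and originality_reported > 85:
        return 2   # Fair: 21 - 40% detected, >85% originality
    return 1       # Bad: 0 - 20% detected, or <85% originality

# Example from above: 52% plagiarism detected (with, say, 96% originality) -> Good
print(plagiarism_stars(52.0, 96.0))  # prints 3
```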
AI Image Detectors:
We test AI image detectors with three images: a photo taken by a human, a photo slightly altered by AI, and a fully AI-generated image. Each image includes a person. For AI art detectors, we follow the same logic, but the images are brush-style paintings rather than photographs.
AI Video Detectors:
We test AI video detectors with three short video clips (about 20 seconds each) with sound. The first is a raw human-filmed video, the second is a deepfake, and the third is a fully synthetic AI-generated video.
AI Voice Detectors:
For AI voice detectors, we use three audio clips (about 20 seconds each): an original voice recording of a well-known person, an AI-modified version of that person's voice, and a completely synthetic voice. Each clip includes background noise to mimic a real-life phone call.
AI Code Detectors:
We test AI code detectors with three versions of the code for a simple application. The first is written by a human, the second is a mix of human and AI input, and the third is fully AI-generated. A stand-in for the kind of sample we use is sketched below.
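To give a sense of scale, the human-written sample might look something like the small Python utility below. This specific script is a hypothetical stand-in; we do not publish the real samples, for the reasons given in the disclaimer above.

```python
import sys
from collections import Counter

def top_words(text: str, n: int = 5) -> list[tuple[str, int]]:
    """Return the n most frequent words in the given text."""
    words = (w.strip('.,!?;:"()').lower() for w in text.split())
    return Counter(w for w in words if w).most_common(n)

if __name__ == "__main__":
    # Read text from stdin and print the five most common words with counts.
    for word, count in top_words(sys.stdin.read()):
        print(f"{count:4d}  {word}")
```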
If you have valuable feedback regarding our testing protocol, feel free to send us an email at info@detectortools.ai.