By Stealth Team
- AI Detection
A New Detection Level
On April 9th, Originality.ai published a blog post detailing their latest update. They claimed better performance than OpenAI, GPTZero, Copyleaks, and others, boasting an accuracy of 95.93% with a false positive rate of less than 2.5%. The announcement sent shockwaves through the detection and anti-detection space, and no doubt the StealthGPT team, among many others, went to work to see if it was all it was cracked up to be. What we found is that the Originality detection tool is completely unreliable and a total liability for any serious institution to run.
Initial Impressions
When we first ran our responses against Originality, we were shocked: it seemed to catch every single AI response we tested. We generated text from competitor services, even ones that claimed they could beat all AI detection tools, and those were flagged as AI as well. Our whole team was flabbergasted as we tried to understand how it was detecting so well.
False Positives
Our team, however, was determined to crack this. To our surprise, once we started inputting real human text into the detection tool, we found an overwhelming amount was being falsely detected as AI-generated — enough to show the service was severely flawed. Entire passages from the Bible were flagged as AI, as were essays that predate ChatGPT. We tested many real human samples and received, at best, mixed results showing both human- and AI-detected text; at worst, scores of 100% AI-generated.
Liability, Risks, and Consequences
It is now obvious to us, and to anyone who has put Originality through rigorous testing, that this update is deeply flawed and overly sensitive at ‘detecting AI text’. It was highly irresponsible of them to release it. The false positive problem is so bad that only a few days after announcing the update, the CEO emailed his subscribers to address it — yet he still claimed a false positive rate of only 2.5%. With no data or evidence presented, we have to assume that figure is fabricated; the entire blog post’s source amounts to “trust me bro”, with no real testing data shown. Based on our own testing, the real false positive rate is likely much higher.
Needless to say, using Originality.ai as a business or educational institution poses a significant liability risk. To our knowledge, the service has yet to be adopted by any major institution. Imagine the legal disaster that would follow if a school falsely accused someone of submitting AI-written work on the strength of Originality.ai’s verdict. The burden of proof and the standard of evidence for AI detection rest on the detection service and the institutions making accusations, not on the individual being accused. It is therefore incredibly important for institutions to seek out detection services with unquestionable precision.
Recommendations and Conclusion
We call on Originality.ai to make clear to the public that their AI detection service, in its current form, is not fit for use in any institutional setting. We should all demand that they make their methodologies and testing material public so that others can verify the integrity and accuracy of their service.
As for StealthGPT, we will continue to test against Originality and see if there is a way to avoid detection. However, it is our consensus that Originality is so sensitive to all forms of content, both AI and human, that it is completely unusable in any professional setting.
TL;DR: While Originality.ai's claims of superior performance and accuracy may have initially raised concerns among the AI detection and anti-detection communities, our findings suggest that their service is far from reliable. With an alarmingly high false positive rate and a lack of transparency in their testing methodology, Originality.ai currently poses a significant liability risk for any institution considering its implementation.
We hope that Originality.ai addresses these issues promptly, and we encourage them to be more transparent about their methods and results. In the meantime, we advise businesses and educational institutions to exercise caution when evaluating AI detection services and to prioritize those that demonstrate a proven track record of precision, reliability, and transparency.