arXiv:2511.09200v1 Announce Type: new 
Abstract: Large language models are increasingly used for many applications. To prevent illicit use, it is desirable to be able to detect AI-generated text. Training and evaluation of such detectors critically depend on suitable benchmark datasets. Several groups took on the tedious work of collecting, curating, and publishing large and diverse datasets for this task. However, it remains an open challenge to ensure high quality in all relevant aspects of such a dataset. For example, the DetectRL benchmark exhibits relatively simple patterns of AI-generation in 98.5% of the Claude-LLM data. These patterns may include introductory words such as "Sure! Here is the academic article abstract:", or instances where the LLM rejects the prompted task. In this work, we demonstrate that detectors trained on such data use such patterns as shortcuts, which facilitates spoofing attacks on the trained detectors. We consequently reprocessed the DetectRL dataset with several cleansing operations. Experiments show that such data cleansing makes direct attacks more difficult. The reprocessed dataset is publicly available.

A new paper titled 'Contamination in Generated Text Detection Benchmarks' highlights quality problems in the DetectRL benchmark: 98.5% of its Claude-LLM data contains simple, telltale patterns of AI generation, such as introductory phrases or refusals of the prompted task. Detectors trained on this data learn those patterns as shortcuts, which makes them easy to spoof. The authors reprocessed the dataset with several cleansing operations, and their experiments show that the cleansing makes direct attacks more difficult. This matters as large language models are used in more and more applications. The reprocessed dataset is publicly available.
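To make the kind of contamination described above concrete, here is a minimal Python sketch that flags samples opening with an introductory phrase or a refusal, and strips the introductory phrase. It is not the authors' cleansing procedure: the two regular expressions and the names `INTRO_PATTERN`, `REFUSAL_PATTERN`, `classify_sample`, and `strip_intro` are assumptions made up for illustration, and a real cleanup would likely need a broader, manually checked set of rules.

```python
import re

# Hypothetical patterns for illustration only; not the rules used by the authors.
# Matches openers such as "Sure! Here is the academic article abstract:".
INTRO_PATTERN = re.compile(
    r"^\s*(sure|certainly|of course)\b[^\n]*?\bhere(?:'s| is| are)\b[^\n]*?:\s*",
    re.IGNORECASE,
)
# Matches a few common ways a model declines the prompted task.
REFUSAL_PATTERN = re.compile(
    r"^\s*(i'm sorry|i am sorry|i apologize|i cannot|i can't|as an ai)\b",
    re.IGNORECASE,
)


def classify_sample(text: str) -> str:
    """Return 'refusal', 'intro', or 'clean' for one generated sample."""
    if REFUSAL_PATTERN.search(text):
        return "refusal"
    if INTRO_PATTERN.search(text):
        return "intro"
    return "clean"


def strip_intro(text: str) -> str:
    """Remove a leading introductory phrase if one matches; otherwise return text unchanged."""
    return INTRO_PATTERN.sub("", text, count=1)


if __name__ == "__main__":
    sample = "Sure! Here is the academic article abstract: We study detector robustness."
    print(classify_sample(sample))
    print(strip_intro(sample))
```

In a sketch like this, samples classified as refusals would presumably be dropped or regenerated rather than trimmed, since they contain no usable task output, while samples with an introductory phrase can be kept after stripping it.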
