Which LLMs are Difficult to Detect? A Detailed Analysis of Potential Factors Contributing to Difficulties in LLM Text Detection

stp2y, December 12, 2024

[Submitted on 18 Oct 2024 (v1), last revised 10 Dec 2024 (this version, v2)]

Authors: Shantanu Thorat and Tianbao Yang

Abstract: As LLMs increase in accessibility, LLM-generated texts have proliferated across several fields, such as scientific, academic, and creative writing. However, LLMs are not created equally; they may have different architectures and training datasets. Thus, some LLMs may be more challenging to detect than others. Using two datasets spanning four total writing domains, we train AI-generated (AIG) text classifiers using the LibAUC library – a deep learning library for training classifiers with imbalanced datasets. Our results in the Deepfake Text dataset show that AIG-text detection varies across domains, with scientific writing being relatively challenging. In the Rewritten Ivy Panda (RIP) dataset focusing on student essays, we find that the OpenAI family of LLMs was substantially difficult for our classifiers to distinguish from human texts. Additionally, we explore possible factors that could explain the difficulties in detecting OpenAI-generated texts.
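
The abstract notes that the classifiers are trained on imbalanced datasets. As a rough illustration of why a ranking metric such as AUROC is often preferred over plain accuracy in that setting, here is a minimal pure-Python sketch. It does not use LibAUC and is not the authors' evaluation code; the function names, the labels, and the scores are all hypothetical toy values chosen for illustration, and the abstract does not say which class is the minority or which metric was reported.

```python
def auroc(scores, labels):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. This is the pairwise (Mann-Whitney) form of AUROC.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(scores, labels, threshold=0.5):
    """Fraction of examples classified correctly at a fixed score threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical toy data: 1 = AI-generated, 0 = human (2 human vs. 8 AI texts).
labels = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1]

# A degenerate detector that gives every text the same score.
constant_scores = [0.9] * 10

# A detector that ranks most AI texts above the human ones.
ranked_scores = [0.2, 0.6, 0.4, 0.7, 0.8, 0.9, 0.95, 0.85, 0.75, 0.65]

constant_acc = accuracy(constant_scores, labels)
ranked_acc = accuracy(ranked_scores, labels)
constant_auc = auroc(constant_scores, labels)
ranked_auc = auroc(ranked_scores, labels)
```

On this toy data both detectors reach the same accuracy at a 0.5 threshold, but only the second one ranks AI texts above human texts, which AUROC captures and accuracy hides.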

Submission history

From: Shantanu Thorat
[v1] Fri, 18 Oct 2024 21:42:37 UTC (1,211 KB)
[v2] Tue, 10 Dec 2024 15:44:59 UTC (1,211 KB)