EditLens: Quantifying the Extent of AI Editing in Text
Katherine Thai, Bradley Emi, Elyas Masrour, Mohit Iyyer
TL;DR
EditLens quantifies AI involvement in text editing on a continuous scale rather than with a binary label. The authors model homogeneous mixed text as y = đ_λ(x; z) and define a change magnitude Î(x,y) through similarity-based targets, then train a regression head that maps the edited text y to a score reflecting the extent of AI editing. They construct a large homogeneous mixed-text dataset with synthetic AI edits, derive two intermediate supervision metrics (cosine similarity of Linq-Embed-Mistral embeddings and soft n-grams), validate these metrics against human judgments, and fine-tune a Mistral/Llama backbone with QLoRA. EditLens achieves state-of-the-art performance on binary and ternary detection tasks and generalizes to unseen prompts, unseen domains, and human-edited AI text. The paper also includes case studies on Grammarly and BEEMO. The approach supports flexible policy decisions, reduces false positives, and contributes a publicly available dataset and models to spur further research on measuring AI usage in writing.
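The similarity-based target described above can be illustrated with a minimal sketch. It assumes the change magnitude is taken as one minus the cosine similarity between the original and the edited text, which is one plausible reading rather than the authors' confirmed formula. It also swaps in a hypothetical bag-of-words vector for the Linq-Embed-Mistral embedding so that the example runs without any model download. The function names `bow_vector` and `edit_magnitude` are illustrative only.

```python
import math
from collections import Counter


def bow_vector(text):
    """Hypothetical stand-in for a sentence embedding: lowercase token counts."""
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an empty text as sharing nothing
    return dot / (norm_u * norm_v)


def edit_magnitude(source, edited):
    """Assumed change score: near 0.0 for unchanged text, 1.0 for no shared tokens."""
    return 1.0 - cosine_similarity(bow_vector(source), bow_vector(edited))


original = "the results was good and we was happy"
light_edit = "the results were good and we were happy"
heavy_edit = "our findings exceeded expectations across every benchmark"

print(edit_magnitude(original, original))
print(edit_magnitude(original, light_edit))
print(edit_magnitude(original, heavy_edit))
```

With a real embedding model in place of `bow_vector`, a score of this kind could serve as a regression target paired with the edited text, so that the trained model no longer needs the original at inference time.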
Abstract
A significant proportion of queries to large language models ask them to edit user-provided text, rather than generate new text from scratch. While previous work focuses on detecting fully AI-generated text, we demonstrate that AI-edited text is distinguishable from human-written and AI-generated text. First, we propose using lightweight similarity metrics to quantify the magnitude of AI editing present in a text given the original human-written text and validate these metrics with human annotators. Using these similarity metrics as intermediate supervision, we then train EditLens, a regression model that predicts the amount of AI editing present within a text. Our model achieves state-of-the-art performance on both binary (F1=94.7%) and ternary (F1=90.4%) classification tasks in distinguishing human, AI, and mixed writing. We show not only that AI-edited text can be detected, but also that the degree of change made by AI to human writing can be detected, which has implications for authorship attribution, education, and policy. Finally, as a case study, we use our model to analyze the effects of AI edits applied by Grammarly, a popular writing assistance tool. To encourage further research, we commit to publicly releasing our models and dataset.
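A regression score can be turned into the binary and ternary labels mentioned above by thresholding. The sketch below shows one way this might be done. The score range of [0, 1], the threshold values, and the label names are all hypothetical choices for illustration, not values reported by the authors.

```python
def classify_ternary(score, low=0.2, high=0.8):
    """Map an assumed [0, 1] editing score to human / mixed / ai.

    The thresholds `low` and `high` are hypothetical; in practice they
    would be tuned on held-out data to match a target false positive rate.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if score < low:
        return "human"
    if score > high:
        return "ai"
    return "mixed"


def classify_binary(score, threshold=0.5):
    """Collapse the same score to two classes with a single hypothetical cutoff."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    return "ai" if score >= threshold else "human"


for s in (0.05, 0.5, 0.95):
    print(s, classify_ternary(s), classify_binary(s))
```

Because the underlying score is continuous, moving `low`, `high`, or `threshold` changes how strict the resulting labels are without retraining the model.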
