The Erosion of LLM Signatures: Can We Still Distinguish Human and LLM-Generated Scientific Ideas After Iterative Paraphrasing?

Sadat Shahriar; Navid Ayoobi; Arjun Mukherjee

TL;DR

This work tackles the problem of distinguishing human- from LLM-generated scientific ideas after iterative paraphrasing. It builds a large, multi-stage paraphrase dataset by extracting research problems, generating ideas with six LLMs, and applying five paraphrase stages, then evaluates multiple classifiers and embeddings. The key finding is a substantial erosion of detectable LLM signatures across paraphrase stages, with detection performance declining by 25.4% on average, although incorporating the research problem (RP) as context improves detection by up to 2.97%. The study highlights the limits of surface-level linguistic cues for attribution and suggests directions for stronger signal extraction through RP-idea integration and deeper reasoning traces.

Abstract

With the increasing reliance on LLMs as research agents, distinguishing between LLM- and human-generated ideas has become crucial for understanding the cognitive nuances of LLMs' research capabilities. While detecting LLM-generated text has been extensively studied, distinguishing human- from LLM-generated scientific ideas remains unexplored. In this work, we systematically evaluate the ability of state-of-the-art (SOTA) machine learning models to differentiate between human- and LLM-generated ideas, particularly after successive paraphrasing stages. Our findings highlight the challenges SOTA models face in source attribution, with detection performance declining by an average of 25.4% after five consecutive paraphrasing stages. Additionally, we demonstrate that incorporating the research problem as contextual information improves detection performance by up to 2.97%. Notably, our analysis reveals that detection algorithms struggle most when ideas are paraphrased into a simplified, non-expert style, which contributes the most to the erosion of distinguishable LLM signatures.
