Shallow Synthesis of Knowledge in GPT-Generated Texts: A Case Study in Automatic Related Work Composition
Anna Martin-Boyle, Aahan Tyagi, Marti A. Hearst, Dongyeop Kang
TL;DR
The paper investigates whether GPT-4 can synthesize knowledge when composing Related Work sections for academic papers by comparing citation graphs across three conditions: human-written, human-AI collaborative (written with ScholaCite), and purely GPT-generated. It introduces a two-step ScholaCite workflow that first groups citations and then drafts text, and it evaluates outputs using edge counts, node degrees, graph density, and clustering. Results show that GPT-4 alone struggles with deep synthesis, while human-guided collaboration yields citation networks comparable to those of human writing, which underscores the need for human oversight. The work provides practical guidelines for responsible AI-assisted writing and highlights the limits and ethical considerations of automated scholarly text generation.
Abstract
Numerous AI-assisted scholarly applications have been developed to aid different stages of the research process. We present an analysis of AI-assisted scholarly writing generated with ScholaCite, a tool we built that is designed for organizing literature and composing Related Work sections for academic papers. Our evaluation method focuses on the analysis of citation graphs to assess the structural complexity and inter-connectedness of citations in texts and involves a three-way comparison between (1) original human-written texts, (2) purely GPT-generated texts, and (3) human-AI collaborative texts. We find that GPT-4 can generate reasonable coarse-grained citation groupings to support human users in brainstorming, but fails to perform detailed synthesis of related works without human intervention. We suggest that future writing assistant tools should not be used to draft text independently of the human author.
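To make the citation-graph evaluation concrete, the following is a minimal sketch of how such metrics could be computed. It is not the authors' implementation. It assumes, for illustration only, that nodes are cited works and that an undirected edge links two works cited in the same sentence; the function names `build_citation_graph` and `graph_metrics` and the sample data are hypothetical.

```python
from itertools import combinations


def build_citation_graph(sentences):
    """Build an undirected graph as an adjacency dict.

    `sentences` is a list of lists of citation keys, one list per sentence.
    ASSUMPTION: two works are linked when they are cited in the same sentence.
    """
    adj = {}
    for keys in sentences:
        unique = sorted(set(keys))
        for key in unique:
            adj.setdefault(key, set())
        for a, b in combinations(unique, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    neighbors = adj[node]
    d = len(neighbors)
    if d < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return 2 * links / (d * (d - 1))


def graph_metrics(adj):
    """Return edge count, node degrees, density, and average clustering."""
    n = len(adj)
    # Each undirected edge appears in two adjacency sets.
    edges = sum(len(nb) for nb in adj.values()) // 2
    degrees = {node: len(nb) for node, nb in adj.items()}
    density = 2 * edges / (n * (n - 1)) if n > 1 else 0.0
    avg_clustering = (
        sum(local_clustering(adj, node) for node in adj) / n if n else 0.0
    )
    return {
        "edges": edges,
        "degrees": degrees,
        "density": density,
        "avg_clustering": avg_clustering,
    }


# Hypothetical example: three sentences citing works A through E.
example_sentences = [["A", "B", "C"], ["C", "D"], ["E"]]
example_metrics = graph_metrics(build_citation_graph(example_sentences))
print(example_metrics)
```

Under this assumed construction, a text that cites each work in isolation produces a sparse graph with few edges and low clustering, whereas a text that discusses several works together produces a denser, more interconnected graph. Comparing these values across the three conditions is the kind of analysis the abstract describes.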
