Generation, Evaluation, and Explanation of Novelists' Styles with Single-Token Prompts
Mosab Rezaei, Mina Rajaei Moghadam, Abdul Rahman Shaikh, Hamed Alhoori, Reva Freedman
TL;DR
This work addresses style-conditioned text generation when no parallel (paired) data exist, using single-token prompts to imitate five 19th-century authors. It couples GPT-Neo fine-tuning, both full fine-tuning (FFT) and LoRA, with a DeBERTa V3-Large classifier for evaluation, and uses Attention Enrichment and Integrated Gradients to explain the generation process. The results show that FFT yields higher stylistic fidelity than LoRA, that classifier-based evaluation aligns well with the intended styles, and that robust token-level cues can be identified for each author. The approach offers a scalable, interpretable toolkit for digital libraries and literary analysis, supporting both preservation and creative exploration of authorial styles.
Abstract
Recent advances in large language models have created new opportunities for stylometry, the study of writing styles and authorship. Two challenges, however, remain central: training generative models when no paired data exist, and evaluating stylistic text without relying only on human judgment. In this work, we present a framework for both generating and evaluating sentences in the style of 19th-century novelists. Large language models are fine-tuned with minimal, single-token prompts to produce text in the voices of authors such as Dickens, Austen, Twain, Alcott, and Melville. To assess these generative models, we employ a transformer-based detector trained on authentic sentences, using it both as a classifier and as a tool for stylistic explanation. We complement this with syntactic comparisons and explainable AI methods, including attention-based and gradient-based analyses, to identify the linguistic cues that drive stylistic imitation. Our findings show that the generated text reflects the authors' distinctive patterns and that AI-based evaluation offers a reliable alternative to human assessment. All artifacts of this work are published online.
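As a rough illustration of the two ideas in the abstract, the sketch below shows how a single-token prompt might be attached to each training sentence, and how a classifier's predictions on generated text could be turned into a per-author fidelity score. It is a minimal sketch under stated assumptions, not the authors' implementation: the control-token strings, the end-of-sequence marker, and the function names `build_training_example` and `style_fidelity` are all hypothetical, and the predicted labels are assumed to come from a separately trained detector.

```python
from collections import Counter

# Hypothetical control tokens, one per author. The actual token strings
# used in the paper are not stated in this section.
AUTHOR_TOKENS = {
    "Dickens": "<|dickens|>",
    "Austen": "<|austen|>",
    "Twain": "<|twain|>",
    "Alcott": "<|alcott|>",
    "Melville": "<|melville|>",
}


def build_training_example(author, sentence, eos="<|endoftext|>"):
    """Prefix a sentence with its author's single control token.

    The end-of-sequence marker is an assumption made for illustration.
    """
    return f"{AUTHOR_TOKENS[author]} {sentence.strip()} {eos}"


def style_fidelity(intended, predicted):
    """Per-author share of generated sentences that the classifier
    assigns to the author the prompt asked for."""
    if len(intended) != len(predicted):
        raise ValueError("intended and predicted must have the same length")
    totals = Counter(intended)
    hits = Counter(a for a, p in zip(intended, predicted) if a == p)
    return {author: hits[author] / totals[author] for author in totals}


if __name__ == "__main__":
    print(build_training_example("Austen", "It is a truth universally acknowledged."))
    # Predicted labels would come from the style classifier; these are made up.
    print(style_fidelity(["Dickens", "Dickens", "Twain"],
                         ["Dickens", "Austen", "Twain"]))
```

At generation time, under the same assumptions, the prompt would consist of the control token alone, and the fine-tuned model would continue from it.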
