LiteraryTaste: A Preference Dataset for Creative Writing Personalization
John Joon Young Chung, Vishakh Padmakumar, Melissa Roemmele, Yi Wang, Yuqian Sun, Tiffany Wang, Shm Garanganao Almeda, Brett A. Halperin, Yuwen Lu, Max Kreminski
TL;DR
LiteraryTaste is a real-user dataset for creative writing personalization that pairs 60 annotators' stated reading preferences with their revealed preferences over 100 pairs of short texts. The authors systematically evaluate modeling approaches and find that fine-tuning a transformer encoder (ModernBERT-large) achieves the best personal-preference accuracy (75.8%) with 90 training samples and remains competitive with as few as 15 samples, which points to good sample efficiency. Aggregated (group) preferences are harder to predict than individual preferences, though LLM prompting can sometimes outperform certain baselines in zero-shot settings. Stated preferences provide limited, occasionally helpful signal, and integrating them with revealed preferences yields inconsistent gains. The work also offers a detailed qualitative analysis of preference dimensions and a practical guide for eliciting personal preferences in creative-writing tools.
Abstract
People have different creative writing preferences, and large language models (LLMs) for these tasks can benefit from adapting to each user's preferences. However, these models are often trained on datasets that treat varying personal tastes as a monolith. To facilitate developing personalized creative writing LLMs, we introduce LiteraryTaste, a dataset of reading preferences from 60 people, where each person: 1) self-reported their reading habits and tastes (stated preference), and 2) annotated their preferences over 100 pairs of short creative writing texts (revealed preference). With our dataset, we found that: 1) people diverge on creative writing preferences, 2) fine-tuning a transformer encoder could achieve 75.8% and 67.7% accuracy when modeling personal and collective revealed preferences, respectively, and 3) stated preferences had limited utility in modeling revealed preferences. With an LLM-driven interpretability pipeline, we analyzed how people's preferences vary. We hope our work serves as a cornerstone for personalizing creative writing technologies.
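To make the distinction between personal and collective revealed preferences concrete, here is a minimal Python sketch of how accuracy over text pairs might be scored. It rests on assumptions not stated above: each annotation is a label "A" or "B" for the preferred text in a pair, and the collective preference is taken as a simple majority vote across annotators. The function names, the toy data, and the aggregation rule are all hypothetical and may differ from what the dataset and paper actually use.

```python
from collections import Counter


def pairwise_accuracy(predicted, actual):
    """Fraction of text pairs where the predicted choice matches the reference choice."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("need at least one pair")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def majority_label(votes):
    """Collective choice for one pair: the most common vote, or None on a tie.

    Majority voting is an assumption made for illustration only.
    """
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


# Toy data (hypothetical): 3 annotators, 4 text pairs, "A" or "B" preferred.
annotations = {
    "annotator_1": ["A", "B", "A", "A"],
    "annotator_2": ["A", "A", "B", "A"],
    "annotator_3": ["B", "B", "A", "A"],
}

# Collective preference per pair, aggregated across annotators.
collective = [majority_label(votes) for votes in zip(*annotations.values())]

# Hypothetical model output for the same 4 pairs.
model_predictions = ["A", "B", "B", "A"]

personal_acc = pairwise_accuracy(model_predictions, annotations["annotator_1"])
collective_acc = pairwise_accuracy(model_predictions, collective)
```

In this framing, a personalized model is scored against one annotator's own choices, while a collective model is scored against the aggregated label for each pair.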
