PaperWave: Listening to Research Papers as Conversational Podcasts Scripted by LLM
Yuchi Yahagi, Rintaro Chujo, Yuga Harada, Changyo Han, Kohei Sugiyama, Takeshi Naemura
TL;DR
PaperWave investigates transforming research papers into conversational podcasts with LLMs, enabling mobile, listening-based engagement with niche scholarly content. In a two-month investigation with 11 participants (including the authors) that combined autobiographical design, a field study, and a design workshop, the study finds that podcast-style papers can lower barriers to engagement and shift attention to different aspects of papers, while raising concerns about accuracy and missing visuals. The authors prototype three iterations (a manual ChatGPT workflow, CLI automation, and a web app) that deliver PDF-to-audio workflows with configurable language, duration, and playback. The findings emphasize the need to consider how listeners interact with their environment, and how audiences vary, when designing document-to-audio systems. They also point to the potential for broader topic exploration through sharing and mobile listening. Limitations include limited generalizability, potential biases, and a lack of personalization, which point to future research that integrates reading support and multimodal augmentation.
Abstract
Listening to audio content, such as podcasts and audiobooks, is one way for people to engage with knowledge. Listening affords more mobility than visual reading, thereby broadening learning opportunities. This study explores the potential of large language models (LLMs) to adapt text documents into audio content, addressing the lack of listening-friendly materials for niche content such as research papers. LLMs can generate scripts for audio content in various styles tailored to specific needs, such as the duration of the full content or the speech type (monologue or dialogue). To explore this potential, we developed PaperWave, a prototype that transforms academic paper PDFs into conversational podcasts. Our two-month investigation, involving 11 participants (including the authors), employed an autobiographical design, a field study, and a design workshop. The findings highlight the importance of considering how listeners interact with their environment when designing document-to-audio systems.
