Impact of Voice Fidelity on Decision Making: A Potential Dark Pattern?
Mateusz Dubiel, Anastasia Sergeeva, Luis A. Leiva
TL;DR
The paper investigates whether the fidelity and prosody of synthetic speech can subtly bias decision making, framing this as a potential dark pattern in voice interfaces. It uses a two-stage design: a voice-perception listening test (N = 50) followed by an online decision-making task (N = 101), comparing standard (concatenative) TTS voices with neural voices. Findings show that high-fidelity neural voices are rated more favorably and bias choices toward the options they present, while participants underestimate the influence of voice on their decisions. The authors discuss ethical guidelines, user customization, and domain-tailored interaction as ways to mitigate manipulation, and highlight regulatory and practical implications for the design of voice-based agents.
Abstract
Manipulative design in user interfaces (conceptualized as dark patterns) has emerged as a significant impediment to the ethical design of technology and a threat to user agency and freedom of choice. While previous research has focused on exploring these patterns in the context of graphical user interfaces, the impact of speech has largely been overlooked. We conducted a listening test (N = 50) to elicit participants' preferences regarding different synthetic voices that varied in terms of synthesis method (concatenative vs. neural) and prosodic qualities (speech pace and pitch variance), and then evaluated their impact in an online decision-making study (N = 101). Our results indicate a significant effect of voice qualities on participants' choices, independently of the content of the available options. Our results also indicate that the voice's perceived engagement, ease of understanding, and domain fit directly translate to its impact on participants' behaviour in decision-making tasks.
