Sentiment Matters: An Analysis of 200 Human-SAV Interactions
Lirui Guo, Michael G. Burke, Wynita M. Griggs
TL;DR
This work addresses the need to understand how conversational sentiment and prompting strategies in human-SAV interactions influence user acceptance and service perception. It introduces an open-source dataset of 2,136 SAV exchanges and 200 post-interaction surveys, collected from 50 participants who interacted with four GPT-3.5-driven SAV agents under varied prompts. The authors demonstrate two benchmarks. Case Study 1 uses a framework that combines predictive modeling with chord diagrams to identify item-level drivers of SAV acceptance, revealing that sentiment polarity becomes a key predictor under certain prompts. Case Study 2 compares an LLM-based sentiment analyzer with TextBlob, finding that the LLM approach aligns modestly but more closely with self-reported sentiment, and highlighting limitations that stem from the text-only signal and contextual factors. The dataset and findings offer actionable guidance for sentiment-aware, adaptive SAV interfaces and establish a foundation for future multimodal sentiment modeling in autonomous vehicle interactions. The work emphasizes practical implications for real-time sentiment monitoring and prompt design, while acknowledging the need for broader participant samples and richer cues to improve predictive accuracy.
Abstract
Shared Autonomous Vehicles (SAVs) are likely to become an important part of the transportation system, making effective human-SAV interaction a key area of research. To further this research, this paper introduces an open-source conversational dataset of 200 human-SAV interactions, comprising both textual data (e.g., 2,136 human-SAV exchanges) and empirical data (e.g., post-interaction survey results on a range of psychological factors). The dataset's utility is demonstrated through two benchmark case studies. First, using random forest modeling and chord diagrams, we identify key predictors of SAV acceptance and perceived service quality, highlighting the critical influence of response sentiment polarity (i.e., perceived positivity). Second, we benchmark the performance of an LLM-based sentiment analysis tool against the traditional lexicon-based TextBlob method. Results indicate that even simple zero-shot LLM prompts produce sentiment estimates that align more closely with user-reported sentiment than TextBlob does, though limitations remain. This study provides novel insights for designing conversational SAV interfaces and establishes a foundation for further exploration into advanced sentiment modeling, adaptive user interactions, and multimodal conversational systems.
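One way to quantify how closely a tool's sentiment scores track self-reported sentiment is a rank correlation. The following is a minimal Python sketch, assuming Spearman's rho as the alignment measure (this section does not state which metric the study uses). The function names, the two "tools", and all numeric values are hypothetical illustrations and are not drawn from the dataset.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up values for illustration only (assumption, not study data).
self_reported = [5, 4, 2, 3, 1]                # hypothetical survey ratings
tool_a_scores = [0.8, 0.5, -0.2, 0.1, -0.6]    # hypothetical polarity scores
tool_b_scores = [0.3, 0.6, 0.1, -0.1, 0.0]     # hypothetical polarity scores

rho_a = spearman(tool_a_scores, self_reported)
rho_b = spearman(tool_b_scores, self_reported)
print(f"tool A rho = {rho_a:.2f}, tool B rho = {rho_b:.2f}")
```

A rank-based measure is used here because survey ratings are ordinal while polarity scores are continuous; other choices (for example, agreement on discretized sentiment classes) would be equally reasonable under different assumptions.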
