VirtuWander: Enhancing Multi-modal Interaction for Virtual Tour Guidance through Large Language Models
Zhan Wang, Lin-Ping Yuan, Liangwei Wang, Bingchuan Jiang, Wei Zeng
TL;DR
The paper tackles the challenge of delivering flexible, personalized tour guidance in virtual museums by leveraging multi-modal interactions powered by large language models. It introduces VirtuWander, a two-stage, pack-of-bots system that converts natural language queries into context-aware guidance across voice, avatar, text, and visualization modalities. Through a formative study, a design framework, five feedback designs, three example cases, and a VR user study, the work demonstrates that LLM-enabled multi-modal feedback can enhance engagement, knowledge delivery, and personalization, while highlighting timing and cognitive-load considerations. The results indicate strong potential for real-world deployment and AR extensions, with implications for broader domains such as airports or hospitals and for evolving VR/AR tour guidance design.
Abstract
Tour guidance in virtual museums encourages multi-modal interactions that improve user experience in terms of engagement, immersion, and spatial awareness. Achieving this goal is challenging, however, because the system must comprehend diverse user needs and accommodate personalized preferences. Informed by a formative study that characterizes guidance-seeking contexts, we establish a multi-modal interaction design framework for virtual tour guidance. We then design VirtuWander, an innovative two-stage system that uses domain-oriented large language models to map user inquiries to guidance-seeking contexts and to facilitate multi-modal interactions. We demonstrate the feasibility and versatility of VirtuWander with virtual guiding examples that cover various touring scenarios and cater to personalized preferences. We further evaluate VirtuWander through a user study in an immersive simulated museum. The results suggest that our system makes virtual tour experiences more engaging through personalized communication and knowledgeable assistance, indicating its potential for extension to real-world scenarios.
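The two-stage flow described above (first map an inquiry to a guidance-seeking context, then choose multi-modal feedback for that context) can be illustrated with a minimal sketch. This is not the authors' implementation: the context names, the keyword table, the modality assignments, and the functions `classify_context`, `plan_feedback`, and `guide` are all hypothetical placeholders, and simple keyword matching stands in for the LLM calls the system would actually make.

```python
from dataclasses import dataclass, field

# The four feedback modalities named in the summary.
MODALITIES = ("voice", "avatar", "text", "visualization")

# ASSUMPTION: illustrative context labels and trigger phrases only.
# In the real system, stage 1 is performed by a large language model.
CONTEXT_KEYWORDS = {
    "navigation": ("where", "how do i get", "take me"),
    "information": ("who", "what", "tell me"),
    "recommendation": ("recommend", "suggest"),
}

# ASSUMPTION: illustrative mapping from context to modalities.
MODALITIES_BY_CONTEXT = {
    "navigation": ("voice", "avatar", "visualization"),
    "information": ("voice", "text"),
    "recommendation": ("voice", "text", "visualization"),
}


@dataclass
class Guidance:
    """Result of one guidance request: the context and per-modality output."""
    context: str
    feedback: dict = field(default_factory=dict)


def classify_context(query: str) -> str:
    """Stage 1 stand-in: map a natural-language query to a context label."""
    lowered = query.lower()
    for context, keywords in CONTEXT_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return context
    return "general"


def plan_feedback(context: str, query: str) -> Guidance:
    """Stage 2 stand-in: pick modalities for the context and fill each one."""
    chosen = MODALITIES_BY_CONTEXT.get(context, ("voice",))
    feedback = {m: f"[{m}] response to: {query}" for m in chosen}
    return Guidance(context=context, feedback=feedback)


def guide(query: str) -> Guidance:
    """Run both stages in sequence for a single user query."""
    return plan_feedback(classify_context(query), query)


if __name__ == "__main__":
    result = guide("Where is the sculpture hall?")
    print(result.context)
    for modality, message in result.feedback.items():
        print(modality, "->", message)
```

Separating context classification from feedback planning keeps each stage replaceable: either stand-in could be swapped for a model call without changing the other.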
