Augmenting Online Meetings with Context-Aware Real-time Music Generation
Haruki Suzawa, Ko Watanabe, Andreas Dengel, Shoya Ishimaru
TL;DR
This work investigates using GenAI to generate context-aware background music during online meetings to counter cognitive fatigue and boost engagement. It introduces Discussion Jockey 2, a transcript-driven pipeline that uses Whisper for speech-to-text, GPT-4 to craft music prompts, and MusicGen to produce real-time music that loops for continuous playback throughout a meeting session. In a 14-participant online interview study, average self-reported scores were 5.75/9 for relaxation and 5.86/9 for concentration. Reception was generally positive, but the study also highlighted the need for personalization and faster real-time processing. The findings suggest that context-aware musical augmentation can improve perceived ease and focus in virtual meetings, and they point to future enhancements in personalization and environmental adaptation.
Abstract
As online communication continues to expand, participants often face cognitive fatigue and reduced engagement. Cognitive augmentation, which leverages technology to enhance human abilities, offers promising solutions to these challenges. In this study, we investigate the potential of generative artificial intelligence (GenAI) for real-time music generation to enrich online meetings. We introduce Discussion Jockey 2, a system that dynamically produces background music in response to live conversation transcripts. Through a user study involving 14 participants in an online interview setting, we examine the system's impact on relaxation, concentration, and overall user experience. The findings reveal that AI-generated background music significantly enhances user relaxation (average score: 5.75/9) and concentration (average score: 5.86/9). This research underscores the promise of context-aware music generation in improving the quality of online communication and points to future directions for optimizing its implementation across various virtual environments.
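The transcript-to-music flow summarized above (speech-to-text, then a language model that writes a music prompt, then a music generator whose output is looped) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three stages are passed in as plain callables, so `transcribe`, `write_prompt`, and `synthesize` are hypothetical stand-ins for Whisper, GPT-4, and MusicGen, and the instruction wording, the `max_chars` window, and the `MusicClip` type are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MusicClip:
    """A generated clip and the text prompt that produced it (hypothetical type)."""
    prompt: str
    samples: List[float]


def build_llm_instruction(transcript: str, max_chars: int = 500) -> str:
    """Wrap the most recent part of the transcript in an instruction for the language model.

    The wording and the 500-character window are assumptions for illustration.
    """
    recent = transcript[-max_chars:]
    return (
        "Describe, in one sentence, background music that fits "
        "this meeting excerpt:\n" + recent
    )


def generate_clip(
    audio_chunk: bytes,
    transcribe: Callable[[bytes], str],        # stand-in for a speech-to-text model
    write_prompt: Callable[[str], str],        # stand-in for a language model
    synthesize: Callable[[str], List[float]],  # stand-in for a music generator
) -> MusicClip:
    """Run one pass of the pipeline: audio -> transcript -> music prompt -> samples."""
    transcript = transcribe(audio_chunk)
    music_prompt = write_prompt(build_llm_instruction(transcript))
    return MusicClip(prompt=music_prompt, samples=synthesize(music_prompt))


def loop_samples(samples: List[float], target_len: int) -> List[float]:
    """Repeat a clip until it fills target_len samples, truncating the last repeat."""
    if not samples:
        return []
    repeats = -(-target_len // len(samples))  # ceiling division
    return (samples * repeats)[:target_len]
```

Passing the stages as callables keeps the sketch independent of any particular model API. In a real deployment each callable would wrap the corresponding model, and `loop_samples` would bridge the gap until the next clip is ready.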
