Sportify: Question Answering with Embedded Visualizations and Personified Narratives for Sports Video
Chunggi Lee, Tica Lin, Hanspeter Pfister, Chen Zhu-Tian
TL;DR
Sportify addresses fan comprehension of basketball tactics by integrating embedded visualizations with LLM-generated narratives in a Visual Question Answering framework. The system detects tactics and actions from video data, renders three action visualizations (Pass, Cut, Screen), and presents narratives from first- or third-person perspectives using a retrieval-augmented generation pipeline with ReAct prompting. Two user studies show that embedded visuals and narratives improve understanding and engagement compared with text alone or YouTube tactic videos, with first-person narration boosting enjoyment and third-person narration providing a sense of control. The work advances grounded, interactive tactic explanations for sports videos and highlights opportunities for future multi-modal LLM integration and coverage of defensive tactics.
Abstract
As basketball's popularity surges, fans often find themselves confused and overwhelmed by the game's rapid pace and complexity. Basketball tactics involve a complex series of actions and require substantial knowledge to be fully understood. Fans therefore need additional information and explanation, and seeking it out can distract them from the game. To tackle these challenges, we present Sportify, a Visual Question Answering system that integrates narratives and embedded visualizations to demystify basketball tactical questions and help fans understand various aspects of the game. We propose three novel action visualizations (i.e., Pass, Cut, and Screen) to demonstrate critical action sequences. To explain the reasoning and logic behind players' actions, we leverage a large language model (LLM) to generate narratives. We adopt a storytelling approach for complex scenarios from both first- and third-person perspectives, integrating action visualizations. We evaluated Sportify with basketball fans to investigate its impact on the understanding of tactics and how different narrative perspectives affect the understanding of complex tactics with action visualizations. The evaluation demonstrates Sportify's capability to deepen tactical insights and amplify the viewing experience. Furthermore, third-person narration helps people obtain in-depth game explanations, while first-person narration enhances fans' game engagement.
