Beyond the Individual: Introducing Group Intention Forecasting with SHOT Dataset
Ruixu Zhang, Yuran Wang, Xinyi Hu, Chaoyu Mai, Wenxuan Liu, Danni Xu, Xian Zhong, Zheng Wang
TL;DR
This work addresses forecasting group-level intentions, a step beyond traditional individual-intention recognition. It defines the Group Intention Forecasting (GIF) task and introduces SHOT, a large-scale, multi-view basketball dataset with rich per-player annotations. The authors propose GIFT, a spatio-temporal encoder-decoder framework that models evolving inter-player dynamics to forecast when a group intention occurs, expressed as the frame $f_\tau$ within a clip of length $T$. Experiments support SHOT's usefulness for the task and show that GIFT outperforms traditional temporal action localization baselines in timing accuracy (MAE). They also show that early-stage forecasting remains difficult (lower F1), because few cues are available at the start of a clip. The dataset and baseline provide a foundation for future research in group intention forecasting, with broad implications for sports analytics, safety, and intelligent systems, where timely interventions could be based on emergent collective goals.
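To make the timing metric concrete, the sketch below computes a mean absolute error (MAE) between predicted and ground-truth intention frames $f_\tau$, one pair per clip. It is a minimal illustration only: the function name `frame_mae` and the choice to measure the error in raw frames are assumptions, and the paper's exact evaluation protocol may differ.

```python
from typing import Sequence


def frame_mae(pred_frames: Sequence[int], true_frames: Sequence[int]) -> float:
    """Mean absolute error, in frames, between predicted and true intention frames.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    if len(pred_frames) != len(true_frames):
        raise ValueError("pred_frames and true_frames must have the same length")
    if len(pred_frames) == 0:
        raise ValueError("at least one clip is required")
    # Average the per-clip absolute frame offsets.
    total = sum(abs(p - t) for p, t in zip(pred_frames, true_frames))
    return total / len(pred_frames)


# Three clips with per-clip errors of 2, 0, and 3 frames.
example_mae = frame_mae([10, 20, 30], [12, 20, 27])
```

A lower value means the forecast frame lies closer to the annotated frame on average. Dividing by the clip length $T$ would give a length-normalized variant, which is likewise an assumption rather than something the summary specifies.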
Abstract
Intention recognition has traditionally focused on individual intentions, overlooking the complexities of collective intentions in group settings. To address this limitation, we introduce the concept of group intention, which represents shared goals emerging through the actions of multiple individuals, and Group Intention Forecasting (GIF), a novel task that forecasts when group intentions will occur by analyzing individual actions and interactions before the collective goal becomes apparent. To investigate GIF in a specific scenario, we propose SHOT, the first large-scale dataset for GIF, consisting of 1,979 basketball video clips captured from 5 camera views and annotated with 6 types of individual attributes. SHOT is designed with 3 key characteristics: multi-individual information, multi-view adaptability, and multi-level intention, making it well-suited for studying emerging group intentions. Furthermore, we introduce GIFT (Group Intention ForecasTer), a framework that extracts fine-grained individual features and models evolving group dynamics to forecast intention emergence. Experimental results confirm the effectiveness of SHOT and GIFT, establishing a strong foundation for future research in group intention forecasting. The dataset is available at https://xinyi-hu.github.io/SHOT_DATASET.
