Learning Extremely High Density Crowds as Active Matters
Feixiang He, Jiangbei Yue, Jialin Zhu, Armin Seyfried, Dan Casas, Julien Pettré, He Wang
TL;DR
This work tackles the challenging problem of analyzing and forecasting extremely high-density crowds from in-the-wild video data. It introduces a novel crowd-material framework that treats crowds as active matter, described by a continuum with learnable stress and stochastic active forces, and solved with a differentiable neural stochastic differential equation system. The core contribution is CrowdMPM, a hybrid Eulerian–Lagrangian method that uses per-particle parameters and a CVAE-guided stochastic forcing to capture complex density-dependent dynamics, enabling both analysis and simulation with strong interpretability. The approach demonstrates superior prediction accuracy across several real-world high-density datasets, offers continuous-time predictions without fixed timesteps, and provides a controllable simulator to study “what-if” scenarios, such as modifying exits or obstacles. Together, the framework advances high-density crowd modeling by unifying learnable material properties with active-matter dynamics in a continuous-time, physics-informed setting, with practical implications for safety and crowd management.
Abstract
Video-based high-density crowd analysis and prediction has been a long-standing topic in computer vision. It is notoriously difficult for reasons that include the lack of high-quality data and the complexity of crowd dynamics. Consequently, it has been relatively understudied. In this paper, we propose a new approach that aims to learn from in-the-wild videos, which are often of low quality, making it difficult to track individuals or count heads. The key novelty is a new physics prior for modeling crowd dynamics. We model high-density crowds as active matter, a continuum with active particles subject to stochastic forces, which we name 'crowd material'. Our physics model is combined with neural networks, resulting in a neural stochastic differential equation system that can mimic complex crowd dynamics. Due to the lack of similar research, we adapt a range of existing methods that are close to ours for comparison. Through exhaustive evaluation, we show that our model outperforms existing methods in analyzing and forecasting extremely high-density crowds. Furthermore, since our model is a continuous-time physics model, it can be used for simulation and analysis, providing strong interpretability. This is categorically different from most deep learning methods, which are discrete-time, black-box models.
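To make the phrase "continuous-time stochastic differential equation" concrete, the following is a minimal sketch of generic Euler-Maruyama integration over a non-uniform time grid. It is not the authors' implementation: `euler_maruyama`, `toy_drift`, the constant noise scale `sigma`, and the particle layout are all hypothetical names and assumptions chosen for illustration, and the drift is a toy stand-in for a learned, physics-based term.

```python
import numpy as np

def euler_maruyama(x0, drift, sigma, times, rng):
    """Integrate dx = drift(x, t) dt + sigma dW over an arbitrary time grid.

    x0    : array of initial particle states, e.g. shape (num_particles, 2)
    drift : callable (x, t) -> array with the same shape as x
    sigma : scalar noise scale (assumed constant here for simplicity)
    times : 1-D increasing array of time stamps; spacing may be irregular
    """
    x = np.asarray(x0, dtype=float).copy()
    path = [x.copy()]
    for t0, t1 in zip(times[:-1], times[1:]):
        dt = t1 - t0
        # Brownian increment: zero mean, standard deviation sqrt(dt)
        dw = rng.normal(0.0, np.sqrt(dt), size=x.shape)
        x = x + drift(x, t0) * dt + sigma * dw
        path.append(x.copy())
    return np.stack(path)

def toy_drift(x, t):
    # Hypothetical stand-in for a learned drift: relax toward the origin
    return -x

rng = np.random.default_rng(0)
x0 = np.array([[1.0, 0.0], [0.0, 2.0]])
# Irregular time stamps: the integrator does not assume a fixed timestep
times = np.array([0.0, 0.1, 0.15, 0.4, 1.0])
path = euler_maruyama(x0, toy_drift, sigma=0.1, times=times, rng=rng)
```

Because the step size is read from the time stamps rather than fixed in advance, the same function can be queried at any set of times, which is the practical sense in which a continuous-time model differs from a fixed-step discrete-time predictor.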
