Continuous Locomotive Crowd Behavior Generation

Inhwan Bae, Junoh Lee, Hae-Gon Jeon

TL;DR

This work tackles the challenge of generating continuous, realistic crowd trajectories from a single scene image. It introduces CrowdES, a two-module framework: a diffusion-based crowd emitter first predicts spatial layouts and samples agent attributes along a timeline, and a crowd simulator then constructs long-horizon, environment-aware trajectories using a navigation mesh and switching dynamical systems (SDS) governed by a Markov chain. Key contributions include the diffusion-driven emitter with explicit scene conditioning, the SDS-based simulator that enables diverse intermediate interactions, and a new benchmark protocol assessing both scene-level realism and agent-level trajectory fidelity across long sequences. The approach generalizes to unseen environments, exposes user-controllable parameters for population density and walking behavior, and produces lifelike crowd animations with practical runtime, making it relevant to crowd generation in animation, robotics, and simulation.
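To make the idea of behavior switching governed by a Markov chain more concrete, here is a minimal sketch. The mode names (`walk`, `stop`, `wander`), the transition probabilities, and the per-mode speed scales are hypothetical values chosen for illustration and are not taken from the paper; the actual simulator learns its dynamics and also uses a navigation mesh, which this sketch omits.

```python
import random

# Hypothetical behavior modes and transition probabilities (illustrative only;
# not values from the paper).
MODES = ["walk", "stop", "wander"]
TRANSITIONS = {
    "walk":   {"walk": 0.90, "stop": 0.05, "wander": 0.05},
    "stop":   {"walk": 0.30, "stop": 0.60, "wander": 0.10},
    "wander": {"walk": 0.20, "stop": 0.10, "wander": 0.70},
}


def sample_mode_sequence(start, steps, rng):
    """Sample a sequence of behavior modes from a first-order Markov chain."""
    sequence = [start]
    for _ in range(steps - 1):
        row = TRANSITIONS[sequence[-1]]
        # The next mode depends only on the current mode.
        next_mode = rng.choices(list(row), weights=list(row.values()), k=1)[0]
        sequence.append(next_mode)
    return sequence


def simulate_speed(modes, pace):
    """Map each mode to a toy per-step speed, standing in for per-mode dynamics."""
    speed_scale = {"walk": 1.0, "stop": 0.0, "wander": 0.4}  # assumed scales
    return [pace * speed_scale[m] for m in modes]


if __name__ == "__main__":
    rng = random.Random(0)
    modes = sample_mode_sequence("walk", 10, rng)
    print(list(zip(modes, simulate_speed(modes, pace=1.3))))
```

Each mode selects a different motion rule, so a single agent can walk, pause, and wander within one long trajectory instead of following one fixed behavior.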

Abstract

Modeling and reproducing crowd behaviors is important in various domains including psychology, robotics, transport engineering and virtual environments. Conventional methods have focused on synthesizing momentary scenes and therefore have difficulty replicating the continuous nature of real-world crowds. In this paper, we introduce a novel method for automatically generating continuous, realistic crowd trajectories with heterogeneous behaviors and interactions among individuals. We first design a crowd emitter model. To do this, we obtain spatial layouts from single input images, including a segmentation map, appearance map, population density map and population probability, prior to crowd generation. The emitter then continually places individuals on the timeline by assigning independent behavior characteristics such as agents' type, pace, and start/end positions using diffusion models. Next, our crowd simulator produces their long-term locomotion. To simulate diverse actions, it can augment their behaviors based on a Markov chain. As a result, our overall framework populates the scenes with heterogeneous crowd behaviors by alternating between the proposed emitter and simulator. Note that all the components in the proposed framework are user-controllable. Lastly, we propose a benchmark protocol to evaluate the realism and quality of the generated crowds in terms of the scene-level population dynamics and the individual-level trajectory accuracy. We demonstrate that our approach effectively models diverse crowd behavior patterns and generalizes well across different geographical environments. Code is publicly available at https://github.com/InhwanBae/CrowdES .
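The alternation between emitter and simulator described in the abstract can be pictured with a toy loop. This is a sketch under strong assumptions, not the released CrowdES code: `emit_agents` samples attributes uniformly at random where the paper uses diffusion models conditioned on scene layouts, `step_agent` moves agents in a straight line where the paper uses a learned simulator, and all names and parameters (`Agent`, `generate_crowd`, `emit_every`, `max_new`, `size`) are hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Agent:
    start: tuple
    goal: tuple
    pace: float      # distance covered per time step
    born_at: int     # time step at which the agent was emitted
    path: list = field(default_factory=list)


def emit_agents(t, rng, max_new=2, size=10.0):
    """Stand-in emitter: uniformly samples start, goal, and pace (assumption)."""
    agents = []
    for _ in range(rng.randint(0, max_new)):
        start = (rng.uniform(0, size), rng.uniform(0, size))
        goal = (rng.uniform(0, size), rng.uniform(0, size))
        agents.append(Agent(start, goal, rng.uniform(0.5, 1.5), t, [start]))
    return agents


def step_agent(agent):
    """Stand-in simulator: one straight-line step; returns True on arrival."""
    x, y = agent.path[-1]
    gx, gy = agent.goal
    dist = ((gx - x) ** 2 + (gy - y) ** 2) ** 0.5
    if dist <= agent.pace:
        agent.path.append(agent.goal)
        return True
    scale = agent.pace / dist
    agent.path.append((x + scale * (gx - x), y + scale * (gy - y)))
    return False


def generate_crowd(total_steps, emit_every, seed=0):
    """Alternate between emitting new agents and advancing the active ones."""
    rng = random.Random(seed)
    active, finished = [], []
    for t in range(total_steps):
        if t % emit_every == 0:
            active.extend(emit_agents(t, rng))
        still_active = []
        for agent in active:
            (finished if step_agent(agent) else still_active).append(agent)
        active = still_active
    return active, finished


if __name__ == "__main__":
    active, finished = generate_crowd(total_steps=60, emit_every=5, seed=1)
    print(len(active), "agents still walking,", len(finished), "arrived")
```

Because agents enter at different times and leave when they reach their goals, the population of the scene changes continuously, which is the property the scene-level part of the benchmark is meant to assess.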

Paper Structure

This paper contains 20 sections, 14 equations, 7 figures, 5 tables.

Figures (7)

  • Figure 1: Generating realistic, continuous crowd behaviors with learned dynamics. Given a scene image, CrowdES iteratively populates the environment and synthesizes diverse locomotion patterns to create lifelike crowd scenarios. CrowdES also allows users to control parameters to achieve tailored and flexible outcomes.
  • Figure 2: An overview of our CrowdES framework. Starting with the input scene image $\mathcal{I}$, CrowdES continuously generates realistic crowd behaviors $\mathcal{V}$ by alternating between the crowd emitter and crowd simulator processes.
  • Figure 3: An overview of our recurrent crowd behavior generation approach using the crowd emitter and simulator. (Marker: Emerging crowds at specific times, Bar: Extensible crowd trajectories.)
  • Figure 4: Visualization of the time-varying behavior changes (blue man). Our CrowdES can autonomously generate realistic, long-term behavioral sequences for each agent in a scene.
  • Figure 5: Visualization of the generated behaviors compared to the real-world behaviors in the ZARA1 scene.
  • ...and 2 more figures