L2HCount: Generalizing Crowd Counting from Low to High Crowd Density via Density Simulation

Guoliang Xu; Jianqin Yin; Ren Zhang; Yonghao Dang; Feng Zhou; Bo Yu

TL;DR

This paper addresses the problem of generalizing crowd counting from low-density to high-density scenes. It introduces L2HCount, a framework that synthesizes high-density images from low-density ones with a High-Density Simulation Module and automatically generates the corresponding ground-truth annotations with a Ground-Truth Generation Module (GTGM). A Head Feature Enhancement Module then extracts clear head features from the simulated images, and a Dual-Density Memory Encoding Module learns both density regimes through separate low-density and high-density memories. Validated on four datasets, the method is reported to outperform fully supervised, domain adaptation, and domain generalization baselines on low-to-high-density transfer, without requiring target-domain annotation. L2HCount thus offers a practical route to crowd counting across varying densities, reducing the labeling burden in real-world surveillance scenarios.

Abstract

Since COVID-19, crowd counting has found wide application. Supervised methods are reliable, but annotation is considerably harder in high-density scenes, where heads are small and occlusion is severe, than in low-density scenes. This raises a question: can a model be trained on low-density scenes and still generalize to high-density ones? To answer it, we propose a low- to high-density generalization framework (L2HCount) that learns patterns related to high-density scenes from low-density ones, enabling it to generalize well to high-density scenes. First, we introduce a High-Density Simulation Module and a Ground-Truth Generation Module, which use an image-shifting technique to construct simulated high-density images and their corresponding ground-truth crowd annotations, respectively, effectively simulating high-density crowd patterns. However, the simulated images have two issues: image blurring and loss of low-density image characteristics. Second, we therefore propose a Head Feature Enhancement Module to extract clear features from the simulated high-density scenes. Third, we propose a Dual-Density Memory Encoding Module that uses two crowd memories to learn scene-specific patterns from low-density and simulated high-density scenes, respectively. Extensive experiments on four challenging datasets show the promising performance of L2HCount.
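The abstract does not spell out how the image shifting is carried out, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that a denser image can be approximated by averaging an image with translated copies of itself, and that the annotations for the result are the original head points plus their translated counterparts. The function name `simulate_high_density`, the `shifts` parameter, and the averaging rule are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_high_density(image, points, shifts):
    """Blend an image with shifted copies of itself and merge the annotations.

    Illustrative assumption only; the paper's actual modules may differ.

    image:  array of shape (H, W, C)
    points: array-like of shape (N, 2) holding (x, y) head coordinates
    shifts: list of integer (dx, dy) pixel offsets (hypothetical parameter)
    """
    h, w = image.shape[:2]
    base_points = np.asarray(points, dtype=np.float64).reshape(-1, 2)

    acc = image.astype(np.float64).copy()          # running sum of overlaid images
    weight = np.ones((h, w, 1), dtype=np.float64)  # number of images covering each pixel
    all_points = [base_points]

    for dx, dy in shifts:
        if abs(dx) >= w or abs(dy) >= h:
            raise ValueError("shift must be smaller than the image size")

        # Destination and source windows for a translation by (dx, dy).
        dst_y = slice(max(dy, 0), min(h + dy, h))
        src_y = slice(max(-dy, 0), min(h - dy, h))
        dst_x = slice(max(dx, 0), min(w + dx, w))
        src_x = slice(max(-dx, 0), min(w - dx, w))

        acc[dst_y, dst_x] += image[src_y, src_x]
        weight[dst_y, dst_x] += 1.0

        # Move the head points by the same offset and drop those leaving the frame.
        moved = base_points + np.array([dx, dy], dtype=np.float64)
        inside = (
            (moved[:, 0] >= 0) & (moved[:, 0] < w)
            & (moved[:, 1] >= 0) & (moved[:, 1] < h)
        )
        all_points.append(moved[inside])

    return acc / weight, np.concatenate(all_points, axis=0)


if __name__ == "__main__":
    img = np.ones((4, 6, 3))
    heads = [[1, 1], [4, 2]]
    dense_img, dense_heads = simulate_high_density(img, heads, shifts=[(2, 0)])
    print(dense_img.shape, len(dense_heads))
```

Averaging overlapping copies produces ghosting in this sketch, which is one plausible way to picture the blurring that the abstract attributes to the simulated images.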
