Boosting Adverse Weather Crowd Counting via Multi-queue Contrastive Learning

Tianhang Pan, Xiuyi Jia

TL;DR

This paper tackles the robustness of crowd counting under adverse weather and weather-class imbalance. It introduces MQCL, a two-stage framework comprising Weather-aware Representation Learning (WRL) and Contrastive-learning-guided Representation Refining (CRR). WRL employs a novel Multi-queue MoCo to learn weather-discriminative representations under imbalance, while CRR freezes the encoder and trains a lightweight refiner to map adverse-weather representations to the normal domain, guided by positives from normal-weather queues; the losses $\mathcal{L}_{contra1}$, $\mathcal{L}_{contra2}$, and the Bayesian density supervision drive both stages. Experiments on the JHU-Crowd++ and GCC datasets show substantial improvements in adverse-weather MAE/RMSE (around 22% reductions) with modest computational overhead, achieving state-of-the-art results. The approach is plug-and-play with multiple backbones and offers a practical solution to imbalanced multi-domain crowd counting, with potential applicability to other vision problems facing domain gaps and class imbalance.

Abstract

Most current crowd counting methods perform well under normal weather conditions. However, our experimental validation reveals two key obstacles that limit further accuracy gains: 1) the domain gap between adverse-weather and normal-weather images; 2) the weather class imbalance in the training set. To address these problems, we propose a two-stage crowd counting method named Multi-queue Contrastive Learning (MQCL). In the first stage, our goal is to make the backbone network weather-aware. To this end, we employ multi-queue MoCo, a contrastive learning method we designed to enable representation learning under weather class imbalance. Once the first stage is complete, the backbone is "mature" enough to extract weather-related representations. On this basis, the second stage refines the representations under the guidance of contrastive learning, converting the weather-aware representations to the normal weather domain. Through such representation and conversion, the model achieves robust counting performance under both normal and adverse weather conditions. Extensive experiments show that, compared to the baseline, MQCL reduces the counting error under adverse weather conditions by 22% while increasing the computational burden by only about 13%, achieving state-of-the-art performance.
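To make the multi-queue idea concrete, the sketch below contrasts a single FIFO key queue with one fixed-length sub-queue per weather class when training samples arrive in an imbalanced stream. It is a minimal illustration, not the authors' implementation: the `MultiQueue` class, the `simulate` function, the four-label set, the 85/5/5/5 class split, the sub-queue length of 64, and the random 8-dimensional keys standing in for encoder outputs are all assumptions made for this example.

```python
from collections import Counter, deque
import random

# Assumed label set for illustration; the datasets may use other labels.
WEATHER_CLASSES = ("normal", "haze", "rain", "snow")


class MultiQueue:
    """One fixed-length FIFO sub-queue per weather class (hypothetical helper)."""

    def __init__(self, classes, sub_queue_len):
        self.queues = {c: deque(maxlen=sub_queue_len) for c in classes}

    def enqueue(self, key, weather):
        # A full deque with maxlen drops its oldest item, so a new key only
        # evicts an old key of the same weather class.
        self.queues[weather].append(key)

    def class_counts(self):
        return {c: len(q) for c, q in self.queues.items()}


def simulate(num_samples=5000, sub_queue_len=64, seed=0):
    """Feed an imbalanced stream into a multi-queue and a single queue."""
    rng = random.Random(seed)
    weights = (0.85, 0.05, 0.05, 0.05)  # assumed class imbalance
    multi = MultiQueue(WEATHER_CLASSES, sub_queue_len)
    # The single queue has the same total capacity as all sub-queues combined.
    # It stores only labels here, because we just count its class makeup.
    single = deque(maxlen=sub_queue_len * len(WEATHER_CLASSES))
    for _ in range(num_samples):
        weather = rng.choices(WEATHER_CLASSES, weights=weights)[0]
        key = [rng.gauss(0.0, 1.0) for _ in range(8)]  # stand-in for a key vector
        multi.enqueue(key, weather)
        single.append(weather)
    return multi.class_counts(), dict(Counter(single))


if __name__ == "__main__":
    multi_counts, single_counts = simulate()
    print("multi-queue :", multi_counts)
    print("single queue:", single_counts)
```

Under these assumptions, the single queue mirrors the skew of the input stream, while every sub-queue fills to the same length, so each weather class contributes an equal number of stored keys regardless of how rare it is in the training set.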

Paper Structure

This paper contains 15 sections, 4 equations, 5 figures, and 7 tables.

Figures (5)

  • Figure 1: (a) and (b) show pie charts of the weather-condition distribution in the JHU-Crowd++ dataset and the GCC dataset, respectively. (c) and (d) present the counting accuracy of different models on normal-weather and adverse-weather images, respectively, under varying degrees of weather-class imbalance in the training data. The horizontal axis gives the number of normal/hazy/rainy/snowy weather images in the training set. The vertical axis gives the Mean Absolute Error (MAE) of the models on the JHU-Crowd++ test set. All training data are sampled from the JHU-Crowd++ training set.
  • Figure 2: The architecture of MQCL. The goal of the WRL stage is to learn weather-aware representations via unsupervised contrastive learning. After the WRL stage, a refiner is trained in the CRR stage to pull the adverse-weather representations toward the normal-weather domain.
  • Figure 3: The architecture of multi-queue MoCo. The projection heads project the representations to 1-D vectors. Within the multi-queue structure, all sub-queues have the same length, and each corresponds to one weather class.
  • Figure 4: Visualizations of our baseline ConvNeXt-T and our MQCL. The first row shows the input images. The second and third rows show the density maps predicted by ConvNeXt-T and our method, respectively. The images are sampled from the JHU-Crowd++ dataset and cover normal, haze, rain, and snow conditions.
  • Figure 5: t-SNE (van der Maaten and Hinton, 2008) visualization of the vectors Q after the WRL stage on the JHU-Crowd++ dataset, using a memory bank (a), single-queue MoCo (b), and multi-queue MoCo (c).