On the Theory of Conditional Feature Alignment for Unsupervised Domain-Adaptive Counting

Zhuonan Liang, Dongnan Liu, Jianan Fan, Yaxuan Song, Qiang Qu, Runnan Chen, Yu Yao, Peng Fu, Weidong Cai

TL;DR

This work tackles cross-domain object counting where density variation is task-relevant and can invalidate standard domain-adaptation assumptions. It introduces a conditional divergence framework that partitions samples into condition-defined subsets (e.g., foreground vs background) and proves a joint-error bound that favors conditional alignment over unconditional matching. The approach partitions images with pseudo-labels, learns partition-specific features with a shared regressor, applies per-partition adversarial alignment, and employs a Condition-consistent Mechanism to refine pseudo-labels via a consistency loss. Empirical results on crowd and cell counting benchmarks show consistent improvements over state-of-the-art unsupervised DA methods, validating both the theory and the practical effectiveness of preserving task-relevant density variations during adaptation.
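The partitioning step can be illustrated with a minimal NumPy sketch. It assumes that the condition is foreground versus background and that a location counts as foreground when a pseudo density map exceeds a threshold; the function name `partition_features`, the threshold rule, and the toy arrays are hypothetical and are not taken from the paper.

```python
import numpy as np

def partition_features(features, pseudo_density, threshold=0.0):
    """Split per-location features into foreground and background sets.

    features:       array of shape (C, H, W), one C-dim feature per location.
    pseudo_density: array of shape (H, W), used here as a pseudo-label.
    threshold:      hypothetical cut-off; locations above it are foreground.
    Returns (foreground, background) with shapes (N_fg, C) and (N_bg, C).
    """
    mask = (pseudo_density > threshold).reshape(-1)         # (H*W,)
    flat = features.reshape(features.shape[0], -1).T        # (H*W, C)
    return flat[mask], flat[~mask]

# Toy input: 2 channels on a 2x3 grid, with two foreground locations.
features = np.arange(2 * 2 * 3, dtype=float).reshape(2, 2, 3)
pseudo_density = np.array([[0.0, 0.5, 0.0],
                           [0.2, 0.0, 0.0]])
fg, bg = partition_features(features, pseudo_density)
```

Each returned set could then be passed to its own alignment objective, so that foreground features are only ever compared with foreground features from the other domain.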

Abstract

Object counting models degrade when deployed across domains whose density distributions differ, because density shifts are inherently task-relevant and violate standard domain adaptation assumptions. To address this, we propose a theoretical framework for conditional feature alignment and provide a straightforward implementation. Our theoretical analysis indicates that the framework can achieve better cross-domain generalization for counting. In the proposed network, density-related features are explicitly preserved across domains. On the theoretical side, we formalize conditional divergence by partitioning each domain into subsets and measuring the divergence per condition. We then derive a joint error bound showing that, when discrete label spaces are treated as condition sets, aligning distributions conditionally yields a tighter bound on the combined source-target decision error than unconditional alignment. Empirically, extensive experiments on multiple counting datasets with varying density distributions show that our method outperforms existing unsupervised domain adaptation methods, supporting the theoretical insights on conditional feature alignment.
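The contrast between unconditional and per-condition divergence can be seen in a small synthetic example. The sketch below is an illustration under stated assumptions, not the paper's divergence: it uses the squared distance between feature means (a linear-kernel MMD) as a stand-in, and the helper names and the two-cluster toy data are hypothetical. The two domains share identical foreground and background feature distributions and differ only in how often foreground occurs, which mimics a pure density shift.

```python
import numpy as np

def mean_discrepancy(a, b):
    """Squared distance between feature means; a stand-in divergence."""
    return float(np.sum((a.mean(axis=0) - b.mean(axis=0)) ** 2))

def conditional_discrepancy(src, src_cond, tgt, tgt_cond):
    """Average the stand-in divergence over conditions present in both domains."""
    shared = np.intersect1d(np.unique(src_cond), np.unique(tgt_cond))
    per_condition = [
        mean_discrepancy(src[src_cond == c], tgt[tgt_cond == c]) for c in shared
    ]
    return float(np.mean(per_condition))

def make_domain(rng, n, fg_ratio):
    """Toy domain: condition 0 (background) and 1 (foreground) form two clusters."""
    cond = (rng.random(n) < fg_ratio).astype(int)
    centers = np.array([[0.0, 0.0], [3.0, 3.0]])
    feats = centers[cond] + 0.1 * rng.standard_normal((n, 2))
    return feats, cond

rng = np.random.default_rng(0)
src, src_cond = make_domain(rng, 2000, fg_ratio=0.2)  # sparse domain
tgt, tgt_cond = make_domain(rng, 2000, fg_ratio=0.8)  # dense domain

unconditional = mean_discrepancy(src, tgt)
conditional = conditional_discrepancy(src, src_cond, tgt, tgt_cond)
```

In this toy setting the unconditional value is large purely because of the different foreground ratios, while the per-condition value is close to zero. Minimizing the unconditional quantity would therefore push the model to erase the density difference, which is the task-relevant signal a counter needs.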

Paper Structure

This paper contains 25 sections, 6 theorems, 36 equations, 6 figures, 4 tables, 1 algorithm.

Key Result

Theorem 1

Combining the definition of the joint error $\epsilon_U = \epsilon_{\mathcal{Z}}(h) + \epsilon_{\mathcal{Z}'}(h)$ and the unified feature space $\mathcal{Z}_U$, the following lower bound holds:

Figures (6)

  • Figure 1: Comparison between existing domain adaptation (DA) methods and our approach. General DA methods treat task-relevant factors as features to be aligned directly. Aligning the density distribution yields consistent density estimates across domains, but these estimates do not match the true density of the samples. Our method aligns only the distributions of features belonging to objects of interest, so that inter-object information is preserved.
  • Figure 2: Overview of our proposed framework. $g_s$ and $g_t$ are the domain-specific feature extractors for the source and target domains. $f_d$ is the domain discriminator used for alignment. $f$ is the regressor that generates the target density map. $f_c$ is the regressor that generates the conditional density map and shares weights with $f$.
  • Figure 3: Object counting scenarios: (a) public security monitoring; (b) medical pathological analysis; (c) biological experiment.
  • Figure 4: Trends of validation counting MAE and consistency on two domain combinations.
  • Figure 5: Dot map visualization of eight randomly selected low-density samples from two adaptation tasks. From left to right, the samples are from ADI, DCC, UCF, and SHB. Red marks indicate missed counts; blue marks indicate duplicated counts.
  • ...and 1 more figures

Theorems & Definitions (14)

  • Definition 1
  • Definition 2: Divergence Measurement
  • Definition 3: Conditional Subset
  • Definition 4: Conditional Divergence
  • Remark 1
  • Theorem 1: Joint Error Lower Bound
  • Lemma 2: Conditional Label
  • Lemma 3: Partial Divergence
  • Lemma 4: Partition-Estimation Error Bound
  • Theorem 5: Conditional Alignment
  • ...and 4 more