2D Gaussians Spatial Transport for Point-supervised Density Regression

Miao Shang, Xiaopeng Hong

TL;DR

The paper addresses the computational bottleneck of optimal transport (OT) in point-supervised density regression by introducing Gaussian Spatial Transport (GST). GST builds a fixed, precomputable transport kernel from the input image using 2D Gaussian Splatting, enabling the network to be trained via a single matrix multiplication to push the predicted density to the annotation space. A Bayesian transport-based loss with the kernel, $L_{BT} = \| K' \tilde{\zeta}_d - \zeta_g \|_1$, replaces iterative OT optimization, greatly reducing training time while maintaining or improving accuracy. The method is demonstrated on crowd counting and landmark localization, showing competitive or state-of-the-art performance and substantial efficiency gains, with ablations validating the contribution of deformity elimination and background correspondence. Code is provided to reproduce the GST pipeline.
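To make the "single matrix multiplication" concrete, here is a minimal sketch of a transported-mass L1 loss in plain Python. It is an illustration under stated assumptions, not the authors' implementation: it reads K' as the transpose of a kernel whose rows index pixels and whose columns index annotations, it assumes each row sums to 1, and it omits any background column. The names `transport` and `bt_loss` are hypothetical.

```python
def transport(kernel, density):
    """Return K' * density: the mass delivered to each annotation.

    kernel[i][j] is the fraction of pixel i's mass sent to annotation j
    (assumed layout; the summary does not spell out the kernel's shape).
    """
    n_targets = len(kernel[0])
    return [
        sum(kernel[i][j] * density[i] for i in range(len(density)))
        for j in range(n_targets)
    ]


def bt_loss(kernel, density, target_mass):
    """L1 discrepancy between transported density and annotation mass."""
    moved = transport(kernel, density)
    return sum(abs(m - g) for m, g in zip(moved, target_mass))


# Toy example: 4 pixels, 2 annotations, one unit of mass per annotation.
kernel = [
    [1.0, 0.0],  # pixel 0 belongs entirely to annotation 0
    [0.5, 0.5],  # pixel 1 is shared equally
    [0.0, 1.0],  # pixels 2 and 3 belong to annotation 1
    [0.0, 1.0],
]
target_mass = [1.0, 1.0]

good_density = [0.5, 1.0, 0.25, 0.25]  # transports to exactly [1.0, 1.0]
over_density = [1.0, 1.0, 0.25, 0.25]  # sends 0.5 too much to annotation 0

print(bt_loss(kernel, good_density, target_mass))
print(bt_loss(kernel, over_density, target_mass))
```

Because the kernel is fixed before training, the only per-step work in this sketch is the product inside `transport`, which is what removes the need for an iterative OT solver in the training loop.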

Abstract

This paper introduces Gaussian Spatial Transport (GST), a novel framework that leverages Gaussian splatting to facilitate transport from the probability measure in the image coordinate space to the annotation map. We propose a Gaussian splatting-based method to estimate pixel-annotation correspondence, which is then used to compute a transport plan derived from Bayesian probability. To integrate the resulting transport plan into standard network optimization in typical computer vision tasks, we derive a loss function that measures discrepancy after transport. Extensive experiments on representative computer vision tasks, including crowd counting and landmark detection, validate the effectiveness of our approach. Compared to conventional optimal transport schemes, GST eliminates iterative transport plan computation during training, significantly improving efficiency. Code is available at https://github.com/infinite0522/GST.
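As a rough illustration of how a pixel-annotation correspondence could be turned into a transport plan "derived from Bayesian probability", the sketch below applies Bayes' rule to a table of likelihoods P(x | y). It is a sketch under assumptions: the likelihood values are hand-written placeholders rather than outputs of Gaussian splatting, the prior over annotations is taken to be uniform, and the exact kernel used in the paper is not reproduced here. The function name `posterior` is hypothetical.

```python
def posterior(likelihood, prior):
    """Apply Bayes' rule row by row.

    likelihood[i][j] = P(x_i | y_j), prior[j] = P(y_j).
    Returns post[i][j] = P(y_j | x_i); a row with zero evidence stays zero.
    """
    post = []
    for row in likelihood:
        joint = [l * p for l, p in zip(row, prior)]
        evidence = sum(joint)
        post.append([v / evidence if evidence > 0 else 0.0 for v in joint])
    return post


# Placeholder likelihoods for 3 pixels and 2 annotations (assumed values).
likelihood = [
    [0.6, 0.0],
    [0.3, 0.3],
    [0.1, 0.7],
]
prior = [0.5, 0.5]  # uniform prior over annotations (an assumption)

plan = posterior(likelihood, prior)
for row in plan:
    print(row)
```

Each row of `plan` sums to 1, so it can be read as how one pixel's mass is split across annotations.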

Paper Structure

This paper contains 25 sections, 1 theorem, 12 equations, 6 figures, 6 tables.

Key Result

Theorem 1

For probability distributions ${\boldsymbol{P}}_X$ on $\mathcal{X}$ and ${\boldsymbol{P}}_Y$ on $\mathcal{Y}$, there exists a transport plan $\hat{\boldsymbol{P}} \in \mathcal{U}({\boldsymbol{P}}_X, {\boldsymbol{P}}_Y)$ that can be expressed as $\hat{\boldsymbol{P}} = \text{diag}({\boldsymbol{P}}_X) \cdots$

Figures (6)

  • Figure 1: Comparison of loss computation in Direct Regression, OT, and the proposed GST.
  • Figure 2: The GST pipeline comprises two main components: transport kernel generation and model training. First, the transport kernel $\boldsymbol{\mathcal{K}}$ is generated before training by reconstructing the RGB image via 2D Gaussian splatting and establishing pixel-to-annotation correspondences (Eq. \ref{eq:P(x|y)}), which are then used to form $\boldsymbol{\mathcal{K}}$ (Eq. \ref{eq:kernel_bay2}). Second, during training, $\boldsymbol{\mathcal{K}}$ transports the estimated density map to the annotations, allowing the transported mass discrepancy loss to be computed (Eq. \ref{eq:loss_bt}).
  • Figure 3: Visualization of crowd counting.
  • Figure 4: Visualization of landmark location.
  • Figure 5: Visualization of transport plan w/ and w/o deformity elimination (DE) during Gaussian splatting. The map shows annotation-to-pixel correspondence: black points are annotations, unique colors represent individual targets with brightness indicating transport strength, and blank regions denote background transport.
  • ...and 1 more figure

Theorems & Definitions (1)

  • Theorem 1