Improving Deep Regression with Tightness

Shihao Zhang, Yuguang Yan, Angela Yao

TL;DR

This work explains why preserving target ordinality improves deep regression by linking it to representation tightness, measured by the conditional entropy $H(Z|Y)$. It shows that standard regressors tighten representations poorly because of the direction of their gradient updates, and introduces two strategies, Multiple Target (MT) learning and a Regression Optimal Transport Regularizer (ROT-Reg), to tighten the feature space globally and locally. MT adds extra target dimensions to compress the regression solution space, while ROT-Reg aligns local transport plans between the target and representation spaces using self-entropic optimal transport. Experiments on age estimation, depth estimation, and coordinate prediction show that MT and ROT-Reg improve performance and better preserve ordinality; their combination delivers the strongest gains and faster convergence with minimal computational overhead. The approach offers a principled path to better generalization in deep regression by enforcing both global and local structure in the learned representations.

Abstract

For deep regression, preserving the ordinality of the targets with respect to the feature representation improves performance across various tasks. However, a theoretical explanation for the benefits of ordinality is still lacking. This work reveals that preserving ordinality reduces the conditional entropy $H(Z|Y)$ of representation $Z$ conditional on the target $Y$. However, our findings reveal that typical regression losses do little to reduce $H(Z|Y)$, even though it is vital for generalization performance. With this motivation, we introduce an optimal transport-based regularizer to preserve the similarity relationships of targets in the feature space to reduce $H(Z|Y)$. Additionally, we introduce a simple yet efficient strategy of duplicating the regressor targets, also with the aim of reducing $H(Z|Y)$. Experiments on three real-world regression tasks verify the effectiveness of our strategies to improve deep regression. Code: https://github.com/needylove/Regression_tightness.
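
The optimal transport-based regularizer described above can be pictured with a small NumPy sketch. This is an illustration under stated assumptions, not the paper's ROT-Reg loss: the function names, the zeroed diagonal (used here to forbid matching a sample to itself), the value of `eps`, and the squared-difference penalty between plans are all hypothetical choices made for the example.

```python
import numpy as np

def pairwise_sq_dists(points):
    """Squared Euclidean distances between all pairs of rows."""
    points = np.asarray(points, dtype=float)
    if points.ndim == 1:
        points = points[:, None]
    diff = points[:, None, :] - points[None, :, :]
    return (diff ** 2).sum(axis=-1)

def self_transport_plan(points, eps=0.5, n_iters=200):
    """Entropic transport plan from a sample set to itself (Sinkhorn iterations).

    Assumption: the diagonal of the kernel is zeroed so that mass must move
    to *other* samples, which makes the plan reflect local neighbourhoods.
    """
    cost = pairwise_sq_dists(points)
    n = cost.shape[0]
    kernel = np.exp(-cost / eps)
    np.fill_diagonal(kernel, 0.0)
    marginal = np.full(n, 1.0 / n)  # uniform weights on the samples
    u = np.ones(n)
    v = np.ones(n)
    for _ in range(n_iters):
        u = marginal / (kernel @ v)
        v = marginal / (kernel.T @ u)
    return u[:, None] * kernel * v[None, :]

def plan_mismatch(targets, features, eps=0.5):
    """Squared difference between the target-space and feature-space plans."""
    plan_y = self_transport_plan(targets, eps)
    plan_z = self_transport_plan(features, eps)
    return float(((plan_y - plan_z) ** 2).sum())

targets = np.array([0.0, 0.1, 0.2, 0.9, 1.0])
aligned_features = targets[:, None]                     # same neighbourhood structure
scrambled_features = targets[[3, 0, 4, 1, 2]][:, None]  # neighbourhoods broken

loss_aligned = plan_mismatch(targets, aligned_features)
loss_scrambled = plan_mismatch(targets, scrambled_features)
```

In this toy setting the mismatch vanishes when the features reproduce the similarity structure of the targets and is positive when that structure is scrambled, which is the behaviour a regularizer of this kind is meant to reward.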

Paper Structure

This paper contains 26 sections, 3 theorems, 22 equations, 8 figures, 6 tables.

Key Result

Theorem 1

Let ${\mathcal{B}}({\bf z},\epsilon) = \{{\bf z}' \in {\mathcal{Z}}| d({\bf z}, {\bf z}') \leq \epsilon \}$ be the closed ball centered at ${\bf z}$ with radius $\epsilon$. Assume that $\forall ({\bf x}, {\bf z}, y) \in {\mathcal{P}}$ and $\forall \epsilon >0, \exists ({\bf x}', {\bf z}', y') \in {\ma

Figures (8)

  • Figure 1: Illustration of the MT strategy. Changing the target from $y$ to $[y, y]$ will introduce an additional regressor to predict the additional $y$. The original solution space $S_{y_0}$ is a line in the feature manifold. The additional $y$ introduces a new constraint, tightening $S_{y_0}$ from a line to a point.
  • Figure 2: Visualization of the feature manifolds, which shows that ${\mathcal{L}}_{ot}$ preserves the local similarity relationships of the target space.
  • Figure 3: Visualizations of the feature manifold on NYUD2-DIR for depth estimation. Preserving the ordinality (+ RankSim) has an effect similar to MT, which explicitly tightens the representations.
  • Figure 4: (a) Visualization of the ${\bf z}$ update, which aligns with $\bm{\theta}$; (b) the $\bm{\theta}$ update, which is steady throughout the training process; (c) the update directions of the $\bm{\theta}$s, which are distributed along a line, with the origin as the center.
  • Figure 5: Feature similarity matrices (Euclidean distance). Tightening the representations results in better ordinality.
  • ...and 3 more figures
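
The geometric claim in the Figure 1 caption can be checked with a few lines of NumPy. The sketch below assumes linear regressors acting on a 2-D feature space and linearly independent regressor weights; the weight values and the helper name `solution_space_dim` are hypothetical and chosen only for illustration.

```python
import numpy as np

def solution_space_dim(thetas: np.ndarray) -> int:
    """Dimension of the set {z : thetas @ z = y0 for every row} of linear regressors.

    Each row of `thetas` is one regressor. For a consistent system the
    solution set is an affine subspace of dimension feature_dim - rank(thetas).
    """
    feature_dim = thetas.shape[1]
    return feature_dim - int(np.linalg.matrix_rank(thetas))

# Single target y: one regressor in a 2-D feature space.
theta_single = np.array([[1.0, 2.0]])

# MT target [y, y]: a second, linearly independent regressor is added.
theta_mt = np.array([[1.0, 2.0],
                     [3.0, -1.0]])

dim_single = solution_space_dim(theta_single)  # 1: the solution space is a line
dim_mt = solution_space_dim(theta_mt)          # 0: the solution space is a point

# With both constraints, the feature that predicts y0 on both heads is unique.
y0 = 0.5
z_star = np.linalg.solve(theta_mt, np.full(2, y0))
```

If the added regressor were a scalar multiple of the first, the rank would not increase and the solution space would stay a line, so any tightening in this linear picture depends on the extra regressors being distinct.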

Theorems & Definitions (4)

  • Definition 1
  • Theorem 1
  • Theorem 2
  • Lemma 1