Deep Regression Representation Learning with Topology
Shihao Zhang, Kenji Kawaguchi, Angela Yao
TL;DR
This work connects the Information Bottleneck (IB) framework with the topology of regression representations. It shows that minimizing ${\mathcal{H}}({\bf Z}|{\bf Y})$ is linked to the intrinsic dimension (ID) of the feature space, and that a feature space topologically similar to the target space aligns better with the IB principle. It introduces PH-Reg, a regression-specific regularizer consisting of an intrinsic-dimension term ${\mathcal L}_d$ and a topology term ${\mathcal L}_t$, implemented via a topology autoencoder and persistent-homology-based distances. In synthetic and real-world experiments (depth estimation, super-resolution, and age estimation), PH-Reg consistently improves regression performance, especially when both terms are used together, and ablations validate the roles of ID control and topology preservation. The results highlight the practical value of topology-aware regularization for regression: it aligns the feature space with the target topology and reduces representation complexity.
Abstract
Most works studying representation learning focus only on classification and neglect regression. Yet the learning objectives, and therefore the representation topologies, of the two tasks are fundamentally different: classification targets class separation, leading to disconnected representations, whereas regression requires ordinality with respect to the target, leading to continuous representations. We therefore ask how the effectiveness of a regression representation is influenced by its topology, with effectiveness evaluated using the Information Bottleneck (IB) principle. The IB principle is an important framework for learning effective representations, and we establish two connections between it and the topology of regression representations. The first connection reveals that a lower intrinsic dimension of the feature space implies a reduced complexity of the representation Z. This complexity can be quantified as the conditional entropy of Z given the target Y, and it serves as an upper bound on the generalization error. The second connection suggests that a feature space that is topologically similar to the target space will better align with the IB principle. Based on these two connections, we introduce PH-Reg, a regularizer specific to regression that matches the intrinsic dimension and topology of the feature space with those of the target space. Experiments on synthetic and real-world regression tasks demonstrate the benefits of PH-Reg. Code: https://github.com/needylove/PH-Reg.
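
To make the idea of matching the topology of the feature space to the target space more concrete, the following is a minimal sketch of a topology-matching penalty, not the authors' implementation (see the repository above for that). It assumes a topology-autoencoder-style formulation restricted to 0-dimensional persistent homology, where the relevant point pairs are the edges of the minimum spanning tree of each space. The names `pairwise_dist`, `mst_edges`, and `topology_term` are illustrative, and the code only evaluates the penalty; a training setup would need a differentiable version.

```python
import math


def pairwise_dist(points):
    """Euclidean distance matrix for a list of equal-length coordinate tuples."""
    return [[math.dist(p, q) for q in points] for p in points]


def mst_edges(dist):
    """Minimum spanning tree edges via Prim's algorithm.

    For a Vietoris-Rips filtration, the MST edge lengths are the death
    times of the 0-dimensional persistent homology classes.
    """
    n = len(dist)
    in_tree = [False] * n
    in_tree[0] = True
    best = list(dist[0])   # cheapest known distance from the tree to each vertex
    parent = [0] * n       # tree vertex that realizes that distance
    edges = []
    for _ in range(n - 1):
        j = min((k for k in range(n) if not in_tree[k]), key=lambda k: best[k])
        edges.append((parent[j], j))
        in_tree[j] = True
        for k in range(n):
            if not in_tree[k] and dist[j][k] < best[k]:
                best[k] = dist[j][k]
                parent[k] = j
    return edges


def topology_term(z, y):
    """Assumed form of a topology penalty between features z and targets y.

    Distances at the topologically relevant pairs of one space are compared
    with the distances of the same pairs in the other space, in both directions.
    """
    dz, dy = pairwise_dist(z), pairwise_dist(y)
    loss = 0.0
    for i, j in mst_edges(dz):          # pairs selected by the feature space
        loss += (dz[i][j] - dy[i][j]) ** 2
    for i, j in mst_edges(dy):          # pairs selected by the target space
        loss += (dy[i][j] - dz[i][j]) ** 2
    return 0.5 * loss


# Toy data: targets on a line; features are either an isometric copy
# of the targets in 2D or the same points assigned to the wrong targets.
y = [(i / 7,) for i in range(8)]
z_good = [(0.6 * t, 0.8 * t) for (t,) in y]
z_bad = [z_good[k] for k in [0, 4, 1, 5, 2, 6, 3, 7]]

print(topology_term(z_good, y))  # near zero: distances agree on all selected pairs
print(topology_term(z_bad, y))   # clearly positive: ordering of the target is broken
```

In this toy case the penalty vanishes when the features preserve the ordering and spacing of the target and grows when neighboring targets are mapped to distant features, which is the behavior the second connection in the abstract motivates.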
