Relative Representations: Topological and Geometric Perspectives

Alejandro García-Castellanos; Giovanni Luca Marchetti; Danica Kragic; Martina Scolamiero

TL;DR

The paper tackles zero-shot model stitching by examining latent-space representations across networks. It introduces a robust relative transformation that applies batch normalization to the latent space, which makes the representation invariant to non-isotropic rescalings and permutations. These are the symmetries (intertwiners) induced by common activation functions. The transformation is paired with topological densification, a regularizer that encourages each class to form a compact cluster in the latent space. Empirical evaluation on cross-language NLP tasks (e.g., English–French stitching) shows that both the robust transformation and the topological regularizer improve zero-shot stitching performance over prior relative representations. The work highlights the practical potential of combining geometric invariance with topological structure in representation spaces, and outlines future directions: extending the invariances to broader isometries and using higher-dimensional persistent-homology regularizers.
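The invariance claim above can be illustrated with a small standard-library sketch. It is not the authors' implementation: it assumes that the normalization amounts to z-scoring each latent coordinate over the batch (batch normalization without learnable parameters or an epsilon term), followed by cosine similarity to a set of anchor samples. The names `standardize`, `relative`, `perm`, and `scales` are hypothetical.

```python
import math
import random


def standardize(batch):
    """Z-score each coordinate with statistics taken over the whole batch."""
    n, d = len(batch), len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(d)]
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in batch) / n)
            for j in range(d)]
    return [[(row[j] - means[j]) / stds[j] for j in range(d)] for row in batch]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def relative(latents, anchor_idx, normalize):
    """Represent each sample by its cosine similarities to the anchors."""
    z = standardize(latents) if normalize else latents
    anchors = [z[i] for i in anchor_idx]
    return [[cosine(row, a) for a in anchors] for row in z]


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


random.seed(0)
latents = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(8)]
anchor_idx = [0, 3, 5]  # assumption: anchors are samples from the same batch

# Hypothetical symmetry: permute the coordinates, then rescale each one by
# its own positive factor (a non-isotropic rescaling).
perm = [2, 0, 3, 1]
scales = [0.5, 3.0, 10.0, 0.1]
transformed = [[scales[k] * row[perm[k]] for k in range(4)] for row in latents]

# With normalization the two relative representations agree up to rounding.
robust_diff = max_abs_diff(relative(latents, anchor_idx, True),
                           relative(transformed, anchor_idx, True))
# Without it, the non-isotropic rescaling changes the cosine similarities.
plain_diff = max_abs_diff(relative(latents, anchor_idx, False),
                          relative(transformed, anchor_idx, False))
print(robust_diff, plain_diff)
```

Z-scoring cancels each per-coordinate scale, and cosine similarity does not depend on the order of coordinates, so in this sketch only the normalized variant is unaffected by the transformation.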

Abstract

Relative representations are an established approach to zero-shot model stitching, consisting of a non-trainable transformation of the latent space of a deep neural network. Based on insights of topological and geometric nature, we propose two improvements to relative representations. First, we introduce a normalization procedure in the relative transformation, resulting in invariance to non-isotropic rescalings and permutations. These transformations coincide with the symmetries in parameter space induced by common activation functions. Second, we propose to deploy topological densification when fine-tuning relative representations, a topological regularization loss encouraging clustering within classes. We provide an empirical investigation on a natural language task, where both proposed variations yield improved performance on zero-shot model stitching.
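A minimal sketch of how a densification-style loss can be evaluated, under stated assumptions: it uses the standard fact that the 0-dimensional persistence death times of a point cloud coincide with the edge lengths of its minimum spanning tree (some libraries report half of these values), and it assumes the loss penalizes the absolute deviation of each within-class death time from a target scale. The hyperparameter `beta` and the function names are hypothetical. A real training loop would compute this in a differentiable framework, which this pure-Python version does not attempt.

```python
import math


def death_times(points):
    """0-dim persistence death times, computed as MST edge lengths (Prim)."""
    n = len(points)
    if n < 2:
        return []

    def dist(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    best = [dist(points[0], p) for p in points]  # distance to the growing tree
    in_tree = [False] * n
    in_tree[0] = True
    deaths = []
    for _ in range(n - 1):
        # Attach the closest point that is not yet in the tree.
        j = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        deaths.append(best[j])
        in_tree[j] = True
        for i in range(n):
            if not in_tree[i]:
                best[i] = min(best[i], dist(points[j], points[i]))
    return sorted(deaths)


def densification_loss(points, labels, beta):
    """Sum of |d - beta| over the death times d of each class taken separately."""
    loss = 0.0
    for c in set(labels):
        cls = [p for p, y in zip(points, labels) if y == c]
        loss += sum(abs(d - beta) for d in death_times(cls))
    return loss


points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (10.0, 0.0), (10.0, 4.0)]
labels = [0, 0, 0, 1, 1]
class0_deaths = death_times(points[:3])
loss = densification_loss(points, labels, beta=1.0)
print(class0_deaths, loss)
```

Because the death times are computed per class, the penalty acts on distances between samples that share a label, which is what encourages clustering within classes.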

Paper Structure (19 sections, 2 theorems, 11 equations, 8 figures, 2 tables)

Introduction and Related Work
Geometric Perspective.
Topological Perspective.
Background
Relative Representations
Symmetry Groups of Activation Functions
Topological Densification
Method
Robust Relative Transformation
Topological Densification of Relative Representations
Experiments
Data.
Models and Training.
Evaluation Metrics.
Comparison of Relative Transformations
...and 4 more sections

Key Result

Proposition 2.1

For each $1 \leq i < l$ pick $A_i \in G_{\sigma}^{n_i}$, and consider [...] Then for each $m$: [...] In particular, $f(x, \widetilde{W}) = f(x, W)$ for all $x \in \mathbb{R}^{n_0}$.
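The final claim of the proposition can be checked numerically in the simplest case. The sketch below is an illustration under assumptions, not the proposition's general statement: it takes $\sigma$ to be ReLU, for which the symmetry group is known to consist of permutation matrices combined with positive diagonal scalings, and uses a hypothetical two-layer network with a single hidden layer of width 3. The transformed weights play the role of $\widetilde{W}$, and all weights and inputs are made up for the example.

```python
def relu(v):
    return [max(0.0, t) for t in v]


def matvec(M, v):
    return [sum(m * t for m, t in zip(row, v)) for row in M]


def mlp(x, W1, b1, W2, b2):
    """Two-layer network: W2 * relu(W1 * x + b1) + b2."""
    h = relu([s + b for s, b in zip(matvec(W1, x), b1)])
    return [s + b for s, b in zip(matvec(W2, h), b2)]


# Hypothetical weights: 2 inputs, 3 hidden units, 2 outputs.
W1 = [[1.0, -2.0], [0.5, 1.5], [-1.0, 0.25]]
b1 = [0.1, -0.3, 0.2]
W2 = [[2.0, -1.0, 0.5], [0.0, 1.0, -3.0]]
b2 = [0.05, -0.05]

# A = (positive diagonal scaling) composed with a permutation: new hidden
# unit k is old unit perm[k], scaled by scales[k] > 0.
perm = [2, 0, 1]
scales = [2.0, 0.5, 4.0]

# First layer: rows of W1 and entries of b1 are multiplied by A.
W1_t = [[scales[k] * w for w in W1[perm[k]]] for k in range(3)]
b1_t = [scales[k] * b1[perm[k]] for k in range(3)]
# Second layer: W2 is multiplied by the inverse of A on the right, so
# new column k is old column perm[k] divided by scales[k].
W2_t = [[row[perm[k]] / scales[k] for k in range(3)] for row in W2]

xs = [[0.3, -0.7], [1.2, 0.4], [-2.0, 1.0]]
max_gap = max(
    abs(a - b)
    for x in xs
    for a, b in zip(mlp(x, W1, b1, W2, b2), mlp(x, W1_t, b1_t, W2_t, b2))
)
print(max_gap)
```

The check works because ReLU commutes with positive scalings and permutations of the hidden units, so the change applied in the first layer is undone exactly by the second.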

Figures (8)

Figure 1: Cross-domain model stitching.
Figure 2: Effect of topological densification.
Figure 3: Different topological regularization setups for the relative transformation.
Figure 4: Non-cluster-preserving relative transformation example.
Figure 5: Distribution of death times when training with post-relative topological densification on the English dataset.
...and 3 more figures

Theorems & Definitions (8)

Definition 2.1 (Moschella et al., 2022)
Definition 2.2
Proposition 2.1 (Godfrey et al., 2023)
Definition 2.3
Definition 2.4 (Hofer et al., 2021)
Definition 3.1
Proposition 3.1
Proof
