Relative Representations: Topological and Geometric Perspectives
Alejandro García-Castellanos, Giovanni Luca Marchetti, Danica Kragic, Martina Scolamiero
TL;DR
The paper tackles zero-shot model stitching by examining latent-space representations across networks. It introduces a robust relative transformation that uses batch normalization to factor out the intertwiner symmetries induced by activation functions, making the representation invariant to non-isotropic rescalings and permutations. It couples this with a topological densification regularizer that encourages compact within-class clusters in the latent space. Empirical evaluation on cross-language NLP tasks (e.g., English–French stitching) indicates that both the robust transformation and the topological regularizer improve zero-shot transfer performance over prior relative representations. The work highlights the practical potential of combining geometric invariance with topological structure in representation spaces, and outlines future directions: extending the invariances to broader isometries and using higher-dimensional persistent-homology regularizers.
Abstract
Relative representations are an established approach to zero-shot model stitching, consisting of a non-trainable transformation of the latent space of a deep neural network. Based on topological and geometric insights, we propose two improvements to relative representations. First, we introduce a normalization procedure in the relative transformation, resulting in invariance to non-isotropic rescalings and permutations. These transformations coincide with the symmetries in parameter space induced by common activation functions. Second, we propose to deploy topological densification, a topological regularization loss that encourages clustering within classes, when fine-tuning relative representations. We provide an empirical investigation on a natural language task, where both proposed variations yield improved performance on zero-shot model stitching.
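
As a rough illustration of the first improvement, the NumPy sketch below standardizes each latent coordinate with batch statistics and then takes cosine similarities to a set of anchors. The function name `robust_relative`, the choice to apply the same batch statistics to samples and anchors, and the `eps` constant are assumptions made for this example; the abstract does not specify these details, so this is a sketch and not the authors' implementation.

```python
import numpy as np


def robust_relative(x, anchors, eps=1e-12):
    """Cosine similarities to anchors after per-coordinate standardization.

    x:       (n, d) batch of latent vectors
    anchors: (k, d) anchor latent vectors from the same latent space
    Returns an (n, k) matrix of relative coordinates.
    """
    # Batch statistics per latent coordinate (assumption: computed on x only).
    mean = x.mean(axis=0, keepdims=True)
    std = x.std(axis=0, keepdims=True) + eps

    # Standardizing removes any per-coordinate scale and shift.
    x_n = (x - mean) / std
    a_n = (anchors - mean) / std

    # Cosine similarity sums over coordinates, so it ignores their order.
    x_n /= np.linalg.norm(x_n, axis=1, keepdims=True)
    a_n /= np.linalg.norm(a_n, axis=1, keepdims=True)
    return x_n @ a_n.T


# Hypothetical check: rescale and permute the latent coordinates.
rng = np.random.default_rng(0)
latents = rng.normal(size=(64, 8))
scale = rng.uniform(0.5, 2.0, size=8)   # non-isotropic rescaling
perm = rng.permutation(8)               # coordinate permutation
transformed = (latents * scale)[:, perm]

rel_a = robust_relative(latents, latents[:5])
rel_b = robust_relative(transformed, transformed[:5])
print(np.allclose(rel_a, rel_b, atol=1e-8))
```

Standardization divides out the per-coordinate scale, and the dot product is unaffected by reordering coordinates, so the two relative representations agree up to floating-point error. A plain cosine similarity without the standardization step would generally not be preserved under the non-isotropic rescaling.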
