$\boldsymbol{\lambda}$-Orthogonality Regularization for Compatible Representation Learning
Simone Ricci, Niccolò Biondi, Federico Pernici, Ioannis Patras, Alberto Del Bimbo
TL;DR
This paper addresses backward compatibility of representations in retrieval systems, where independently trained models produce representations that diverge from one another. It introduces a framework that jointly learns a backward orthogonal/adaptive map and a forward affine transformation, together with a $\lambda$-Orthogonality regularization that smoothly enforces near-orthogonality up to a tunable threshold $\lambda$. A supervised contrastive loss further improves alignment, and a partial backfilling strategy makes gallery updates more efficient. Empirically, the method preserves zero-shot performance while achieving strong backward compatibility and improved downstream task accuracy across diverse architectures and datasets. The approach offers a practical, architecture-agnostic way to keep retrieval performance consistent across model updates and can be extended to broader representation-adaptation settings.
Abstract
Retrieval systems rely on representations learned by increasingly powerful models. However, due to the high training cost and inconsistencies in learned representations, there is significant interest in facilitating communication between representations and ensuring compatibility across independently trained neural networks. In the literature, two primary approaches are commonly used to adapt different learned representations: affine transformations, which adapt well to specific distributions but can significantly alter the original representation, and orthogonal transformations, which preserve the original structure with strict geometric constraints but limit adaptability. A key challenge is adapting the latent spaces of updated models to align with those of previous models on downstream distributions while preserving the newly learned representation spaces. In this paper, we impose a relaxed orthogonality constraint, namely $\lambda$-Orthogonality regularization, while learning an affine transformation, to obtain distribution-specific adaptation while retaining the original learned representations. Extensive experiments across various architectures and datasets validate our approach, demonstrating that it preserves the model's zero-shot performance and ensures compatibility across model updates. Code is available at: https://github.com/miccunifi/lambda_orthogonality.
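To make the idea of a relaxed orthogonality constraint concrete, the following is a minimal sketch and not the paper's actual regularizer, whose exact form is not given in the abstract. It assumes a simple hard-threshold (hinge) stand-in: the deviation of a weight matrix $W$ from orthogonality is measured as $\lVert W^\top W - I \rVert_F$, and it is penalized only once it exceeds $\lambda$. The function names `orthogonality_gap` and `thresholded_orthogonality_penalty` are hypothetical and chosen for illustration.

```python
import numpy as np


def orthogonality_gap(W):
    """Frobenius norm of W^T W - I; zero when W has orthonormal columns."""
    d = W.shape[1]
    return float(np.linalg.norm(W.T @ W - np.eye(d), ord="fro"))


def thresholded_orthogonality_penalty(W, lam):
    """Illustrative relaxed penalty (an assumption, not the paper's formula):
    no cost while the orthogonality gap stays within lam, linear cost beyond it."""
    return max(0.0, orthogonality_gap(W) - lam)


# A 2-D rotation is exactly orthogonal, so its gap is (numerically) zero.
theta = 0.3
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# A scaled rotation is affine-like but not orthogonal:
# S^T S = 2.25 * I, so the gap is 1.25 * sqrt(2).
S = 1.5 * R

gap_R = orthogonality_gap(R)
gap_S = orthogonality_gap(S)
penalty_loose = thresholded_orthogonality_penalty(S, lam=2.0)  # within threshold
penalty_tight = thresholded_orthogonality_penalty(S, lam=0.5)  # exceeds threshold
```

With a large $\lambda$ the transformation is free to behave like an unconstrained affine map, while $\lambda = 0$ recovers a standard orthogonality penalty; intermediate values trade adaptability against preservation of the original geometry, which is the trade-off the abstract describes.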
