Backward-Compatible Aligned Representations via an Orthogonal Transformation Layer

Simone Ricci; Niccolò Biondi; Federico Pernici; Alberto Del Bimbo

TL;DR

This work tackles the cost of updating visual retrieval systems by ensuring backward compatibility between old and new representations. It introduces the Orthogonal Compatible Aligned (OCA) method, which expands the feature space with extra dimensions and learns an orthogonal transformation that preserves the geometry of the old representation while incorporating new information. The ACE loss aligns the compatible part of the new features with fixed old prototypes, while the orthogonal transformation, applied only during training, lets the classifier learn in the expanded space. At inference the transformation is discarded: the compatible part of the representation is matched directly against the old gallery, and the full expanded representation is matched against an updated gallery. Empirical results on CIFAR-100 and ImageNet-1k show state-of-the-art accuracy alongside robust backward compatibility, with fewer additional parameters than competing space-expansion methods, reducing re-indexing costs in model updates.

Abstract

Visual retrieval systems face significant challenges when updating models with improved representations due to misalignment between the old and new representations. The costly and resource-intensive backfilling process involves recalculating feature vectors for images in the gallery set whenever a new model is introduced. To address this, prior research has explored backward-compatible training methods that enable direct comparisons between new and old representations without backfilling. Despite these advancements, achieving a balance between backward compatibility and the performance of independently trained models remains an open problem. In this paper, we address it by expanding the representation space with additional dimensions and learning an orthogonal transformation to achieve compatibility with old models and, at the same time, integrate new information. This transformation preserves the original feature space's geometry, ensuring that our model aligns with previous versions while also learning new data. Our Orthogonal Compatible Aligned (OCA) approach eliminates the need for re-indexing during model updates and ensures that features can be compared directly across different model updates without additional mapping functions. Experimental results on CIFAR-100 and ImageNet-1k demonstrate that our method not only maintains compatibility with previous models but also achieves state-of-the-art accuracy, outperforming several existing methods.
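
The geometry-preservation claim rests on a standard property of orthogonal matrices: they leave inner products, and therefore norms and angles, unchanged. The snippet below is a minimal sketch of that property only, not the authors' layer. It builds an orthogonal matrix as a Householder reflection, which is an assumption chosen for simplicity (this section does not say how the paper parameterizes its transformation), and checks that a dot product survives the transform. The names `householder`, `matvec`, `dot`, and the dimensionality `dim` are hypothetical.

```python
import math
import random


def householder(v):
    """Return H = I - 2 u u^T with u = v / ||v||; H is orthogonal by construction."""
    norm = math.sqrt(sum(x * x for x in v))
    u = [x / norm for x in v]
    n = len(u)
    return [[(1.0 if i == j else 0.0) - 2.0 * u[i] * u[j] for j in range(n)]
            for i in range(n)]


def matvec(m, x):
    """Multiply matrix m (list of rows) by vector x."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in m]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


rng = random.Random(0)
dim = 6  # hypothetical size of the expanded feature space

# Stand-in for the learned orthogonal transformation (assumption: any
# orthogonal matrix illustrates the property equally well).
T = householder([rng.gauss(0, 1) for _ in range(dim)])

a = [rng.gauss(0, 1) for _ in range(dim)]
b = [rng.gauss(0, 1) for _ in range(dim)]

before = dot(a, b)
after = dot(matvec(T, a), matvec(T, b))

# The two inner products agree up to floating-point error.
print(abs(before - after) < 1e-9)
```

Because similarities between features are unchanged by such a transform, a classifier can be trained on the transformed features without distorting the relative arrangement of the untransformed ones.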

Paper Structure (13 sections, 7 equations, 2 figures, 4 tables)

This paper contains 13 sections, 7 equations, 2 figures, 4 tables.

Introduction
Related Works
Methodology
Backward-Compatible Training
Backward-Compatibility via Representations Alignment and Orthogonal Transformation
Experimental Results
Datasets
Evaluation Metrics
Compared Methods
Implementation Details
Experimental Results
Ablation Studies
Conclusion

Figures (2)

Figure 1: Overview of our method. The DNN backbone generates representations in a feature space $h_{\rm new}$. This feature space is divided into two parts: $h_{\rm bct}$ is the compatible representation space learned according to $\mathcal{L}_{\rm ACE}$, while $h_{\rm e}$ is an extra feature space used to learn new information from new data without negatively affecting the old feature space configuration. $h_{\rm new} = [h_{\rm bct}|h_{\rm e}]$ is then transformed with $T_{{\pmb{\perp}}}$ into $h_{{\pmb{\perp}}}$, which is used for classification with $\mathcal{L}_{\rm CE}$.
Figure 2: Overview of our method at inference time. The DNN backbone produces representations within a feature space $h_{\rm new}$. This space is divided into two parts. $h_{\rm bct}$ is the compatible representation space: its representations are used to perform visual search directly against the old gallery features, without the orthogonal transformation function, which is discarded after training. The full representations $h_{\rm new} = [h_{\rm bct}|h_{\rm e}]$ are instead matched against the updated gallery, so that performance is as close as possible to that of the independently trained version of the new model.
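
The two matching modes described for inference can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: it assumes the compatible part occupies the first `d_old` coordinates of $h_{\rm new}$ (following the concatenation order in the captions) and that retrieval ranks by cosine similarity. The toy vectors and the helper names `cosine`, `split_features`, and `search` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den


def split_features(h_new, d_old):
    """Split h_new = [h_bct | h_e]; assumes h_bct is the first d_old coordinates."""
    return h_new[:d_old], h_new[d_old:]


def search(query, gallery):
    """Return the index of the gallery vector most similar to the query."""
    return max(range(len(gallery)), key=lambda i: cosine(query, gallery[i]))


d_old = 3  # hypothetical dimensionality of the old model's features
h_new = [0.9, 0.1, 0.0, 0.5, -0.2]  # toy query feature from the new model

# Old gallery: features indexed by the old model and never recomputed.
old_gallery = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
# Updated gallery: features re-extracted with the new model.
new_gallery = [[0.0, 1.0, 0.0, 0.1, 0.3], [1.0, 0.1, 0.0, 0.4, -0.1]]

h_bct, h_e = split_features(h_new, d_old)

# Cross-model search: compatible part against old features, no transform applied.
old_match = search(h_bct, old_gallery)
# Same-model search: full expanded feature against the updated gallery.
new_match = search(h_new, new_gallery)
```

Note that no mapping function sits between the query and the old gallery in this sketch: the cross-model comparison is a plain slice followed by a similarity computation.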

Theorems & Definitions (1)

Definition 1: Backward Compatibility (Shen et al., 2020)
