Improved Canonicalization for Model Agnostic Equivariance

Siba Smarak Panigrahi; Arnab Kumar Mondal

Abstract

This work introduces a novel approach to achieving architecture-agnostic equivariance in deep learning, addressing both the limitations of traditional layerwise equivariant architectures and the inefficiencies of existing architecture-agnostic methods. Building equivariant models with traditional methods requires designing equivariant versions of existing models and training them from scratch, a process that is both impractical and resource-intensive. Canonicalization has emerged as a promising alternative for inducing equivariance without altering the model architecture, but it requires highly expressive and expensive equivariant networks to learn canonical orientations accurately. We propose a new optimization-based method that can employ any non-equivariant network for canonicalization. Our method uses contrastive learning to efficiently learn a canonical orientation and offers more flexibility in the choice of canonicalization network. We empirically demonstrate that this approach outperforms existing methods in achieving equivariance for large pretrained models and speeds up the canonicalization process by up to 2 times.

Paper Structure (17 sections, 8 equations, 2 figures, 2 tables)

Introduction
Background
Formulation
Canonicalization Function
Prior Regularization
Method
Results
Image Classification
Experiment Setup.
Evaluation setup.
Results.
Zero-shot Instance Segmentation
Experiment Setup.
Evaluation setup.
Results.
...and 2 more sections

Figures (2)

Figure 1: Learning equivariant canonicalizer with a non-equivariant canonicalization network. All the transformations of the group are applied to the input image and passed through the canonicalization network in parallel. A dot product of the output of the canonicalization network with a reference vector gives us a distribution over the transformations to canonicalize the input. We also minimize the similarity between the vectors to get a unique canonical orientation.
Figure 2: Identity metric vs. relative wall-time (in minutes). We define the identity metric as the percentage of input images mapped to the identity group element $e$, which is our prior distribution $\mathbb{P}_{c(x)}$. The figure shows that EquiOptAdapt learns the prior faster than EquiAdapt.
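
The forward pass described in the Figure 1 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the group is taken to be the four 90-degree rotations (C4), `toy_network` is a hypothetical stand-in for an arbitrary non-equivariant canonicalization network, and the similarity term is written here as the mean pairwise cosine similarity between the output vectors, which is one possible reading of "minimize the similarity between the vectors". All function and variable names are illustrative, and no training loop is shown.

```python
import numpy as np

def c4_orbit(image):
    """Stack the four 90-degree rotations of a square image (assumed group: C4)."""
    return np.stack([np.rot90(image, k) for k in range(4)])

def toy_network(images, weights):
    """Hypothetical non-equivariant network: one linear layer followed by tanh."""
    flat = images.reshape(images.shape[0], -1)
    return np.tanh(flat @ weights)

def canonicalization_distribution(image, weights, reference):
    """Score every transformed copy against a reference vector.

    Returns a softmax distribution over the group elements and the
    per-element output vectors of the network.
    """
    outputs = toy_network(c4_orbit(image), weights)   # shape (4, d)
    scores = outputs @ reference                      # shape (4,)
    scores = scores - scores.max()                    # numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    return probs, outputs

def similarity_penalty(outputs):
    """Mean pairwise cosine similarity between the output vectors.

    Assumption: minimizing this pushes the vectors apart so that a single
    orientation stands out as canonical.
    """
    normed = outputs / np.linalg.norm(outputs, axis=1, keepdims=True)
    sim = normed @ normed.T
    off_diagonal = sim[~np.eye(len(sim), dtype=bool)]
    return off_diagonal.mean()

def canonicalize(image, weights, reference):
    """Apply the most probable rotation; returns (canonical image, index k)."""
    probs, _ = canonicalization_distribution(image, weights, reference)
    k = int(np.argmax(probs))
    return np.rot90(image, k), k

# Usage with random data (shapes and scaling are arbitrary choices).
rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))
weights = rng.normal(size=(64, 16)) / 8.0
reference = rng.normal(size=16)

probs, outputs = canonicalization_distribution(image, weights, reference)
canonical, k = canonicalize(image, weights, reference)
```

Because every group element is evaluated, rotating the input only permutes the scores, so the canonical image selected by the argmax is the same for any rotated copy of the input even though `toy_network` itself is not equivariant.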
