Efficient Model-Agnostic Multi-Group Equivariant Networks
Razan Baltaji, Sourya Basu, Lav R. Varshney
TL;DR
The paper addresses the high computational cost of model-agnostic group equivariant networks when handling large product groups and multiple inputs. It introduces two efficient designs: (i) a multi-input architecture with an invariant-symmetric (IS) fusion layer, derived by characterizing the space of linear equivariant maps and then extended to nonlinear models, and (ii) a design for a single input acted on by a large product group that achieves equivariance with complexity $O(|G_1|+\cdots+|G_N|)$ rather than the $O(|G_1| \times \cdots \times |G_N|)$ of naive approaches, which grows exponentially in $N$. The IS layer is shown to be a universal approximator of invariant-symmetric functions, and the large-product-group design performs comparably to equitune at substantially lower compute. Empirically, the methods are evaluated on multi-image classification, SCAN-II compositional language tasks, intersectional fairness in NLG, and robust CLIP-based classification, where they give competitive results at lower computational cost.
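To make the gap between the product and the sum of group sizes concrete, the following is a minimal sketch of plain group averaging over a product group. It is not the efficient design proposed in the paper. It assumes a toy product group of cyclic row shifts and cyclic column shifts acting on a 2-D array, and the names `toy_model` and `average_over_product_group` are hypothetical, introduced only for illustration.

```python
import itertools

import numpy as np


def toy_model(x):
    # Hypothetical stand-in for a pretrained model that is NOT shift-invariant:
    # a position-dependent weighted sum of the entries of x.
    weights = np.arange(x.size, dtype=float).reshape(x.shape)
    return float((weights * x).sum())


def average_over_product_group(model, x):
    """Naive invariant symmetrization over Z_H x Z_W (row shifts x column shifts).

    Returns the averaged output and the number of forward passes used,
    which equals |Z_H| * |Z_W| = H * W.
    """
    h, w = x.shape
    outputs = [
        model(np.roll(x, (i, j), axis=(0, 1)))
        for i, j in itertools.product(range(h), range(w))
    ]
    return sum(outputs) / len(outputs), len(outputs)


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 6))

y, passes = average_over_product_group(toy_model, x)
y_shifted, _ = average_over_product_group(
    toy_model, np.roll(x, (1, 2), axis=(0, 1))
)

print(passes)                    # 24 forward passes: 4 * 6
print(sum(x.shape))              # 10: the sum of the two group sizes
print(np.isclose(y, y_shifted))  # the averaged output is shift-invariant
```

With only two factors the difference is 24 versus 10 passes; each additional factor multiplies the naive count by that factor's size, while a sum-type cost grows only additively.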
Abstract
Constructing model-agnostic group equivariant networks, such as equitune (Basu et al., 2023b) and its generalizations (Kim et al., 2023), can be computationally expensive for large product groups. We address this problem by providing efficient model-agnostic equivariant designs for two related problems: one where the network has multiple inputs, each with a potentially different group acting on it, and another where there is a single input but the group acting on it is a large product group. For the first design, we initially consider a linear model and characterize the entire space of linear maps that satisfy the equivariance constraint. This characterization gives rise to a novel fusion layer between different channels that satisfies an invariance-symmetry (IS) constraint, which we call an IS layer. We then extend this design beyond linear models, as in equitune, yielding networks that consist of equivariant and IS layers. We also show that the IS layer is a universal approximator of invariant-symmetric functions. Inspired by the first design, we use the IS property to construct a second efficient model-agnostic equivariant design for large product groups acting on a single input. For the first design, we provide experiments on multi-image classification where each view is transformed independently with transformations such as rotations. We find that equivariant models are robust to such transformations and perform competitively otherwise. For the second design, we consider three applications: language compositionality on the SCAN dataset, extended to product groups; fairness in natural language generation from GPT-2 to address intersectionality; and robust zero-shot image classification with CLIP. Overall, our methods are simple and general, competitive with equitune and its variants, and more computationally efficient.
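One way to see how an invariance-symmetry constraint can arise in the linear multi-input setting is the following sketch for two inputs. The notation is an assumption made for illustration and is not taken from the abstract: inputs $X_1, X_2$ and outputs $Y_1, Y_2$, with $G_i$ acting linearly on $X_i$ and $Y_i$, and linear maps $L_{ij}$ from input $i$ to output $j$. The paper's own definitions may differ in detail.

```latex
\begin{align*}
% Assumed linear two-input model (illustrative notation)
Y_1 &= L_{11} X_1 + L_{21} X_2, \qquad
Y_2 = L_{12} X_1 + L_{22} X_2. \\
% Equivariance of the second output, for all g_1 in G_1 and g_2 in G_2
g_2 Y_2 &= L_{12}(g_1 X_1) + L_{22}(g_2 X_2). \\
% Setting X_1 = 0 isolates the same-channel map
L_{22}(g_2 X_2) &= g_2\, L_{22} X_2
  && \text{(equivariance of } L_{22}\text{)} \\
% Setting X_2 = 0 isolates the cross-channel map
L_{12}(g_1 X_1) &= g_2\, L_{12} X_1
  && \text{for all } g_1, g_2 \\
% Taking g_2 = e, then g_1 = e
L_{12}(g_1 X_1) &= L_{12} X_1, \qquad
g_2\, L_{12} X_1 = L_{12} X_1
  && \text{(invariance, symmetry)}
\end{align*}
```

Under these assumptions the same-channel maps must be equivariant, while each cross-channel map must ignore the group acting on its input and produce outputs fixed by the group acting on its target. The symmetric argument for $Y_1$ gives the same conditions on $L_{11}$ and $L_{21}$.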
