Drawback of Enforcing Equivariance and its Compensation via the Lens of Expressive Power

Yuzhu Chen, Tian Qin, Xinmei Tian, Fengxiang He, Dacheng Tao

TL;DR

This work theoretically analyzes how enforcing equivariance affects expressive power in two-layer ReLU networks, revealing that layer-wise equivariance can strictly limit expressivity in some regimes. It introduces boundary hyperplanes and symmetric channel vectors to connect symmetry constraints to network geometry, showing GENs inherently require symmetric hyperplanes while LENs enforce symmetric channel vectors. The authors prove that enlarging model size by a factor |G| can compensate for expressivity losses and, in some cases, yield a lower-complexity hypothesis space, suggesting better generalization. The findings provide a principled guide for employing layer-wise equivariance, highlighting a trade-off between expressivity and model size with potential generalization benefits.

Abstract

Equivariant neural networks encode symmetry as an inductive bias and have achieved strong empirical performance across a wide range of domains. However, their expressive power remains poorly understood. Focusing on 2-layer ReLU networks, this paper investigates the impact of equivariance constraints on the expressivity of equivariant and layer-wise equivariant networks. By examining the boundary hyperplanes and the channel vectors of ReLU networks, we construct an example showing that equivariance constraints can strictly limit expressive power. However, we demonstrate that this drawback can be compensated for by enlarging the model size. Furthermore, we show that despite the larger model size, the resulting architecture can still correspond to a hypothesis space with lower complexity, implying superior generalizability for equivariant networks.
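To make the idea of trading model size for symmetry concrete, here is a minimal sketch of one standard construction, group averaging, applied to a two-layer ReLU network $F(x)=\sum_i\beta_i\sigma(\langle\alpha_i,x\rangle)$. It is an illustration under stated assumptions, not the authors' proof: the group (coordinate swap on $\mathbb{R}^2$), the helper names `evaluate` and `symmetrize`, and the restriction to invariance (equivariance with a trivial action on the output) are all chosen here for illustration. The averaged network has $m|G|$ hidden units, which matches the factor $|G|$ mentioned in the TL;DR.

```python
def relu(t):
    return max(t, 0.0)

# Hypothetical example group: G = {identity, swap} acting on R^2.
# Both elements are orthogonal and self-inverse, so <a, g x> = <g a, x>.
GROUP = [
    lambda v: (v[0], v[1]),  # identity
    lambda v: (v[1], v[0]),  # swap the two coordinates
]

def evaluate(alphas, betas, x):
    """Two-layer ReLU network: sum_i beta_i * relu(<alpha_i, x>)."""
    return sum(b * relu(a[0] * x[0] + a[1] * x[1]) for a, b in zip(alphas, betas))

def symmetrize(alphas, betas):
    """Average the network over GROUP.

    Each channel (alpha, beta) is replaced by |G| channels
    (g alpha, beta / |G|), so the hidden width grows by a factor |G|.
    """
    new_alphas, new_betas = [], []
    for a, b in zip(alphas, betas):
        for g in GROUP:
            new_alphas.append(g(a))
            new_betas.append(b / len(GROUP))
    return new_alphas, new_betas

if __name__ == "__main__":
    # Channels of F(x, y) = relu(x) + relu(-y) + relu(-x + y).
    alphas = [(1.0, 0.0), (0.0, -1.0), (-1.0, 1.0)]
    betas = [1.0, 1.0, 1.0]
    sym_alphas, sym_betas = symmetrize(alphas, betas)
    x = (0.5, -1.25)
    print(len(alphas), len(sym_alphas))
    print(evaluate(sym_alphas, sym_betas, x),
          evaluate(sym_alphas, sym_betas, (x[1], x[0])))
```

The symmetrized channel vectors come in orbits under the group, which is one way to picture what "symmetric channel vectors" can look like in a small case.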

Paper Structure

This paper contains 28 sections, 8 theorems, 50 equations, 2 figures.

Key Result

Theorem 4.2

Let $F=\sum_{i=1}^m\beta_i\sigma(\langle\alpha_i,x\rangle)$ be an output function. Then a hyperplane $M$ is a boundary hyperplane if and only if for all $v\not\in M$: where $\langle\alpha_i,M\rangle=0$ means that $\langle\alpha_i,x\rangle=0$ for all $x\in M$. In particular, if $M$ is a boundary hyperplane, there exists some channel vector $\alpha$ such that $\langle \alpha,M\rangle=0$.

Figures (2)

  • Figure 1: An equivariant function that satisfies $s((a,b)^\top)=s((b,a)^\top)$ and $s((a,b)^\top)=s((-a,-b)^\top)$. The left subfigure is a 3D plot, while the right subfigure is a 2D contour map.
  • Figure 2: A visualization of the feature function $\mathcal{F}$ of $F(x,y)=\sigma(x)+\sigma(-y)+\sigma(-x+y)$, where the left subfigure is $\mathcal{F}_1$ and the right subfigure is $\mathcal{F}_2$. As shown, there are three boundary hyperplanes: $x=0$, $y=0$, $y=x$.
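The hyperplanes listed in the Figure 2 caption can be checked numerically. The sketch below assumes, for illustration only, that a boundary hyperplane is one across which the gradient of $F$ jumps (the page does not reproduce Definition 4.1, so this reading is an assumption). The helper names `gradient` and `gradient_jump` are hypothetical. The code compares the gradient of $F(x,y)=\sigma(x)+\sigma(-y)+\sigma(-x+y)$ just on either side of a candidate hyperplane.

```python
def relu(t):
    return max(t, 0.0)

# Channel vectors alpha_i and output weights beta_i read off from
# F(x, y) = relu(x) + relu(-y) + relu(-x + y).
ALPHAS = [(1.0, 0.0), (0.0, -1.0), (-1.0, 1.0)]
BETAS = [1.0, 1.0, 1.0]

def F(p):
    return sum(b * relu(a[0] * p[0] + a[1] * p[1]) for a, b in zip(ALPHAS, BETAS))

def gradient(p):
    """Gradient of F at a point where no pre-activation is exactly zero."""
    gx = gy = 0.0
    for a, b in zip(ALPHAS, BETAS):
        if a[0] * p[0] + a[1] * p[1] > 0:  # only active channels contribute
            gx += b * a[0]
            gy += b * a[1]
    return (gx, gy)

def gradient_jump(point, normal, eps=1e-6):
    """Gradient just past `point` along `normal` minus gradient just before it."""
    plus = (point[0] + eps * normal[0], point[1] + eps * normal[1])
    minus = (point[0] - eps * normal[0], point[1] - eps * normal[1])
    gp, gm = gradient(plus), gradient(minus)
    return (gp[0] - gm[0], gp[1] - gm[1])

if __name__ == "__main__":
    # One sample point on each candidate hyperplane, with a normal direction.
    candidates = {
        "x = 0": ((0.0, 2.0), (1.0, 0.0)),
        "y = 0": ((2.0, 0.0), (0.0, 1.0)),
        "y = x": ((1.0, 1.0), (-1.0, 1.0)),
        "x + y = 0": ((1.0, -1.0), (1.0, 1.0)),  # not orthogonal to any channel vector
    }
    for name, (point, normal) in candidates.items():
        print(name, gradient_jump(point, normal))
```

At these sample points the jump is nonzero for $x=0$, $y=0$, and $y=x$, each of which is orthogonal to one channel vector, and zero for $x+y=0$, which is orthogonal to none. This agrees with the last statement of Theorem 4.2. A single sample point per hyperplane is only a spot check, not a proof.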

Theorems & Definitions (12)

  • Definition 4.1: boundary hyperplane
  • Theorem 4.2
  • Theorem 4.3
  • Lemma 4.4
  • proof
  • Lemma 4.5
  • proof
  • Theorem 4.6
  • Example 5.1
  • Lemma 6.1
  • ...and 2 more