Poly-MgNet: Polynomial Building Blocks in Multigrid-Inspired ResNets

Antonia van Betteray; Matthias Rottmann; Karsten Kahl

TL;DR

The paper tackles the high parameter count of ResNets by embedding polynomial smoothers from multigrid (MG) theory into MgNet. The resulting Poly-MgNet uses substantially fewer weights while preserving accuracy. It replaces or augments standard smoothing with a polynomial operator $p_d(A)=\sum_{i=0}^d \alpha_i A^i$, leading to the residual-based update $u \leftarrow u + p_d(A)(f-Au)$ and the residual propagation factor $q_{d+1}(A)=I-Ap_d(A)$, whose roots are chosen from the spectrum to control convergence. Empirically, Poly-MgNet achieves strong accuracy with far fewer parameters on CIFAR-10 (e.g., Poly-MgNet$^{q_2}$ with around $1.3$M weights vs. ResNet18's $11.2$M), and polynomial blocks built from real and complex roots further improve the trade-off between accuracy and number of weights relative to the ResNet and MgNet baselines. The work provides design guidelines for integrating MG smoothing into CNNs, covering root selection, the placement of ReLU and batch normalization, and channel scaling, and illustrates that MG-inspired weight sharing can yield competitive performance with much smaller models.
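To make the relation between the update $u \leftarrow u + p_d(A)(f-Au)$ and the factor $q_{d+1}(A)=I-Ap_d(A)$ concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a small dense symmetric matrix (a 1D Poisson stencil) as a stand-in for the convolution operator $A$, takes $d=1$, and picks the two largest eigenvalues as roots purely for illustration. The function name `poly_smoother_step` is hypothetical. Written in factored form, $q_2(A)=(I-A/r_1)(I-A/r_2)$, so one polynomial step amounts to one Richardson sweep per root.

```python
import numpy as np


def poly_smoother_step(A, u, f, roots):
    """One polynomial smoothing step in factored form (illustrative sketch).

    Each root r contributes a Richardson sweep u <- u + (f - A u) / r,
    so the residual f - A u is multiplied by prod_i (I - A / r_i) = q(A).
    """
    for r in roots:
        u = u + (f - A @ u) / r
    return u


# Assumption: a dense 1D Poisson matrix stands in for the convolution A.
n = 8
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

rng = np.random.default_rng(0)
f = rng.standard_normal(n)
u0 = np.zeros(n)

# Assumption: roots are the two largest eigenvalues of A (illustrative choice).
eigvals, eigvecs = np.linalg.eigh(A)  # eigenvalues in ascending order
roots = [eigvals[-1], eigvals[-2]]

u1 = poly_smoother_step(A, u0, f, roots)

# Residual before and after the step.
res0 = f - A @ u0
res1 = f - A @ u1

# Explicit q_2(A) and the matching p_1(A) = alpha_0 I + alpha_1 A.
I = np.eye(n)
q2 = (I - A / roots[0]) @ (I - A / roots[1])
alpha0 = 1.0 / roots[0] + 1.0 / roots[1]
alpha1 = -1.0 / (roots[0] * roots[1])
p1 = alpha0 * I + alpha1 * A

# The factored sweeps agree with u + p_1(A)(f - A u), and the residual
# is propagated by q_2(A) = I - A p_1(A).
assert np.allclose(u1, u0 + p1 @ res0)
assert np.allclose(q2, I - A @ p1)
assert np.allclose(res1, q2 @ res0)
```

Because the roots are eigenvalues of $A$ in this sketch, the residual components along the corresponding eigenvectors are removed by a single step, which is the sense in which the choice of roots controls which parts of the spectrum are damped.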

Abstract

The structural analogies between ResNets and multigrid (MG) methods, such as shared building blocks like convolutions and poolings, were already pointed out by He et al. in 2016. Multigrid methods are used in scientific computing for solving large sparse linear systems arising from partial differential equations. MG methods rely on two main concepts: smoothing and residual restriction / coarsening. Exploiting these analogies, He and Xu developed the MgNet framework, which integrates MG schemes into the design of ResNets. In this work, we introduce a novel neural network building block inspired by polynomial smoothers from MG theory. From an MG perspective, our polynomial block naturally extends the MgNet framework to Poly-MgNet and at the same time reduces the number of weights in MgNet. We present a comprehensive study of our polynomial block, analyzing the choice of initial coefficients, the polynomial degree, and the placement of activation functions and batch normalizations. Our results demonstrate that constructing (quadratic) polynomial building blocks based on real and imaginary polynomial roots enhances Poly-MgNet's capacity in terms of accuracy. Furthermore, our approach achieves an improved trade-off between model accuracy and number of weights compared to ResNet as well as to specific configurations of MgNet.
