Multilinear Operator Networks

Yixin Cheng; Grigorios G. Chrysos; Markos Georgopoulos; Volkan Cevher

TL;DR

MONet is a fully multilinear neural network built from Mu-Layers, which capture multiplicative, high-order interactions among the elements of each input token. This allows activation-free polynomial expansions to approach the accuracy of modern architectures. By stacking Poly-Blocks and employing pyramid patch embedding, MONet achieves strong performance on ImageNet1K and other benchmarks while keeping compute costs favorable compared to prior polynomial networks. The model also demonstrates interpretability through a Poly Neural ODE solver that recovers symbolic Lotka-Volterra dynamics, and it exhibits robustness on ImageNet-C. Overall, MONet suggests a viable activation-free alternative with competitive accuracy and potential applicability beyond vision tasks, though a complete theoretical characterization remains open for future work.
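
To see why a polynomial model is a natural fit for recovering Lotka-Volterra dynamics, note that the right-hand side of the system contains only monomials of degree at most 2. The sketch below writes out the standard Lotka-Volterra equations and integrates them with a classical RK4 step. The coefficient values, function names, and step size are assumptions chosen for illustration; they are not the settings used in the paper, and this is not the Poly Neural ODE solver itself.

```python
# Hypothetical coefficients for illustration only; the paper's values are not given here.
ALPHA, BETA, DELTA, GAMMA = 1.0, 0.5, 0.25, 1.0


def lotka_volterra(x, y):
    """Right-hand side of the Lotka-Volterra system.

    dx/dt = ALPHA*x - BETA*x*y   (prey)
    dy/dt = DELTA*x*y - GAMMA*y  (predator)
    Every term is a monomial of degree at most 2 in (x, y).
    """
    dx = ALPHA * x - BETA * x * y
    dy = DELTA * x * y - GAMMA * y
    return dx, dy


def rk4_step(x, y, dt):
    """Advance the state by one classical Runge-Kutta (RK4) step of size dt."""
    k1 = lotka_volterra(x, y)
    k2 = lotka_volterra(x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1])
    k3 = lotka_volterra(x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1])
    k4 = lotka_volterra(x + dt * k3[0], y + dt * k3[1])
    x_next = x + dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
    y_next = y + dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return x_next, y_next


def simulate(x0, y0, dt=0.01, steps=1000):
    """Return the trajectory [(x0, y0), ...] with steps + 1 entries."""
    traj = [(x0, y0)]
    for _ in range(steps):
        traj.append(rk4_step(traj[-1][0], traj[-1][1], dt))
    return traj
```

Calling `simulate(1.0, 1.0)` produces a reference trajectory of the kind a learned model could be compared against. Because the vector field is itself a low-degree polynomial, a network built only from multilinear operations can in principle represent it exactly rather than approximate it.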

Abstract

Despite the remarkable capabilities of deep neural networks in image recognition, the dependence on activation functions remains a largely unexplored area and has yet to be eliminated. On the other hand, Polynomial Networks are a class of models that do not require activation functions but have yet to perform on par with modern architectures. In this work, we aim to close this gap and propose MONet, which relies solely on multilinear operators. The core layer of MONet, called Mu-Layer, captures multiplicative interactions of the elements of the input token. MONet captures high-degree interactions of the input elements, and we demonstrate the efficacy of our approach on a series of image recognition and scientific computing benchmarks. The proposed model outperforms prior polynomial networks and performs on par with modern architectures. We believe that MONet can inspire further research on models that use entirely multilinear operations.
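
The claim that stacked multiplicative layers capture high-degree interactions can be illustrated with a scalar toy model. The sketch below assumes each layer contributes exactly one multiplicative interaction (degree 2) and tracks the polynomial degree of the composed map as layers are stacked. The helper names and the constants `a`, `b`, `c` are hypothetical; the real layers act on tokens, not scalars.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_add(p, q):
    """Add two polynomials, padding the shorter coefficient list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def toy_layer(p, a=2, b=3, c=1):
    """Scalar stand-in for a multiplicative layer: z -> c * ((a*z) * (b*z) + a*z)."""
    az = [a * coef for coef in p]
    bz = [b * coef for coef in p]
    inner = poly_add(poly_mul(az, bz), az)
    return [c * coef for coef in inner]


def degree(p):
    """Index of the highest nonzero coefficient."""
    return max(i for i, coef in enumerate(p) if coef != 0)


def stacked_degree(num_layers):
    """Degree of the polynomial obtained by composing num_layers toy layers."""
    p = [0, 1]  # the identity polynomial z
    for _ in range(num_layers):
        p = toy_layer(p)
    return degree(p)
```

Under this assumption the degree doubles with every layer, so `stacked_degree(n)` equals `2 ** n`. That exponential growth is what lets a modest number of activation-free layers express high-degree polynomial expansions of the input.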

Paper Structure

This paper contains 30 sections, 2 theorems, 18 equations, 15 figures, and 18 tables.

Introduction
Related work
Method
Mu-Layer
Network Architecture
Experiments
ImageNet1K Classification
Additional benchmarks in image recognition
Poly Neural ODE Solver
Robustness
Ablation Study
Conclusion
Proofs
Proof of Proposition 1
Interactions of the Poly-Block
...and 15 more sections

Key Result

Proposition 1

The Mu-Layer captures multiplicative interactions between elements of each token.
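
A minimal sketch of what "multiplicative interactions between elements of a token" can look like in code follows. It assumes a layer of the form y = C((A x) * (B x) + A x), loosely following the Hadamard product and element-wise addition mentioned in the Figure 4 caption. The exact form, the matrices `A`, `B`, `C`, and the name `mu_layer_sketch` are assumptions for illustration; layer normalization and the spatial aggregation module are omitted.

```python
def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def hadamard(u, v):
    """Element-wise (Hadamard) product of two vectors."""
    return [a * b for a, b in zip(u, v)]


def mu_layer_sketch(x, A, B, C):
    """Assumed form y = C((A x) * (B x) + A x); no activation function is applied."""
    ax = matvec(A, x)
    bx = matvec(B, x)
    hidden = [p + a for p, a in zip(hadamard(ax, bx), ax)]
    return matvec(C, hidden)


# Hypothetical weights: A is the identity, B swaps the two elements of the token.
A = [[1, 0], [0, 1]]
B = [[0, 1], [1, 0]]
C = [[1, 1]]

# With these weights the output is 2*x1*x2 + x1 + x2, so it contains the
# cross-term x1*x2 that a single linear layer cannot produce.
y = mu_layer_sketch([1, 2], A, B, C)
```

With the weights above, the single output is the polynomial 2·x1·x2 + x1 + x2. The product of two linear projections of the same token is what introduces the cross-term between distinct token elements, without any activation function.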

Figures (15)

Figure 1: The architecture of the proposed Mu-Layer (left) and MONet (right). In the left panel, the grey box represents layer normalization. The colored solid-line boxes represent channel projections in different dimensions; all projection operations are linear. The $\ast$ box denotes an elementwise (Hadamard) product. The red dashed box represents the spatial aggregation module.
Figure 2: Left: training loss over epochs. Right: ground-truth and model-predicted trajectories. Our model reaches a low loss within 20 epochs and successfully predicts the true trajectory.
Figure 3: Schematic of the (simple) MONet and the multi-stage MONet. PPE denotes our pyramid patch embedding.
Figure 4: Schematic of the Mu-Layer. Blue boxes correspond to learnable parameters. Green and red boxes denote the input and output, respectively. The $\ast$ denotes the Hadamard product and the $+$ denotes element-wise addition. The gray box denotes the spatial aggregation module; the dotted line marks it as optional. In our design, the first Mu-Layer of each Poly-Block includes a spatial aggregation unit, while the second Mu-Layer does not.
Figure 5: Pyramid Patch Embedding
...and 10 more figures

Theorems & Definitions (3)

Proposition 1
Proposition 2
proof
