Multilinear Operator Networks
Yixin Cheng, Grigorios G. Chrysos, Markos Georgopoulos, Volkan Cevher
TL;DR
MONet introduces a fully multilinear neural network built from Mu-Layers that capture multiplicative, high-order interactions within input tokens, allowing activation-free polynomial expansions to approach the performance of modern architectures. By stacking Poly-Blocks and employing pyramid patch embedding, MONet achieves strong performance on ImageNet1K and other benchmarks while maintaining favorable compute costs compared to prior polynomial networks. The model also demonstrates interpretability through a Poly Neural ODE solver that recovers symbolic Lotka-Volterra dynamics, and it exhibits robustness on ImageNet-C. Overall, MONet suggests a viable activation-free alternative with competitive accuracy and potential applicability beyond vision tasks, though a complete theoretical characterization remains open for future work.
Abstract
Despite the remarkable capabilities of deep neural networks in image recognition, the dependence on activation functions remains a largely unexplored area and has yet to be eliminated. On the other hand, Polynomial Networks are a class of models that do not require activation functions, but they have yet to perform on par with modern architectures. In this work, we aim to close this gap and propose MONet, which relies solely on multilinear operators. The core layer of MONet, called Mu-Layer, captures multiplicative interactions of the elements of the input token. MONet captures high-degree interactions of the input elements, and we demonstrate the efficacy of our approach on a series of image recognition and scientific computing benchmarks. The proposed model outperforms prior polynomial networks and performs on par with modern architectures. We believe that MONet can inspire further research on models that use entirely multilinear operations.
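
To make "multiplicative interactions of the elements of the input token" concrete, the following is a minimal sketch of an activation-free layer that multiplies two linear projections of the same token elementwise. The abstract does not state the Mu-Layer's formula, so the form C((Ax) * (Bx) + Ax), the function names `matvec` and `mu_layer_sketch`, and the toy matrices are all assumptions made for illustration, not the authors' implementation.

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def mu_layer_sketch(x, A, B, C):
    """Hypothetical degree-2 multiplicative layer with no activation function.

    Computes C((A x) * (B x) + A x), where * is the elementwise product.
    This form is an assumption for illustration only.
    """
    a = matvec(A, x)  # first linear projection of the token
    b = matvec(B, x)  # second linear projection of the token
    # The elementwise product creates pairwise products of input elements;
    # adding `a` back keeps the first-order (linear) term.
    hidden = [ai * bi + ai for ai, bi in zip(a, b)]
    return matvec(C, hidden)


# Toy weights (hypothetical values chosen so the arithmetic is easy to follow).
A = [[1, 0], [0, 1]]
B = [[0, 1], [1, 0]]
C = [[1, 1]]

x = [2, 3]
y = mu_layer_sketch(x, A, B, C)
```

Under these assumptions, each output is a degree-2 polynomial in the input elements: scaling `x` by a factor t scales the product term by t^2 and the linear term by t. Stacking two such layers would give a degree-4 polynomial, which is one way that high-degree interactions can arise without any activation function.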
