Extraction of nonlinearity in neural networks with Koopman operator
Naoki Sugishita, Kayo Kinjo, Jun Ohkubo
TL;DR
The paper addresses whether nonlinear activations are indispensable in neural networks by replacing intermediate nonlinear layers with a Koopman operator learned via EDMD. It demonstrates that a finite-dimensional Koopman matrix, aided by tensor-train representations, can mimic internal layer dynamics and maintain competitive accuracy under substantial compression. Key findings include that a modest number of singular values (≈10) capture the essential behavior and that Gaussian RBF dictionaries can yield effective surrogates, with similar results on MNIST and Fashion MNIST. This work advances a physics-inspired, data-driven framework for neural network compression and interpretability through linear representations of nonlinear dynamics.
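To make the EDMD step concrete, here is a minimal NumPy sketch of fitting a Koopman matrix with a Gaussian RBF dictionary. It is an illustration under stated assumptions, not the authors' implementation: the function names (`rbf_dictionary`, `fit_koopman`, `koopman_predict`), the toy single tanh layer standing in for the nonlinear middle layers, the number of centers, and the width `sigma` are all hypothetical choices. The dictionary is also assumed to contain a constant and the state itself, so that the layer output can be read back from the lifted vector.

```python
import numpy as np

def rbf_dictionary(X, centers, sigma):
    """Lift states X (n_samples, d) to observables [1, x, Gaussian RBFs]."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    rbf = np.exp(-sq_dist / (2.0 * sigma ** 2))
    return np.hstack([np.ones((X.shape[0], 1)), X, rbf])

def fit_koopman(X, Y, centers, sigma):
    """EDMD least squares: find K with psi(Y) ~= psi(X) @ K (row convention)."""
    psi_x = rbf_dictionary(X, centers, sigma)
    psi_y = rbf_dictionary(Y, centers, sigma)
    K, *_ = np.linalg.lstsq(psi_x, psi_y, rcond=None)
    return K

def koopman_predict(X, K, centers, sigma):
    """Apply K in observable space, then read the state from the linear block."""
    d = X.shape[1]
    return (rbf_dictionary(X, centers, sigma) @ K)[:, 1:1 + d]

# Toy stand-in (an assumption) for the nonlinear middle layers: one tanh layer.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(4, 4))

def layer(X):
    return np.tanh(X @ W)

X_train = rng.normal(size=(2000, 4))
# RBF centers drawn from the training inputs (a hypothetical choice).
centers = X_train[rng.choice(len(X_train), size=100, replace=False)]
K = fit_koopman(X_train, layer(X_train), centers, sigma=1.0)

X_test = rng.normal(size=(500, 4))
pred = koopman_predict(X_test, K, centers, sigma=1.0)
err = np.abs(pred - layer(X_test)).mean()
print(f"mean absolute error of the linear surrogate: {err:.4f}")
```

The nonlinearity lives entirely in the fixed dictionary, and the learned part is a single matrix. This is what allows the surrogate to be analyzed and compressed with linear-algebra tools.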
Abstract
Nonlinearity plays a crucial role in deep neural networks. In this paper, we investigate the degree to which the nonlinearity of a neural network is essential. For this purpose, we employ the Koopman operator, extended dynamic mode decomposition (EDMD), and the tensor-train format. The Koopman operator approach has recently been developed in physics and the nonlinear sciences; the Koopman operator describes time evolution in the observable space instead of the state space. Because nonlinearity in the state space can be replaced with linearity in the observable space, the approach is a promising candidate for understanding complex behavior in nonlinear systems. Here, we analyze trained neural networks for classification problems. In numerical experiments, replacing the nonlinear middle layers with the Koopman matrix yields sufficient accuracy. In addition, we confirm that the pruned Koopman matrix retains sufficient accuracy even at high compression ratios. These results indicate that the Koopman operator approach may make it possible to extract some features of neural networks.
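The pruning idea can be illustrated with a plain truncated SVD. This is a deliberately simplified stand-in: the paper works with the tensor-train format, whereas the sketch below keeps only the largest singular values of a single dense matrix. The matrix `K`, its size, the rank of 10, and the helper names `truncate_svd` and `compression_ratio` are hypothetical and chosen only for illustration.

```python
import numpy as np

def truncate_svd(K, rank):
    """Keep the `rank` largest singular values of K and return the factors."""
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    return U[:, :rank], s[:rank], Vt[:rank, :]

def compression_ratio(shape, rank):
    """Entries of the dense matrix divided by entries stored in the factors."""
    m, n = shape
    return (m * n) / (rank * (m + n + 1))

# Hypothetical stand-in for a learned Koopman matrix: low rank plus small noise.
rng = np.random.default_rng(0)
K = rng.normal(size=(200, 10)) @ rng.normal(size=(10, 200))
K += 1e-3 * rng.normal(size=K.shape)

U, s, Vt = truncate_svd(K, rank=10)
K_pruned = (U * s) @ Vt  # rebuild the rank-10 approximation
rel_err = np.linalg.norm(K - K_pruned) / np.linalg.norm(K)
ratio = compression_ratio(K.shape, 10)
print(f"relative error: {rel_err:.2e}, compression: {ratio:.1f}x")
```

By the Eckart-Young theorem, the Frobenius error of this truncation equals the root sum of squares of the discarded singular values. Inspecting the singular value spectrum therefore shows directly how much is lost at a given compression ratio.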
