Unraveling the Black Box of Neural Networks: A Dynamic Extremum Mapper
Shengjian Chen
TL;DR
By reframing neural networks as systems that generalize by dynamically mapping a dataset to the extrema of the model function, the paper provides a mathematical lens on generalization. It introduces the extremum-increment (EI) algorithm, which, in contrast to backpropagation, derives parameters by solving homogeneous linear systems and updates layers iteratively. The analysis connects extrema dynamics to gradient vanishing/explosion, overfitting, and robustness via weakened termination and surface-neighborhood concepts, offering practical relaxations that can complement gradient-based methods. Overall, the work lays a foundational framework that could inform alternative training paradigms and deepen theoretical understanding of neural network generalization.
Abstract
We point out that neural networks are not black boxes: their generalization stems from the ability to dynamically map a dataset to the extrema of the model function. We further prove that the number of extrema in a neural network is positively correlated with the number of its parameters. We then propose a new algorithm that differs significantly from the back-propagation algorithm and obtains the parameter values mainly by solving a system of linear equations. Some difficult situations, such as gradient vanishing and overfitting, can be simply explained and dealt with in this framework.
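As a rough illustration of what obtaining parameters by solving a linear system can look like, the sketch below finds a nontrivial solution w of a homogeneous system A w = 0 by exact row reduction. It is not the algorithm proposed here: the abstract does not state which system is solved, so the matrix `A`, the function name `null_space_vector`, and the choice of Gaussian elimination are all assumptions made for illustration only.

```python
from fractions import Fraction


def null_space_vector(rows):
    """Return one nontrivial w with A w = 0, or None if only w = 0 exists.

    Hypothetical helper for illustration; uses exact rational arithmetic.
    """
    a = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(a[0])
    pivot_cols = []
    r = 0  # index of the next pivot row
    for c in range(n_cols):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if p is None:
            continue  # no pivot here, so column c is a free column
        a[r], a[p] = a[p], a[r]
        pivot = a[r][c]
        a[r] = [x / pivot for x in a[r]]
        # Eliminate column c from every other row (reduced row echelon form).
        for i in range(len(a)):
            if i != r and a[i][c] != 0:
                factor = a[i][c]
                a[i] = [x - factor * y for x, y in zip(a[i], a[r])]
        pivot_cols.append(c)
        r += 1
        if r == len(a):
            break  # every row has a pivot; remaining columns are free
    free_cols = [c for c in range(n_cols) if c not in pivot_cols]
    if not free_cols:
        return None  # full column rank: the only solution is w = 0
    # Set one free variable to 1, the others to 0, and solve for the pivots.
    free = free_cols[0]
    w = [Fraction(0)] * n_cols
    w[free] = Fraction(1)
    for row_idx, c in enumerate(pivot_cols):
        w[c] = -a[row_idx][free]
    return w


# Assumed toy system: two equations, three unknown parameters.
A = [[1, 2, -1],
     [2, 1, 1]]
w = null_space_vector(A)
residuals = [sum(x * y for x, y in zip(row, w)) for row in A]
```

A homogeneous system with more unknowns than independent equations always has such a nontrivial solution, and any scalar multiple of `w` is also a solution, so a concrete method still has to fix a scale or choose among the free directions.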
