Evidence, Definitions and Algorithms regarding the Existence of Cohesive-Convergence Groups in Neural Network Optimization
Thien An L. Nguyen
TL;DR
This work addresses the challenge of understanding neural-network convergence in non-convex settings by introducing cohesive-convergence groups and generative groups as formal constructs. It defines precise conditions under which groups of data form cohesive convergence and presents two algorithms to quantify convergence-cohesion degrees and test-side unconditional cohesive-degrees. Through CIFAR-10 experiments with a ResNet18 trained by SGD, the author demonstrates the existence of cohesive-convergence structure and reveals its relation to label information and the bias-variance trade-off. The findings offer a new analytical lens for optimization dynamics and point to future research on how minimal cohesive-convergence groups could span an entire dataset.
Abstract
Understanding the convergence process of neural networks is one of the most complex and crucial problems in machine learning. Although many notable successes in the field are closely associated with the convergence of artificial neural networks, the concept remains predominantly theoretical. In practice, because the optimization problems that artificial neural networks tackle are non-convex, very few trained networks actually achieve convergence. To extend recent research on artificial-neural-network convergence, this paper discusses a different approach based on observations of cohesive-convergence groups that emerge during the optimization of an artificial neural network.
