Mean-Field Limits for Two-Layer Neural Networks Trained with Consensus-Based Optimization

William De Deyn; Michael Herty; Giovanni Samaey

TL;DR

This work investigates Consensus-Based Optimization (CBO) as a gradient-free method for training two-layer neural networks and benchmarks it against Adam. It introduces a hybrid Adam–CBO variant that converges faster than plain CBO and a multi-task CBO formulation that reduces memory overhead. On the theoretical side, it reformulates CBO in optimal-transport terms and analyzes both the infinite-width limit of the network and the infinite-particle limit of the optimizer, proving variance decay and consensus. Experiments on sine approximation, MNIST, and multi-task settings show competitive performance, robustness, and scalability for the proposed methods. Together, the optimal-transport-based mean-field analysis and the practical CBO variants offer a principled route to scalable, global-optimization-oriented training of wide neural networks.

Abstract

We study Consensus-Based Optimization (CBO) for two-layer neural network training. We compare the performance of CBO against Adam on two test cases and demonstrate how a hybrid approach, combining CBO with Adam, provides faster convergence than CBO. Additionally, in the context of multi-task learning, we recast CBO into a formulation that offers less memory overhead. The CBO method allows for a mean-field limit formulation, which we couple with the mean-field limit of the neural network. To this end, we first reformulate CBO within the optimal transport framework. In the limit of infinitely many particles, we define the corresponding dynamics on the Wasserstein-over-Wasserstein space and show that the variance decreases monotonically.
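For readers unfamiliar with CBO, the following minimal sketch shows one common form of the update: each particle drifts toward a loss-weighted average of the ensemble (the consensus point) and is perturbed by noise scaled by its distance to that point. This is the textbook scheme with anisotropic (componentwise) noise, not necessarily the exact variant used by the authors. The function names and the hyperparameter values (`alpha`, `lam`, `sigma`, `dt`) are illustrative assumptions.

```python
import numpy as np


def consensus_point(particles, losses, alpha):
    """Loss-weighted average of the particles (Gibbs weights exp(-alpha * loss))."""
    # Shifting by the minimum loss avoids underflow; the shift cancels
    # when the weights are normalized.
    weights = np.exp(-alpha * (losses - losses.min()))
    return weights @ particles / weights.sum()


def cbo_step(particles, loss_fn, rng, alpha=50.0, lam=1.0, sigma=0.5, dt=0.05):
    """One Euler-Maruyama step of CBO with anisotropic noise (assumed variant)."""
    losses = np.array([loss_fn(p) for p in particles])
    x_bar = consensus_point(particles, losses, alpha)
    diff = particles - x_bar  # offset of each particle from the consensus point
    noise = rng.standard_normal(particles.shape)
    # Drift pulls particles toward x_bar; noise acts componentwise and
    # vanishes as a particle reaches the consensus point.
    return particles - lam * dt * diff + sigma * np.sqrt(dt) * diff * noise


def run_cbo(loss_fn, dim, n_particles=100, n_steps=300, seed=0):
    """Run CBO from a uniform initial ensemble; returns the final consensus point."""
    rng = np.random.default_rng(seed)
    particles = rng.uniform(-3.0, 3.0, size=(n_particles, dim))
    for _ in range(n_steps):
        particles = cbo_step(particles, loss_fn, rng)
    losses = np.array([loss_fn(p) for p in particles])
    return consensus_point(particles, losses, alpha=50.0), particles
```

In a network-training setting, each particle would hold a flattened parameter vector of the two-layer network and `loss_fn` would evaluate the training loss for those parameters; no gradients of the loss are required. With `sigma=0` the step contracts the ensemble toward the consensus point by the factor `1 - lam * dt`, which is the simplest instance of the variance decay discussed in the abstract.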
