Gradient Flow Matching for Learning Update Dynamics in Neural Network Training

Xiao Shou; Yanna Ding; Jianxi Gao

TL;DR

This paper introduces Gradient Flow Matching (GFM), a continuous-time framework that models neural network training as an optimizer-aware dynamical system using vector fields learned via conditional flow matching. By explicitly incorporating gradient-based update dynamics, GFM can forecast final converged weights from partial training sequences and extend to momentum and adaptive optimizers. Empirical results show GFM outperforms standard sequence models (e.g., LSTM) and approaches Transformer performance across synthetic and real-world settings (including CIFAR-10), while generalizing across architectures. This approach offers a principled, scalable method for predicting optimization trajectories, with potential to accelerate convergence prediction and optimization research by bridging continuous-time modeling with practical forecasting tasks.

Abstract

Training deep neural networks remains computationally intensive due to the iterative nature of gradient-based optimization. We propose Gradient Flow Matching (GFM), a continuous-time modeling framework that treats neural network training as a dynamical system governed by learned optimizer-aware vector fields. By leveraging conditional flow matching, GFM captures the underlying update rules of optimizers such as SGD, Adam, and RMSprop, enabling smooth extrapolation of weight trajectories toward convergence. Unlike black-box sequence models, GFM incorporates structural knowledge of gradient-based updates into the learning objective, facilitating accurate forecasting of final weights from partial training sequences. Empirically, GFM achieves forecasting accuracy that is competitive with Transformer-based models and significantly outperforms LSTM and other classical baselines. Furthermore, GFM generalizes across neural architectures and initializations, providing a unified framework for studying optimization dynamics and accelerating convergence prediction.
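The idea of treating training as a dynamical system and extrapolating a partial weight trajectory can be illustrated on a toy problem. The sketch below is not the GFM model described in the abstract: it assumes a separable quadratic loss, for which plain gradient descent is the Euler discretization of the gradient flow dw/dt = -∇L(w) and the update field is affine in the weights. It fits that affine field to a few observed updates by least squares and reads off the fixed point as the forecast. All names (`gd_trajectory`, `fit_affine_field`, `forecast_final_weights`) and all numeric values are hypothetical, chosen for illustration only.

```python
# Toy illustration only: NOT the GFM model from the paper.
# Assumption: separable quadratic loss
#   L(w) = 0.5 * sum_i h[i] * (w[i] - w_star[i])**2,
# so each gradient-descent update is an affine function of the current weight.

def gd_trajectory(w0, h, w_star, lr, steps):
    """Run plain gradient descent and return the list of iterates."""
    traj = [list(w0)]
    for _ in range(steps):
        w = traj[-1]
        grad = [h_i * (w_i - s_i) for h_i, w_i, s_i in zip(h, w, w_star)]
        traj.append([w_i - lr * g_i for w_i, g_i in zip(w, grad)])
    return traj

def fit_affine_field(xs, ys):
    """Least-squares fit of y = a * x + b for scalar data."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = my - a * mx
    return a, b

def forecast_final_weights(partial_traj):
    """Fit v(w) = a * w + b per coordinate to the observed updates
    and return the fixed point where v(w) = 0."""
    dim = len(partial_traj[0])
    forecast = []
    for i in range(dim):
        xs = [w[i] for w in partial_traj[:-1]]
        ys = [nxt[i] - cur[i]
              for cur, nxt in zip(partial_traj[:-1], partial_traj[1:])]
        a, b = fit_affine_field(xs, ys)
        forecast.append(-b / a)
    return forecast

# Hypothetical problem: two weights with different curvatures.
h = [1.0, 4.0]
w_star = [2.0, -1.0]

# Observe only the first five updates, well before convergence.
partial = gd_trajectory(w0=[0.0, 0.0], h=h, w_star=w_star, lr=0.1, steps=5)
forecast = forecast_final_weights(partial)

print("last observed weights:", partial[-1])
print("forecast final weights:", forecast)
```

In this toy setting the forecast recovers the minimizer up to floating-point error because the true update field really is affine. For real networks the field is nonlinear and optimizer-dependent, which is the gap a learned vector field such as the one proposed here is meant to fill.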
