Several Supporting Evidences for the Adaptive Feature Program
Yicheng Li, Qian Lin
TL;DR
The paper addresses the theoretical understanding of neural network generalization by studying the recently proposed Adaptive Feature Program (AFP), which jointly learns feature representations and linear readouts through gradient flow. It introduces the Feature Error Measure (FEM) to quantify how well the learned features align with the target function, and models training via over-parametrized sequence models, a simplification motivated by Le Cam equivalence. The authors analyze diagonal fixed-basis models and directional single-index and multi-index models in detail, showing that the FEM decreases during training and often attains near-optimal nonparametric rates, with distinct training phases and a dependence on information indices such as $\rz$. They also establish path-equivalence results linking the sequence-model dynamics to the empirical-loss dynamics, supported by numerical studies. Overall, the work connects classical statistical theory with modern feature-learning dynamics and offers insight into how adaptive representations can improve generalization in high-dimensional regimes.
Abstract
Theoretically exploring the advantages of neural networks might be one of the most challenging problems in the AI era. An adaptive feature program has recently been proposed to analyze the feature-learning property of neural networks in a more abstract way. Motivated by the celebrated Le Cam equivalence, we advocate over-parametrized sequence models to further simplify the analysis of the training dynamics of the adaptive feature program, and we present several pieces of supporting evidence for the program. More precisely, after introducing the feature error measure (FEM) to characterize the quality of the learned features, we show that the FEM decreases during the training of several concrete adaptive feature models, including linear regression and single- and multiple-index models. We believe that this hints at the potential success of the adaptive feature program.
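To give a concrete feel for the kind of training dynamics the abstract refers to, the sketch below runs Euler-discretized gradient flow on a toy diagonal over-parametrized sequence model, in which a feature weight `a_j` and a readout coefficient `b_j` are learned jointly for each coordinate. This is an illustration under stated assumptions, not the authors' construction: the parametrization `theta_j = a_j * b_j`, the initialization, the eigenvalue decay `lam`, the sparse signal, the step size, and all names are hypothetical choices. It does not compute the FEM, whose definition is not given in this section.

```python
import numpy as np

def train_diagonal_afp(z, lam, lr=0.01, steps=2000):
    """Euler-discretized gradient flow on L(a, b) = 0.5 * sum_j (z_j - a_j * b_j)^2.

    a: per-coordinate feature weights (hypothetical adaptive features).
    b: per-coordinate linear readout.
    """
    a = np.sqrt(lam).astype(float)  # assumption: start from fixed-basis weights sqrt(lam_j)
    b = np.zeros_like(a)            # assumption: readout starts at zero
    losses = []
    for _ in range(steps):
        r = z - a * b               # residual in each coordinate
        losses.append(0.5 * np.sum(r ** 2))
        # Simultaneous gradient step on both the features and the readout.
        a, b = a + lr * b * r, b + lr * a * r
    return a, b, np.array(losses)

rng = np.random.default_rng(0)
d, n = 200, 1000
lam = np.arange(1, d + 1) ** -2.0   # hypothetical polynomial eigenvalue decay
theta_star = np.zeros(d)
theta_star[[2, 49]] = 1.0           # sparse signal, partly misaligned with the decay of lam
z = theta_star + rng.standard_normal(d) / np.sqrt(n)  # sequence-model observations

a, b, losses = train_diagonal_afp(z, lam)
print("initial loss:", losses[0], "final loss:", losses[-1])
print("feature weight at index 49: start", np.sqrt(lam[49]), "end", a[49])
```

In this toy run the feature weight on a signal coordinate that the fixed basis down-weights (index 49) grows during training, which is the qualitative behavior an adaptive feature model is meant to exhibit. With noisy observations, a long enough run also starts fitting noise, so in practice the stopping time would matter.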
