Local Linear Recovery Guarantee of Deep Neural Networks at Overparameterization
Yaoyu Zhang, Leyang Zhang, Zhongwang Zhang, Zhiwei Bai
TL;DR
The paper addresses data-efficient recovery of target functions by overparameterized deep neural networks (DNNs). It introduces local linear recovery (LLR) and the notion of optimistic sample size to quantify best-case data requirements, connecting recoverability to the rank of the model's tangent space. Using Embedding Principles and critical mappings, it derives upper bounds on the optimistic sample sizes of general DNNs and shows these bounds are tight for two-layer tanh networks and related CNN architectures, demonstrating that recovery can occur with far fewer samples than model parameters. The work clarifies how architecture and width influence data efficiency, lays groundwork for stronger recovery guarantees, and suggests directions for extending these results to deeper networks.
Abstract
Determining whether deep neural network (DNN) models can reliably recover target functions at overparameterization is a critical yet complex issue in the theory of deep learning. To advance understanding in this area, we introduce a concept we term "local linear recovery" (LLR), a weaker form of target-function recovery that renders the problem more amenable to theoretical analysis. In the sense of LLR, we prove that functions expressible by narrower DNNs are guaranteed to be recoverable from fewer samples than the number of model parameters. Specifically, for functions in the function space of a given DNN, we establish upper bounds on their optimistic sample sizes, defined as the smallest sample size that guarantees LLR. Furthermore, we prove that these upper bounds are attained in the case of two-layer tanh neural networks. Our work lays the groundwork for future investigations into the recovery capabilities of DNNs in overparameterized settings.
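To make these notions concrete, the following is a sketch in our own notation; the symbols $f_\theta$, $R(\theta)$, $O(f^*)$, and $M$ are chosen for illustration and may differ from the paper's formalism. Let $f_\theta$ denote a DNN with parameters $\theta \in \mathbb{R}^M$, and let $f^*$ be a target function expressible by the network, i.e., $f^* = f_{\theta^*}$ for some $\theta^*$. The tangent-space rank at $\theta$ is

\[
R(\theta) \;=\; \dim \operatorname{span}\{\, \nabla_\theta f_\theta(x) \;:\; x \in \mathcal{X} \,\},
\]

the dimension spanned by the parameter gradients of the model output over the input domain $\mathcal{X}$. Writing $O(f^*)$ for the optimistic sample size, i.e., the smallest sample size that guarantees LLR of $f^*$, the results summarized above suggest a bound of the form

\[
O(f^*) \;\le\; \min_{\theta \,:\, f_\theta = f^*} R(\theta) \;\le\; M,
\]

with the first inequality attained for two-layer tanh networks. Intuitively, a function expressible by a narrower sub-network can be realized at parameters whose tangent-space rank is much smaller than $M$, which is how recovery with far fewer samples than parameters becomes possible.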
