Approximation Power of Deep Neural Networks: an explanatory mathematical survey
Owen Davis, Mohammad Motamed
TL;DR
This paper surveys the approximation capabilities of deep neural networks, focusing on the expressive power of feed-forward and residual architectures and on the formulation of regression and classification tasks as optimization problems. It combines classical density results (e.g., Weierstrass and Pinkus) with modern depth-based theories, showing that deep ReLU networks and deep Fourier networks achieve favorable error–complexity trade-offs, including exponential convergence for certain self-similar targets. The survey also gives concrete error estimates for Fourier and ReLU networks, relates network width and depth to approximation quality, and illustrates the theory with structured analytical and numerical examples. Overall, it lays out a rigorous mathematical basis for understanding when and why deep networks can outperform traditional approximation methods on bounded, potentially irregular targets, and it outlines open questions such as the effect of dimensionality and spectral bias.
Abstract
This survey provides an in-depth and explanatory review of the approximation properties of deep neural networks, with a focus on feed-forward and residual architectures. The primary objective is to examine how effectively neural networks approximate target functions and to identify conditions under which they outperform traditional approximation methods. Key topics include the nonlinear, compositional structure of deep networks and the formalization of neural network tasks as optimization problems in regression and classification settings. The survey also addresses the training process, emphasizing the role of stochastic gradient descent and backpropagation in solving these optimization problems, and highlights practical considerations such as activation functions, overfitting, and regularization techniques. Additionally, the survey explores the density of neural networks in the space of continuous functions, comparing the approximation capabilities of deep ReLU networks with those of other approximation methods. It discusses recent theoretical advances in understanding the expressiveness and limitations of these networks. A detailed error-complexity analysis is also presented, focusing on error rates and computational complexity for neural networks with ReLU and Fourier-type activation functions in the context of bounded target functions with minimal regularity assumptions. Alongside known recent results, the survey introduces new findings, offering a valuable resource for understanding the theoretical foundations of neural network approximation. Concluding remarks and further reading suggestions are provided.
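As a small illustration of the kind of depth-driven exponential convergence mentioned above, the sketch below approximates x^2 on [0, 1] by composing a ReLU "hat" function with itself. This is not reproduced from the survey; it is the well-known sawtooth-composition construction commonly attributed to Yarotsky, and the names relu, hat, square_approx, and max_error are illustrative choices. Under these assumptions, the approximation with m compositions coincides with the piecewise-linear interpolant of x^2 on 2^m equal intervals, so its uniform error is 2^(-2m-2).

```python
def relu(x):
    return max(0.0, x)


def hat(x):
    # Tent map built from three ReLU units: rises from 0 to 1 on [0, 0.5],
    # falls back to 0 on [0.5, 1], and is 0 outside [0, 1].
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5) + 2.0 * relu(x - 1.0)


def square_approx(x, depth):
    # f_m(x) = x - sum_{s=1}^{m} g_s(x) / 4^s, where g_s is the s-fold
    # composition of the hat function (one extra "layer" per term).
    out = x
    g = x
    for s in range(1, depth + 1):
        g = hat(g)
        out -= g / 4.0 ** s
    return out


def max_error(depth, n_grid=4097):
    # Uniform error against x^2 on a dyadic grid that contains the
    # interval midpoints, where the interpolation error peaks.
    xs = [i / (n_grid - 1) for i in range(n_grid)]
    return max(abs(square_approx(x, depth) - x * x) for x in xs)


# Errors for depths 1..6; each extra composition should divide the error by 4.
errors = [max_error(m) for m in range(1, 7)]
for m, err in enumerate(errors, start=1):
    print(f"depth={m}  max error={err:.3e}  bound 2^(-2m-2)={2.0 ** (-2 * m - 2):.3e}")
```

Each additional composition divides the uniform error by four, while the number of ReLU units grows only linearly with the depth, which is the error-complexity trade-off that a fixed-depth piecewise-linear scheme cannot match.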
