Mathematics of Neural Networks (Lecture Notes Graduate Course)
Bart M. N. Smets
TL;DR
These notes address the mathematical foundations of neural networks from a graduate-math perspective, formalizing supervised learning via $\hat{R}(w)=\frac{1}{N}\sum_{i=1}^N \ell(F(x_i;w),y_i)$ and exploring how regularization, initialization, and architectures influence generalization. They build from basic feed-forward models to deep networks, discuss training with SGD and momentum, and detail CNNs, backpropagation, and adaptive optimizers such as Adagrad, RMSProp, and Adam. A central contribution is the development of an equivariant framework based on Lie groups and homogeneous spaces to design rotation-translation equivariant CNNs via lifting, group convolutions, and projections, with explicit treatment of Haar measures and invariant integrals. The material connects classical geometry with modern architectures, offering a rigorous path to geometry-aware neural networks applicable in vision, physics-informed modeling, and beyond.
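As a minimal sketch of the empirical risk $\hat{R}(w)=\frac{1}{N}\sum_{i=1}^N \ell(F(x_i;w),y_i)$ above, the following plain-Python snippet evaluates it for a toy problem. The affine model, the squared loss, and all function and variable names are illustrative assumptions and are not taken from the notes or the accompanying notebooks.

```python
from typing import Callable, Sequence, Tuple


def empirical_risk(
    model: Callable[[float, Tuple[float, float]], float],
    loss: Callable[[float, float], float],
    w: Tuple[float, float],
    xs: Sequence[float],
    ys: Sequence[float],
) -> float:
    """Average loss of model(x; w) over the N samples (x_i, y_i)."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    return sum(loss(model(x, w), y) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical model F(x; w) = w[0] * x + w[1] (an assumption for illustration).
def affine_model(x: float, w: Tuple[float, float]) -> float:
    return w[0] * x + w[1]


# Hypothetical loss: squared error between prediction and target.
def squared_loss(prediction: float, target: float) -> float:
    return (prediction - target) ** 2


# Toy data generated by y = 2x + 1.
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]

risk_exact = empirical_risk(affine_model, squared_loss, (2.0, 1.0), xs, ys)
risk_off = empirical_risk(affine_model, squared_loss, (2.0, 0.0), xs, ys)
```

With the weights that generated the data the risk vanishes, while dropping the bias makes every prediction miss by one, so the average squared error rises to one. Training, in this picture, is the search for weights $w$ that make this quantity small.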
Abstract
These are the lecture notes that accompanied the course of the same name that I taught at the Eindhoven University of Technology from 2021 to 2023. The course is intended as an introduction to neural networks for mathematics students at the graduate level and aims to interest them in further research on neural networks. It consists of two parts. The first part is a general introduction to deep learning that presents the field in a formal mathematical way. The second part provides an introduction to the theory of Lie groups and homogeneous spaces and shows how it can be applied to design neural networks with desirable geometric equivariances. The lecture notes were made as self-contained as possible so as to be accessible to any student with a moderate mathematics background. The course also included coding tutorials and assignments in the form of a set of Jupyter notebooks that are publicly available at https://gitlab.com/bsmetsjr/mathematics_of_neural_networks.
