Deep Learning with Parametric Lenses
Geoffrey S. H. Cruttwell, Bruno Gavranovic, Neil Ghani, Paul Wilson, Fabio Zanasi
TL;DR
This work introduces parametric lenses as a categorical foundation for gradient-based learning, unifying diverse optimisers, loss maps, and architectures in a single framework. It combines three constructions: Para(C) for parametric maps, Lens(C) for bidirectional data flow, and Cartesian reverse derivative categories (CRDCs) for reverse differentiation. With these, the authors model neural network layers, learning rates, and optimisers as composable lenses over both real-valued and discrete domains, including Boolean circuits. The approach yields a uniform description of supervised and unsupervised learning (e.g., Wasserstein GANs) and of learning inputs rather than parameters (deep dreaming). An accompanying Python library shows that gradients can be computed in practice through lens composition. The framework aims to support modular design, reasoning, and extension of learning systems; future work targets richer architectures, higher-order differentiation, and non-gradient-based settings.
Abstract
We propose a categorical semantics for machine learning algorithms in terms of lenses, parametric maps, and reverse derivative categories. This foundation provides a powerful explanatory and unifying framework: it encompasses a variety of gradient descent algorithms such as ADAM, AdaGrad, and Nesterov momentum, as well as a variety of loss functions such as MSE and Softmax cross-entropy, and different architectures, shedding new light on their similarities and differences. Furthermore, our approach to learning has examples generalising beyond the familiar continuous domains (modelled in categories of smooth maps) and can be realised in the discrete setting of Boolean and polynomial circuits. We demonstrate the practical significance of our framework with an implementation in Python.
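To make the idea of learning by lens composition concrete, the following is a minimal sketch and not the authors' library. It assumes a lens is a pair of a forward map and a backward map, that a parametric map takes a (parameter, input) pair, and that the backward map is the reverse derivative. The names `Lens`, `scale`, `squared_error`, `gd_step`, and `train` are hypothetical, chosen only for illustration, and the learning rate is applied directly in `gd_step` instead of being modelled as its own lens.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Lens:
    """A lens: a forward map, plus a backward map that sends
    (input, change in output) to a change in input."""
    fwd: Callable[[Any], Any]
    rev: Callable[[Any, Any], Any]

    def __rshift__(self, other: "Lens") -> "Lens":
        """Sequential composition: run self, then other.
        Changes are pulled back in the opposite order."""
        def fwd(a):
            return other.fwd(self.fwd(a))

        def rev(a, dc):
            b = self.fwd(a)  # recompute the intermediate value
            return self.rev(a, other.rev(b, dc))

        return Lens(fwd, rev)


# A parametric map: the input is a pair (p, x) of parameter and data.
# Forward computes p * x; backward is its reverse derivative,
# returning the pair (change in p, change in x).
scale = Lens(
    fwd=lambda px: px[0] * px[1],
    rev=lambda px, dy: (dy * px[1], dy * px[0]),
)


def squared_error(target: float) -> Lens:
    """Loss map for a fixed target: y -> 0.5 * (y - target)^2."""
    return Lens(
        fwd=lambda y: 0.5 * (y - target) ** 2,
        rev=lambda y, dl: dl * (y - target),
    )


def gd_step(p: float, x: float, target: float, lr: float = 0.1) -> float:
    """One plain gradient descent update of the parameter p."""
    model = scale >> squared_error(target)
    dp, _dx = model.rev((p, x), 1.0)  # seed the backward pass with 1.0
    return p - lr * dp


def train(p: float, x: float, target: float,
          steps: int = 100, lr: float = 0.1) -> float:
    """Repeat gd_step on a single (x, target) example."""
    for _ in range(steps):
        p = gd_step(p, x, target, lr)
    return p
```

Calling `train(0.0, 2.0, 6.0)` drives the parameter towards 3, the value at which `p * 2.0` matches the target. Swapping `squared_error` for another loss, or changing the update rule in `gd_step`, leaves the composition untouched, which is the kind of modularity the framework is meant to capture.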
