DiagrammaticLearning: A Graphical Language for Compositional Training Regimes
Mason Lary, Richard Samuelson, Alexander Wilentz, Alina Zare, Matthew Klawonn, James P. Fairbanks
TL;DR
Learning diagrams provide a graphical, category-theoretic language for composing multi-component training regimes: data, models, and their interactions are represented as structured data rather than code and compiled into a single composite loss. The authors formalize learning graphs and learning diagrams, implement them in the DiagrammaticLearning.jl library, and express NIC, knowledge distillation, and few-shot learning within this framework, including colimit-based composition of diagrams that share a backbone. The approach provides rigorous semantics, a pipeline that finds paths through a diagram and compiles them into a loss, and library support for PyTorch and Flux models, so that training workflows can be built from modular, reusable parts. The work offers a principled foundation for reproducible, compositional ML pipelines and points toward broader tools for model management and AutoML built on the same mathematical framework.
Abstract
Motivated by deep learning regimes with multiple interacting yet distinct model components, we introduce learning diagrams, graphical depictions of training setups that capture parameterized learning as data rather than code. A learning diagram compiles to a unique loss function on which component models are trained. The result of training on this loss is a collection of models whose predictions "agree" with one another. We show that a number of popular learning setups, such as few-shot multi-task learning, knowledge distillation, and multi-modal learning, can be depicted as learning diagrams. We further implement learning diagrams in a library that allows users to build diagrams of PyTorch and Flux.jl models. By implementing some classic machine learning use cases, we demonstrate how learning diagrams allow practitioners to build complicated models as compositions of smaller components, identify relationships between workflows, and manipulate models during or after training. Leveraging a category-theoretic framework, we introduce a rigorous semantics for learning diagrams that puts such operations on a firm mathematical foundation.
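To make the idea of compiling a diagram into a loss concrete, the following is a minimal sketch in plain Python. It is not the API of the library described here: the names `compose`, `compile_loss`, and `make_edges`, the toy one-weight "student", and the choice of squared error as the disagreement measure are all assumptions made for illustration. The sketch treats a diagram as a dictionary of named edges (functions) plus a list of parallel paths, and builds a loss that penalizes disagreement between the outputs of those paths, in the spirit of a knowledge distillation setup.

```python
from itertools import combinations


def compose(path, edges):
    """Return the function that applies the named edges of `path` in order."""
    def run(x):
        for name in path:
            x = edges[name](x)
        return x
    return run


def compile_loss(parallel_paths, edges, distance):
    """Build a loss summing pairwise disagreement between parallel paths.

    Hypothetical helper: every path is assumed to share a source and a target.
    """
    fns = [compose(p, edges) for p in parallel_paths]

    def loss(batch):
        total = 0.0
        for x in batch:
            outs = [f(x) for f in fns]
            total += sum(distance(a, b) for a, b in combinations(outs, 2))
        return total / len(batch)

    return loss


def make_edges(w):
    """Toy diagram edges: a fixed teacher and a student with one weight `w`."""
    return {
        "teacher": lambda x: 2.0 * x,  # frozen model
        "student": lambda x: w * x,    # trainable model
    }


def squared(a, b):
    return (a - b) ** 2


batch = [1.0, 2.0, 3.0]
paths = [["teacher"], ["student"]]  # two parallel paths, input -> prediction


def loss_at(w):
    return compile_loss(paths, make_edges(w), squared)(batch)


# Train the student by gradient descent with a central-difference gradient.
w, lr, eps = 0.0, 0.05, 1e-4
for _ in range(50):
    grad = (loss_at(w + eps) - loss_at(w - eps)) / (2 * eps)
    w -= lr * grad

print(f"student weight after training: {w:.4f}")
```

In this toy setting the loss reaches zero exactly when the student reproduces the teacher, which is one simple reading of trained models whose predictions agree. A real diagram would involve longer paths, several trainable components, and task-appropriate distance functions.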
