DiagrammaticLearning: A Graphical Language for Compositional Training Regimes

Mason Lary; Richard Samuelson; Alexander Wilentz; Alina Zare; Matthew Klawonn; James P. Fairbanks

TL;DR

Learning diagrams are a graphical, category-theoretic language for composing multi-component training regimes. They describe datasets, models, and the interactions between them as structured data rather than code, and compile that description into a single composite loss. The authors formalize learning graphs and learning diagrams, implement them in DiagrammaticLearning.jl, and express image captioning (NIC), knowledge distillation, and few-shot learning in this framework, including colimit-based composition of models that share a backbone. The result is a rigorous semantics, a path-finding and loss-compilation pipeline, and a library that integrates with PyTorch and Flux.jl models, so that training workflows can be assembled from modular, reusable components. The work is positioned as a principled foundation for reproducible, compositional ML pipelines and points toward tools for model management and AutoML.

Abstract

Motivated by deep learning regimes with multiple interacting yet distinct model components, we introduce learning diagrams, graphical depictions of training setups that capture parameterized learning as data rather than code. A learning diagram compiles to a unique loss function on which component models are trained. The result of training on this loss is a collection of models whose predictions "agree" with one another. We show that a number of popular learning setups such as few-shot multi-task learning, knowledge distillation, and multi-modal learning can be depicted as learning diagrams. We further implement learning diagrams in a library that allows users to build diagrams of PyTorch and Flux.jl models. By implementing some classic machine learning use cases, we demonstrate how learning diagrams allow practitioners to build complicated models as compositions of smaller components, identify relationships between workflows, and manipulate models during or after training. Leveraging a category theoretic framework, we introduce a rigorous semantics for learning diagrams that puts such operations on a firm mathematical foundation.
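The idea that a diagram compiles to a loss whose minimum makes parallel paths "agree" can be illustrated with a small sketch. This is not the DiagrammaticLearning.jl API: the graph encoding, the names `paths`, `run`, and `compile_loss`, the squared-error distance, and the toy data are all assumptions made for illustration, and the graph is assumed to be acyclic.

```python
from itertools import combinations

# Hypothetical toy data: inputs xs with labels ys = 2 * x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
params = {"w": 1.0}  # single trainable weight of the toy model

# Assumed encoding: edge name -> (source node, target node, function).
edges = {
    "inputs": ("D", "X", lambda i: xs[i]),
    "labels": ("D", "Y", lambda i: ys[i]),
    "model": ("X", "Y", lambda x: params["w"] * x),
}

def paths(edges, src, dst, prefix=()):
    """Yield every non-empty edge path from src to dst (graph assumed acyclic)."""
    if src == dst and prefix:
        yield prefix
        return
    for name, (s, t, _) in edges.items():
        if s == src:
            yield from paths(edges, t, dst, prefix + (name,))

def run(edges, path, value):
    """Apply the functions along a path, in order, to a starting value."""
    for name in path:
        value = edges[name][2](value)
    return value

def compile_loss(edges, src, dst):
    """Sum squared disagreement over every pair of parallel paths src -> dst."""
    pairs = list(combinations(paths(edges, src, dst), 2))

    def loss(samples):
        return sum(
            (run(edges, p, s) - run(edges, q, s)) ** 2
            for p, q in pairs
            for s in samples
        )

    return loss, pairs

loss, pairs = compile_loss(edges, "D", "Y")
print(pairs)
print(loss(range(len(xs))))
```

Here the only parallel pair is the path through the model and the path that reads the label directly, so the compiled loss reduces to ordinary supervised regression; adding a second model edge from `X` to `Y` would introduce further pairs and therefore further agreement terms.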

Paper Structure

This paper contains 18 sections, 1 theorem, 10 equations, 2 figures, and 3 tables.

Introduction
Learning Diagram Demonstrations
Learning Diagram Notation
Image Captioning
Knowledge Distillation
Few Shot Learning
Implementation
Specifying the diagram data
Finding Parallel Paths
Integration with Existing Libraries
Theoretical Grounding
Learning Graphs and Learning Diagrams
Finding Paths and Building Losses
Limiting the Search
Ordering paths
...and 3 more sections

Key Result

Theorem 4.3

The mapping $P \mapsto \ell(f, g)$ defines a functor $\ell: \mathsf{C} \to \mathsf{Cost}$.

Figures (2)

Figure 1: A diagrammatic representation of Knowledge Distillation (left). The red and blue parallel paths encode equations that the optimization procedure attempts to solve (right). Component models used in both pairs of parallel paths are shown in violet.
Figure 2: Learning diagrams can be built using the categorical construction of a colimit, which generalizes the concept of a set union or quotienting by an equivalence relation. (Top) A data set and classifier/encoder pair are composed by identifying the common image and label domains $X$ and $Y$ (dashed arrows). The result of the colimit computation is a square whose induced loss trains the classifier and encoder on the labeling task of the associated data set. (Bottom) Two classifier squares are merged along a common feature extraction backbone $m$. The resulting colimit is a learning diagram whose loss trains classifier heads on their respective tasks and the common backbone model on all tasks. For this colimit to be well specified, both the image space $X$ and the feature space $F$ have to be identical. Additional classification heads can be attached by further colimits.
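
The gluing described in the Figure 2 caption can be sketched in a simplified form. The code below is an assumption-laden illustration, not the library's colimit routine: it models a diagram as a dictionary from edge names to (source, target) pairs, identifies nodes and shared edges by name, and uses a hypothetical helper `glue`. It covers only the special case where two diagrams overlap on a common sub-diagram, which is the "set union" end of what a colimit generalizes.

```python
def glue(left, right, shared):
    """Merge two edge dictionaries, identifying the edges named in `shared`.

    Each diagram maps edge name -> (source node, target node). Nodes are
    identified by name, a simplification of a general colimit.
    """
    for name in shared:
        # The shared edge must have identical endpoints in both diagrams,
        # mirroring the requirement that X and F coincide.
        if left[name] != right[name]:
            raise ValueError(f"shared edge {name!r} has mismatched endpoints")
    merged = dict(left)
    for name, ends in right.items():
        if name in merged and name not in shared:
            raise ValueError(f"edge {name!r} collides but is not shared")
        merged[name] = ends
    return merged

# Two hypothetical classifier squares that share the backbone m: X -> F.
task_a = {
    "m": ("X", "F"),
    "head_a": ("F", "Y_a"),
    "images_a": ("D_a", "X"),
    "labels_a": ("D_a", "Y_a"),
}
task_b = {
    "m": ("X", "F"),
    "head_b": ("F", "Y_b"),
    "images_b": ("D_b", "X"),
    "labels_b": ("D_b", "Y_b"),
}

merged = glue(task_a, task_b, shared={"m"})
print(sorted(merged))
```

In the merged diagram the backbone `m` lies on a path to both `Y_a` and `Y_b`, which matches the caption's statement that the common backbone is trained on all tasks while each head is trained on its own task. Attaching a further head would be another call to `glue` with the same shared edge.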

Theorems & Definitions (4)

Definition 4.1
Definition 4.2
Theorem 4.3
proof
