Neural Circuit Diagrams: Robust Diagrams for the Communication, Implementation, and Analysis of Deep Learning Architectures

Vincent Abbott

TL;DR

The paper addresses the lack of a standard, precise diagrammatic tool for deep learning architectures, which hampers communication, replication, and theoretical analysis. It proposes Neural Circuit Diagrams, a category-theory-inspired graphical language that blends Cartesian data layout with tensor-axis detail to capture data flow, broadcasting, and parallelism, bridging diagrams with implementation. The framework is demonstrated across a suite of architectures (MLP, Transformer, convolutional nets, ResNet, U-Net, and Vision Transformer) and is complemented by a Jupyter notebook to show implementation correspondence and a formal treatment of backpropagation and complexity. This work promises clearer architectural communication, easier cross-framework replication, and rigorous analysis of time/space complexity, potentially accelerating design, verification, and ethical assessment of deep learning systems.

Abstract

Diagrams matter. Unfortunately, the deep learning community has no standard method for diagramming architectures. The current combination of linear algebra notation and ad-hoc diagrams fails to offer the necessary precision to understand architectures in all their detail. However, this detail is critical for faithful implementation, mathematical analysis, further innovation, and ethical assurances. I present neural circuit diagrams, a graphical language tailored to the needs of communicating deep learning architectures. Neural circuit diagrams naturally keep track of the changing arrangement of data, precisely show how operations are broadcast over axes, and display the critical parallel behavior of linear operations. A lingering issue with existing diagramming methods is the inability to simultaneously express the detail of axes and the free arrangement of data, which neural circuit diagrams solve. Their compositional structure is analogous to code, creating a close correspondence between diagrams and implementation. In this work, I introduce neural circuit diagrams for an audience of machine learning researchers. After introducing neural circuit diagrams, I cover a host of architectures to show their utility and breed familiarity. This includes the transformer architecture, convolution (and its difficult-to-explain extensions), residual networks, the U-Net, and the vision transformer. I include a Jupyter notebook that provides evidence for the close correspondence between diagrams and code. Finally, I examine backpropagation using neural circuit diagrams. I show their utility in providing mathematical insight and analyzing algorithms' time and space complexities.

Paper Structure (27 sections, 7 equations, 43 figures)

Introduction
Necessity of Improved Communication in Deep Learning
Case Study: Shortfalls of Attention is All You Need
Current Approaches and Related Works
The Philosophy of My Approach
Contributions
Reading Neural Circuit Diagrams
Commutative Diagrams
Tuples and Memory
String Diagrams
Tensors
Indexes
Broadcasting
Linearity
Multilinearity
...and 12 more sections

Figures (43)

Figure 1: My annotations of the diagrams of the original transformer model. Critical information is missing regarding the origin of $Q$, $K$, and $V$ values (red and blue), and the axes over which operations act (green).
Figure 2: We have two functions: $f:\text{str}\rightarrow\text{int}$ and $g:\text{int}\rightarrow\text{float}$. These functions can be composed into a single function $(f;g):\text{str}\rightarrow\text{float}$. In commuting diagrams, we represent data types, such as $\text{str}$, $\text{int}$, and $\text{float}$, with floating symbols, while functions are denoted by arrows connecting them.
Figure 3: Here, I diagram two functions, $f:B\times C\rightarrow D$ and $g:A\times D\rightarrow E$, acting together. To represent the full memory states, we are required to amend $f$ into $\text{Id}[A]\times f:A\times(B\times C)\rightarrow A\times D$. The composed function is $(\text{Id}[A]\times f);g:A\times(B\times C)\rightarrow E$.
Figure 4: We reorient diagrams to go left to right. Wires represent data types, and symbols represent functions. This expression defines $h$.
Figure 5: Tupled data types are diagrammed with wires separated by dashed lines. This clearly shows when functions act on only some variables.
...and 38 more figures
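
The compositions described in the captions of Figures 2 and 3 can be sketched in plain Python. This is a minimal illustration, not code from the paper's notebook: the helper names `compose` and `id_times`, and the bodies of `f`, `g`, `f3`, and `g3`, are assumptions chosen only so that the types match the captions.

```python
from typing import Callable, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def compose(f: Callable[[A], B], g: Callable[[B], C]) -> Callable[[A], C]:
    """Diagrammatic composition (f;g): apply f first, then g."""
    return lambda x: g(f(x))


def id_times(f: Callable[[B], C]) -> Callable[[Tuple[A, B]], Tuple[A, C]]:
    """Id[A] x f: pass the first component through unchanged, apply f to the second."""
    return lambda pair: (pair[0], f(pair[1]))


# Figure 2: f : str -> int and g : int -> float (hypothetical bodies).
def f(s: str) -> int:
    return len(s)


def g(n: int) -> float:
    return n / 2


f_then_g = compose(f, g)  # (f;g) : str -> float


# Figure 3: f3 : B x C -> D and g3 : A x D -> E (hypothetical bodies).
def f3(bc: Tuple[int, int]) -> int:
    b, c = bc
    return b * c


def g3(ad: Tuple[int, int]) -> int:
    a, d = ad
    return a + d


# (Id[A] x f);g : A x (B x C) -> E
h = compose(id_times(f3), g3)
```

Calling `h((a, (b, c)))` keeps `a` in memory while `f3` consumes `(b, c)`, which is the bookkeeping that the amended function $\text{Id}[A]\times f$ makes explicit.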
