Functorial Neural Architectures from Higher Inductive Types

Karen Sargsyan

Functorial Neural Architectures from Higher Inductive Types

Karen Sargsyan

Abstract

Neural networks systematically fail at compositional generalization -- producing correct outputs for novel combinations of known parts. We show that this failure is architectural: compositional generalization is equivalent to functoriality of the decoder, and this perspective yields both guarantees and impossibility results. We compile Higher Inductive Type (HIT) specifications into neural architectures via a monoidal functor from the path groupoid of a target space to a category of parametric maps: path constructors become generator networks, composition becomes structural concatenation, and 2-cells witnessing group relations become learned natural transformations. We prove that decoders assembled by structural concatenation of independently generated segments are strict monoidal functors (compositional by construction), while softmax self-attention is not functorial for any non-trivial compositional task. Both results are formalized in Cubical Agda. Experiments on three spaces validate the full hierarchy: on the torus ($\mathbb{Z}^2$), functorial decoders outperform non-functorial ones by 2-2.7x; on $S^1 \vee S^1$ ($F_2$), the type-A/B gap widens to 5.5-10x; on the Klein bottle ($\mathbb{Z} \rtimes \mathbb{Z}$), a learned 2-cell closes a 46% error gap on words exercising the group relation.

Functorial Neural Architectures from Higher Inductive Types

Abstract

), functorial decoders outperform non-functorial ones by 2-2.7x; on

(

), the type-A/B gap widens to 5.5-10x; on the Klein bottle (

), a learned 2-cell closes a 46% error gap on words exercising the group relation.

Paper Structure (48 sections, 9 theorems, 4 equations, 10 tables)

This paper contains 48 sections, 9 theorems, 4 equations, 10 tables.

Introduction
Contributions.
From Types to Categories
Spaces from generators and relations
Composition as monoidal structure
Neural networks as parametric maps
From Specifications to Architectures
What Attention Cannot Compose
Scaling prediction.
Depth obstruction for nonsolvable groups.
Relationship to DisCoCat and categorical deep learning.
Three Spaces, Three Predictions
Task.
Architectures.
Metrics.
...and 33 more sections

Key Result

Theorem 3.3

Let $\mathcal{D}$ be the transport decoder from Construction con:compilation(i--ii).

Theorems & Definitions (28)

Definition 3.1: Parametric loops
Theorem 3.3: Transport composition = strict functoriality
proof
Remark 3.4: Strict vs. up-to-reparametrization
Remark 3.5: When the 2-cell matters
Definition 3.6: Type-A / Type-B
Theorem 4.1: Attention is not functorial
proof : Proof sketch (full proof in Appendix \ref{['app:attention-proof']})
Definition 5.1: Per-segment Chamfer distance
proof
...and 18 more

Functorial Neural Architectures from Higher Inductive Types

Abstract

Functorial Neural Architectures from Higher Inductive Types

Authors

Abstract

Table of Contents

Key Result

Theorems & Definitions (28)