Categorical Data Structures for Technical Computing

Evan Patterson, Owen Lynch, James Fairbanks

TL;DR

This work introduces acsets, attributed C-sets, as a practical in-memory data structure that unifies graphs and data frames within a rigorous categorical framework. By viewing data as functors from finitely presented categories to Set, acsets extend C-sets with typed attributes and are implemented in Julia via Catlab, enabling automatic code generation and high performance. The authors establish that acsets form a slice category, enabling limits, colimits, and data migration operations, and support structured cospans for open systems. Empirical benchmarks show acsets rival specialized graph libraries while offering broad generality for graph-like and relational objects. The approach promises a versatile, compositional foundation for technical computing with graphs, networks, and relational data, with clear pathways for future extensions to other base categories and schema-driven data structures.

Abstract

Many mathematical objects can be represented as functors from finitely-presented categories $\mathsf{C}$ to $\mathsf{Set}$. For instance, graphs are functors to $\mathsf{Set}$ from the category with two parallel arrows. Such functors are known informally as $\mathsf{C}$-sets. In this paper, we describe and implement an extension of $\mathsf{C}$-sets having data attributes with fixed types, such as graphs with labeled vertices or real-valued edge weights. We call such structures "acsets," short for "attributed $\mathsf{C}$-sets." Derived from previous work on algebraic databases, acsets are a joint generalization of graphs and data frames. They also encompass more elaborate graph-like objects such as wiring diagrams and Petri nets with rate constants. We develop the mathematical theory of acsets and then describe a generic implementation in the Julia programming language, which uses advanced language features to achieve performance comparable with specialized data structures.
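To make the abstract's central idea concrete, here is a minimal sketch in Python of a weighted graph viewed as an attributed $\mathsf{C}$-set. The schema has two objects (vertices and edges), two parallel arrows `src` and `tgt` from edges to vertices, and one attribute `weight` sending each edge to a real number. The class name `WeightedGraph` and its methods are hypothetical and chosen for illustration only; they are not the API of the authors' Julia implementation.

```python
from dataclasses import dataclass, field


@dataclass
class WeightedGraph:
    """Illustrative attributed C-set (hypothetical, not the Catlab API).

    Schema: objects V and E; morphisms src, tgt: E -> V;
    attribute weight: E -> float.
    Vertices are 0..nv-1 and edges are 0..len(src)-1.
    """
    nv: int = 0
    src: list = field(default_factory=list)     # src[e] is the source vertex of edge e
    tgt: list = field(default_factory=list)     # tgt[e] is the target vertex of edge e
    weight: list = field(default_factory=list)  # weight[e] is the attribute of edge e

    def add_vertices(self, n):
        """Add n vertices and return their ids."""
        first = self.nv
        self.nv += n
        return list(range(first, self.nv))

    def add_edge(self, s, t, w):
        """Add an edge s -> t with weight w and return its id."""
        if not (0 <= s < self.nv and 0 <= t < self.nv):
            raise ValueError("src and tgt must be existing vertices")
        self.src.append(s)
        self.tgt.append(t)
        self.weight.append(float(w))
        return len(self.src) - 1

    def out_edges(self, v):
        """Graph-style query: the preimage of v under src."""
        return [e for e, s in enumerate(self.src) if s == v]

    def edge_table(self):
        """Data-frame-style view: one row per edge, one column per arrow or attribute."""
        return [
            {"E": e, "src": self.src[e], "tgt": self.tgt[e], "weight": self.weight[e]}
            for e in range(len(self.src))
        ]


g = WeightedGraph()
g.add_vertices(3)
g.add_edge(0, 1, 0.5)
g.add_edge(1, 2, 2.0)
g.add_edge(0, 2, 1.5)
```

The same stored columns answer both a graph query (`out_edges`) and a tabular query (`edge_table`), which is the sense in which such a structure sits between graphs and data frames.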

Paper Structure

This paper contains 25 sections, 6 theorems, 18 equations, 7 figures, 1 table.

Key Result

Proposition 1

Limits and colimits in $\mathsf{D}^{\mathsf{C}}$ are computed pointwise in $\mathsf{D}$. More precisely, if $J$ is a small category and $\mathsf{D}$ has all (co)limits of shape $J$, then $\mathsf{D}^{\mathsf{C}}$ has all (co)limits of shape $J$. Moreover, the limit of a diagram $K \colon J \xrightarrow{} \mathsf{D}^{\mathsf{C}}$ is computed pointwise, in that $(\lim_{j \in J} K(j))(c) = \lim_{j \in J} K(j)(c)$ for all $c \in \mathop{\mathrm{ob}}\nolimits \mathsf{C}$, and similarly for colimits.
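As a small illustration of "computed pointwise," the following Python sketch builds the categorical product of two graphs by taking the product of vertex sets and the product of edge sets separately, with `src` and `tgt` acting componentwise. The dictionary representation and the function name `graph_product` are assumptions made for this example, not part of the paper's implementation.

```python
from itertools import product


def graph_product(g, h):
    """Categorical product of two graphs, computed pointwise (illustrative sketch).

    Each graph is a dict with keys 'V' (list of vertices), 'E' (list of edges),
    and 'src', 'tgt' (dicts mapping each edge to a vertex).
    """
    V = list(product(g["V"], h["V"]))  # product of the vertex sets
    E = list(product(g["E"], h["E"]))  # product of the edge sets
    # src and tgt act componentwise on pairs of edges.
    src = {(e, f): (g["src"][e], h["src"][f]) for e, f in E}
    tgt = {(e, f): (g["tgt"][e], h["tgt"][f]) for e, f in E}
    return {"V": V, "E": E, "src": src, "tgt": tgt}


# A graph with two vertices and a single edge a -> b.
arrow = {"V": ["a", "b"], "E": ["e"], "src": {"e": "a"}, "tgt": {"e": "b"}}
p = graph_product(arrow, arrow)
```

Here `p` has four vertices but only one edge, running from `("a", "a")` to `("b", "b")`, because the edge set of the product is the product of the two one-element edge sets.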

Figures (7)

  • Figure 1: Port graphs
  • Figure 2: Whole-grained Petri nets
  • Figure 3: Undirected wiring diagrams
  • Figure 4: Categorical products of graphs
  • Figure 5: Schema for decorated graphs
  • ...and 2 more figures

Theorems & Definitions (33)

  • Example 1
  • Example 2
  • Example 3
  • Definition 1
  • Example 4
  • Definition 2
  • Definition 3
  • Definition 4
  • Example 5
  • Example 6
  • ...and 23 more