Category Theory for Supercomputing: The Tensor Product of Linear BSP Algorithms

Thomas Koopman, Rob H. Bisseling, Sven-Bodo Scholz

TL;DR

This work addresses how to construct parallel algorithms for tensor products of linear functions within the Bulk Synchronous Parallel (BSP) model by introducing linear BSP algorithms and a tensor-product recipe. The authors establish that, using distributions, computation, and communication steps, one can derive a BSP algorithm for $f_1 \otimes \cdots \otimes f_d$ from linear BSP algorithms for each $f_i$, relying on the distributivity of $\otimes$ over $\oplus$ and the functorial nature of the construction. They apply the framework to the Discrete Fourier Transform (DFT) and the Discrete Cosine Transform (DCT-II), deriving higher-dimensional DFTs and parallel DCTs and detailing linear BSP decompositions and tensor-product extensions. The significance lies in providing a compositional, category-theoretic perspective that yields scalable, structured parallel transforms and suggests broader applicability to HPC algorithms and potentially quantum-inspired constructions.

Abstract

We show that a particular class of parallel algorithms for linear functions can be straightforwardly generalized to parallel algorithms for their tensor product. The central idea is to take a model of parallel algorithms, Bulk Synchronous Parallel (BSP), that decomposes parallel algorithms into so-called supersteps of one of two types: a computation superstep that only does local computations, or a communication superstep that only communicates between processors. We connect each type of superstep to linear algebra with functors. Each superstep in isolation is simple enough that its tensor product can be computed in Vect with well-known techniques of linear algebra. We then translate each tensor product of supersteps back to the language of BSP algorithms. The functoriality of the tensor product allows us to compose the supersteps back into a BSP algorithm for the tensor product of the original functions. We state the recipe for creating these new algorithms with only a minimal amount of algebra, so that it can be applied without understanding the category-theoretic details. We have previously used this recipe to derive an efficient algorithm for the higher-dimensional Discrete Fourier Transform, which we use as an example throughout this paper. We also derive a parallel algorithm for the Discrete Cosine Transform to demonstrate the generality of our approach.
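
In matrix terms, the functoriality that the abstract relies on is the mixed-product property of the Kronecker product: $(f_2 \circ f_1) \otimes (g_2 \circ g_1) = (f_2 \otimes g_2) \circ (f_1 \otimes g_1)$. The following is a minimal sequential sketch in plain Python that checks this identity on small integer matrices. The helper names `matmul` and `kron` and the example matrices are illustrative assumptions, not taken from the paper, and each superstep is modelled simply as a matrix.

```python
def matmul(a, b):
    """Matrix product of a and b, both stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def kron(a, b):
    """Kronecker product: the matrix of the tensor product of a and b."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


# Two linear functions, each split into two steps (hypothetical example data).
f1, f2 = [[1, 2], [3, 4]], [[0, 1], [1, 1]]
g1, g2 = [[2, 0], [1, 3]], [[1, -1], [2, 0]]

# Tensor product of the composed functions ...
whole = kron(matmul(f2, f1), matmul(g2, g1))
# ... compared with the composition of the step-wise tensor products.
stepwise = matmul(kron(f2, g2), kron(f1, g1))

print(whole == stepwise)
```

Because the identity holds step by step, a tensor product can be taken of each step separately and the results composed afterwards, which is the pattern the abstract describes for supersteps.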

Paper Structure

This paper contains 23 sections, 1 theorem, 13 equations, 8 figures, 4 algorithms.

Key Result

Theorem 1

Linear BSP algorithms $\mathcal{A}_1, \ldots, \mathcal{A}_d$ for $f_1, \ldots, f_d$ with the same number and structure of supersteps can be combined into a linear BSP algorithm for $f_1 \otimes \cdots \otimes f_d$ by combining the supersteps and distributions as follows.

Figures (8)

  • Figure 1.1: Initialisation of $x_j = f(j)$ sequentially, and in parallel.
  • Figure 1.2: Graphical representation of cyclically distributed initialisation over three processors, indicated in red, yellow, and green.
  • Figure 3.1: Distributions correspond to the direct sum in Vect. Local view at the top, global view at the bottom.
  • Figure 3.2: A BSP computation superstep of linear functions
  • Figure 3.3: A BSP communication superstep that redistributes data is a bijection on the index set under the free functor
  • ...and 3 more figures
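
The cyclically distributed initialisation named in the captions of Figures 1.1 and 1.2 can be sketched as a sequential simulation. This sketch assumes the standard cyclic distribution, in which element $x_j$ is stored on processor $j \bmod p$ at local index $\lfloor j / p \rfloor$. The function `f`, the sizes `n` and `p`, and all helper names are hypothetical and chosen only for illustration.

```python
def f(j):
    """Hypothetical initialisation function; any function of j would do."""
    return j * j


def init_sequential(n):
    """Initialise x_j = f(j) for all j on a single processor."""
    return [f(j) for j in range(n)]


def init_cyclic(n, p):
    """Simulate p processors; processor s initialises the j with j % p == s."""
    return [[f(j) for j in range(s, n, p)] for s in range(p)]


def gather_cyclic(local, n):
    """Rebuild the global vector from the local parts of each processor."""
    p = len(local)
    return [local[j % p][j // p] for j in range(n)]


n, p = 8, 3
local = init_cyclic(n, p)
print(local)  # [[0, 9, 36], [1, 16, 49], [4, 25]]
print(gather_cyclic(local, n) == init_sequential(n))
```

Each simulated processor touches only its own elements, so the initialisation needs no communication. Only the final check gathers the parts into a global vector.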

Theorems & Definitions (2)

  • Definition 1: Linear BSP algorithm
  • Theorem 1: The tensor product of linear BSP algorithms
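
As a sequential sanity check of the kind of function Theorem 1 targets, the two-dimensional DFT is the tensor product of two one-dimensional DFTs. The sketch below compares the Kronecker product of two DFT matrices, applied to row-major data, with the two-dimensional DFT computed directly from its definition. It is not the paper's parallel construction. The sign convention $e^{-2\pi i jk/n}$, the absence of normalisation, the row-major layout, and all names are assumptions made for illustration.

```python
import cmath


def dft_matrix(n):
    """Matrix of the unnormalised DFT of length n."""
    return [[cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n)]
            for k in range(n)]


def kron(a, b):
    """Kronecker product: the matrix of the tensor product of a and b."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def matvec(a, x):
    """Matrix-vector product."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in a]


def dft2_direct(x, n1, n2):
    """2D DFT from the definition, with x stored in row-major order."""
    return [sum(x[j1 * n2 + j2]
                * cmath.exp(-2j * cmath.pi * (j1 * k1 / n1 + j2 * k2 / n2))
                for j1 in range(n1) for j2 in range(n2))
            for k1 in range(n1) for k2 in range(n2)]


n1, n2 = 2, 3
x = [complex(v) for v in range(n1 * n2)]  # hypothetical input data
via_tensor = matvec(kron(dft_matrix(n1), dft_matrix(n2)), x)
direct = dft2_direct(x, n1, n2)
max_err = max(abs(a - b) for a, b in zip(via_tensor, direct))
print(max_err < 1e-9)
```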