Aggregating time-series and image data: functors and double functors

Joscha Diehl

TL;DR

The paper develops a unified, category-theoretic view of data aggregation for time-series and images by modeling aggregations as functors on a category of intervals and extending to double functors on a double category of rectangles. It leverages the freeness of these categories to guarantee universal constructions and to enable unique liftings of local data to global aggregates, while showing that parallelizable implementations follow from Blelloch’s prefix-scan. Key contributions include formalizing 1D aggregations as functors with a wide range of targets, introducing the free double category of rectangles for 2D data, and detailing row-wise then column-wise parallel scans for 2D data with explicit complexity. The work provides a principled, scalable framework for implementing both time-series and image data aggregations in parallel, and points to rich future directions involving irregular geometries and non-abelian categorical structures to capture more complex data domains.
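The row-wise then column-wise scan mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes ordinary addition as the aggregation, uses the sequential `itertools.accumulate` as a stand-in for a parallel scan, and the helper names `scan_rows_then_columns` and `rectangle_sum` are hypothetical.

```python
from itertools import accumulate
import operator


def scan_rows_then_columns(image, op=operator.add):
    """Inclusive scan along each row, then along each column of the result."""
    # Each row can be scanned independently of the others.
    row_scanned = [list(accumulate(row, op)) for row in image]
    # Transpose, scan each column independently, then transpose back.
    columns = [list(accumulate(col, op)) for col in zip(*row_scanned)]
    return [list(row) for row in zip(*columns)]


def rectangle_sum(table, top, left, bottom, right):
    """Sum over rows top..bottom and columns left..right (inclusive).

    Assumption: the table was built with addition, so subtraction is
    available to recover a rectangle by inclusion-exclusion.
    """
    total = table[bottom][right]
    if top > 0:
        total -= table[top - 1][right]
    if left > 0:
        total -= table[bottom][left - 1]
    if top > 0 and left > 0:
        total += table[top - 1][left - 1]
    return total


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = scan_rows_then_columns(image)
```

With addition, the scanned table is the familiar summed-area table, and any axis-aligned rectangle is recovered from at most four entries. Other aggregation targets would need their own way of combining entries.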

Abstract

Aggregation of time-series or image data over subsets of the domain is a fundamental task in data science. We show that many known aggregation operations can be interpreted as (double) functors on appropriate (double) categories. Such functorial aggregations are amenable to parallel implementation via straightforward extensions of Blelloch's parallel scan algorithm. In addition to providing a unified viewpoint on existing operations, it allows us to propose new aggregation operations for time-series and image data.
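As a rough illustration of the scan the abstract refers to, here is a sequential simulation of a Blelloch-style exclusive scan (up-sweep followed by down-sweep). It is a sketch under stated assumptions: the aggregation target is taken to be a monoid given by an associative `op` with an `identity` (the padding and the down-sweep need an identity element), and the function name `blelloch_exclusive_scan` is hypothetical. The operand order is kept so that non-commutative operations also work.

```python
def blelloch_exclusive_scan(values, op, identity):
    """Exclusive prefix scan: result[i] = values[0] op ... op values[i-1]."""
    n = len(values)
    size = 1
    while size < n:
        size *= 2
    # Pad to a power of two with the identity element.
    tree = list(values) + [identity] * (size - n)

    # Up-sweep: build partial aggregates bottom-up.
    # All iterations of the inner loop are independent of each other.
    stride = 1
    while stride < size:
        for right in range(2 * stride - 1, size, 2 * stride):
            left = right - stride
            tree[right] = op(tree[left], tree[right])
        stride *= 2

    # Down-sweep: push prefixes back down the tree.
    tree[size - 1] = identity
    stride = size // 2
    while stride >= 1:
        for right in range(2 * stride - 1, size, 2 * stride):
            left = right - stride
            left_total = tree[left]
            tree[left] = tree[right]                   # prefix for the left child
            tree[right] = op(tree[right], left_total)  # prefix, then left subtree
        stride //= 2

    return tree[:n]


# String concatenation is associative but not commutative.
prefixes = blelloch_exclusive_scan(list("abcde"), lambda a, b: a + b, "")
```

Because the inner loops at each level touch disjoint positions, they can run in parallel, giving linear total work and a logarithmic number of levels.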

Paper Structure

This paper contains 11 sections, 2 theorems, 61 equations, 11 figures.

Key Result

Theorem 2.6

Let $Q$ be a quiver. Then there exist a category $\mathsf{Free}(Q)$ and a map of quivers $\iota: Q \to \mathsf{Forget}(\mathsf{Free}(Q))$ which is free on $Q$ in the following sense: for every category $\mathcal{D}$ and every map $f: Q \to \mathsf{Forget}(\mathcal{D})$ of quivers, there exists a unique functor $F: \mathsf{Free}(Q) \to \mathcal{D}$ such that $\mathsf{Forget}(F) \circ \iota = f$. Moreover, $\mathsf{Free}: \underline{\mathbf{Quiver}} \to \underline{\mathbf{Cat}}$ is a functor ...
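To make the freeness statement concrete: morphisms of a free category are paths of composable edges, and a choice of value for each edge extends in exactly one way to all paths, by composing along the path. The sketch below assumes the target category is a monoid (viewed as a one-object category) and uses a linear quiver $0 \to 1 \to 2 \to 3$ as an illustrative example; the names `lift_to_paths` and `interval` are hypothetical.

```python
from functools import reduce


def lift_to_paths(edge_value, op, identity):
    """Extend a map on edges to a map on paths (lists of composable edges)."""
    def on_path(path):
        # The empty path (an identity morphism) is sent to the identity element.
        return reduce(op, (edge_value(edge) for edge in path), identity)
    return on_path


# Linear quiver 0 -> 1 -> 2 -> 3; edge i runs from vertex i to vertex i + 1.
series = [2.0, 5.0, 3.0]


def interval(i, j):
    """The unique path from vertex i to vertex j (for i <= j)."""
    return list(range(i, j))


aggregate = lift_to_paths(lambda i: series[i], lambda a, b: a + b, 0.0)

whole = aggregate(interval(0, 3))
first_part = aggregate(interval(0, 1))
second_part = aggregate(interval(1, 3))
```

Here `whole` equals `first_part + second_part`, which is the functoriality condition: aggregating a concatenated path gives the composite of the aggregates of its pieces.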

Figures (11)

  • Figure 1: Interpolation of a time-series $x_0, x_1, \ldots$ to a continuous curve. The curve is piecewise affine between the values of $x$.
  • Figure 2: Aggregation of a time-series $x_0, x_1, \ldots, x_{n-1}$ to $a_0 \bullet a_1 \bullet a_2 \bullet a_3$. The dotted arrows indicate the mapping from the time-series to the elements of the semigroup.
  • Figure 3: Visualization of a category $\mathcal{C}$, showing objects (points), morphisms (arrows), identity morphisms (blue loops), the source and target maps as well as composition.
  • Figure 6: A functor $F$ between two categories $\mathcal{C}$ and $\mathcal{D}$. The functor preserves the structure of the categories, including composition of morphisms. For legibility, the identity morphisms are not shown.
  • Figure 7: Aggregation in a category
  • ...and 6 more figures

Theorems & Definitions (21)

  • Example 2.1
  • Remark 2.2
  • Example 2.3
  • Example 2.5
  • Theorem 2.6: Free category over a quiver
  • Example 2.7
  • proof
  • Example 2.8
  • Example 3.1: Rectangles [SoncZucc15, fiore2008model]
  • Example 3.2
  • ...and 11 more