A Pattern Language for Machine Learning Tasks

Benjamin Rodatz, Ian Fan, Tuomas Laakkonen, Neil John Ortega, Thomas Hoffmann, Vincent Wang-Mascianica

TL;DR

The paper introduces a diagrammatic, task-based language for ML in which objectives are encoded as equational constraints among learners. It formalises atomic and compound tasks, defines objective functions via divergences and a differentiable combination, and shows how standard ML paradigms instantiate patterns that can be reasoned about compositionally. It then introduces a novel manipulation task that edits a target attribute while preserving other properties, and proves connections to Bayesian inversion and CycleGAN through refinements, showing how such tasks can yield architecture-agnostic, training-stable models without adversarial training per se. Empirically, it validates manipulation on Spriteworld, MNIST, and CelebA, demonstrating end-to-end, pattern-driven design with interpretable latent-space effects.

Abstract

We formalise the essential data of objective functions as equality constraints on composites of learners. We call these constraints "tasks", and we investigate the idealised view that such tasks determine model behaviours. We develop a flowchart-like graphical mathematics for tasks that allows us to: (1) offer a unified perspective of approaches in machine learning across domains; (2) design and optimise desired behaviours model-agnostically; and (3) import insights from theoretical computer science into practical machine learning. As a proof-of-concept of the potential practical impact of our theoretical framework, we exhibit and implement a novel "manipulator" task that minimally edits input data to have a desired attribute. Our model-agnostic approach achieves this end-to-end, and without the need for custom architectures, adversarial training, random sampling, or interventions on the data, hence enabling capable, small-scale, and training-stable models.
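
As a rough illustration of the idea that a task is an equality constraint between composites of learners, the following minimal Python sketch scores how badly a constraint is violated and combines several such scores with positive weights. Everything named here is an assumption made for illustration and does not come from the paper: the squared-error divergence, the helper names `task_loss` and `combined_objective`, and the fixed toy `encode`/`decode` functions standing in for trained learners.

```python
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
Learner = Callable[[Vector], List[float]]
Task = Tuple[Learner, Learner]  # a constraint lhs(x) = rhs(x)


def squared_error(u: Vector, v: Vector) -> float:
    """Assumed divergence: mean squared error, zero exactly when u == v."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) / len(u)


def task_loss(task: Task, data: Sequence[Vector]) -> float:
    """Average divergence between the two sides of one constraint."""
    lhs, rhs = task
    return sum(squared_error(lhs(x), rhs(x)) for x in data) / len(data)


def combined_objective(tasks: Sequence[Task],
                       weights: Sequence[float],
                       data: Sequence[Vector]) -> float:
    """Positive linear combination of the per-task losses."""
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be strictly positive")
    return sum(w * task_loss(t, data) for w, t in zip(weights, tasks))


# Hypothetical fixed "learners": keep two coordinates, then pad with zero.
def encode(x: Vector) -> List[float]:
    return [x[0], x[1]]


def decode(z: Vector) -> List[float]:
    return [z[0], z[1], 0.0]


def reconstruct(x: Vector) -> List[float]:
    return decode(encode(x))


def identity(x: Vector) -> List[float]:
    return list(x)


# Two constraints: decode . encode = id, and reconstruction is idempotent.
autoencoding: Task = (reconstruct, identity)
idempotence: Task = (lambda x: reconstruct(reconstruct(x)), reconstruct)

on_manifold = [[1.0, 2.0, 0.0], [3.0, 4.0, 0.0]]
off_manifold = [[0.0, 0.0, 3.0]]

loss_on = combined_objective([autoencoding, idempotence], [1.0, 0.5], on_manifold)
loss_off = combined_objective([autoencoding, idempotence], [1.0, 0.5], off_manifold)
```

In this toy setting the objective vanishes on data where both constraints hold exactly and is positive where the autoencoding constraint fails. In practice the learners would be parameterised models and the combined objective would be minimised by gradient descent.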

Paper Structure (30 sections, 6 theorems, 15 equations, 8 figures, 1 table)


Key Result

Lemma 2.9

For all well-typed $f$, $g$, and for any positive linear combination $\alpha: \mathbb{R}^{\geq 0} \times \mathbb{R}^{\geq 0} \rightarrow \mathbb{R}^{\geq 0}$:

Figures (8)

  • Figure 1: In this example, we train a stack (alongside an autoencoder) to store the latent vectors of Spriteworld shapes. With an image latent size of 16 and a stack vector size of 64, it retains enough information to faithfully restore up to 4 shapes.
  • Figure 2: An input Spriteworld image alongside a spectrum of outputs exhibiting the ability of the put to manipulate a single attribute of the input while preserving its other properties. The model also generalises by interpolating to attribute values unseen during training: here it produces orange and cyan shapes, although it only sees red, green, or blue shapes during training. Further details are in the Spriteworld appendix.
  • Figure 3: Outputs of a put trained against an MNIST classifier. The put preserves several graphological aspects, such as stroke weight, slant, and angularity. This represents qualitative evidence to support our prediction that put as a class-conditioned generative model behaves as a style-preserving edit.
  • Figure 4: To illustrate the concepts of derived attributes and unequal entropy, consider an attribute on the Spriteworld data called blue-circleness, which broadly measures how similar a shape is to a blue circle. We define blue-circleness (bc) as a function of the explicit attributes shape and colour; we assign a continuous colour score $cs \in [0, 1]$ based on the hue, where red $= 0$ and blue $= 1$. Entropy is unequal in this example because the class with bc-value $0$ has higher entropy than the class with bc-value $0.4$: more shapes have bc-value $0$. Manipulating a shape from bc-value $0$ to $0.4$ must therefore lose information.
  • Figure 5: Complement manipulators (see the complement manipulator pattern) can manipulate derived attributes such as blue-circleness while preserving attributes such as position and size, by using the complement as a scratchpad to record a correspondence between data points. Further details are in the Spriteworld appendix.
  • ...and 3 more figures

Theorems & Definitions (40)

  • Example 1.1
  • Example 1.2: Residuation as an architectural choice
  • Example 1.3: Perceptual losses as multi-objective learning
  • Example 1.4: VAE
  • Definition 2.1: Tasks
  • Definition 2.2: Objective function
  • Example 2.6: CycleGAN
  • Definition 2.8: Refinement and equivalence of tasks
  • Lemma 2.9
  • proof
  • ...and 30 more