Compiling to recurrent neurons

Joey Velez-Ginorio, Nada Amin, Konrad Kording, Steve Zdancewic

TL;DR

This work shows that discrete programming constructs like conditionals and iteration can be made differentiable by compiling to linear maps and linear recurrent neurons. It introduces a minimal, well-typed linear language ${\textsf{Cajal}}({\multimap}, {\mathbb{2}}, {\mathbb{N}})$, proves its programs compile correctly to recurrent neural dynamics, and demonstrates this with two image-transformation experiments. The results indicate that incorporating first-class discrete structure accelerates learning and improves data efficiency compared to purely continuous counterparts, while also highlighting stability considerations when linking to high-norm components. Overall, the paper advances differentiable programming by formally connecting discrete computation with gradient-based learning through a rigorous compiler framework and empirical validation.

Abstract

Discrete structures are currently second-class in differentiable programming. Since functions over discrete structures lack overt derivatives, differentiable programs do not differentiate through them and limit where they can be used. For example, when programming a neural network, conditionals and iteration cannot be used everywhere; they can break the derivatives necessary for gradient-based learning to work. This limits the class of differentiable algorithms we can directly express, imposing restraints on how we build neural networks and differentiable programs more generally. However, these restraints are not fundamental. Recent work shows conditionals can be first-class, by compiling them into differentiable form as linear neurons. Similarly, this work shows iteration can be first-class -- by compiling to linear recurrent neurons. We present a minimal typed, higher-order and linear programming language with iteration called ${\textsf{Cajal}}({\multimap}, {\mathbb{2}}, {\mathbb{N}})$. We prove its programs compile correctly to recurrent neurons, allowing discrete algorithms to be expressed in a differentiable form compatible with gradient-based learning. With our implementation, we conduct two experiments where we link these recurrent neurons against a neural network solving an iterative image transformation task. This determines part of its function prior to learning. As a result, the network learns faster and with greater data-efficiency relative to a neural network programmed without first-class iteration. A key lesson is that recurrent neurons enable a rich interplay between learning and the discrete structures of ordinary programming.
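
To make the idea in the abstract concrete, the following is a minimal sketch of how a conditional and a bounded iteration can each be written as a map that is linear in its discrete argument. It is not the Cajal compiler: the one-hot encodings, the bound on the iteration count, and the names `cond`, `iterate`, `mat_vec`, and `SHIFT` are all assumptions made for illustration.

```python
# Illustrative sketch only. Encodings and helper names are assumptions,
# not the compilation scheme defined in the paper.

def mat_vec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def cond(b, then_vec, else_vec):
    """Conditional as a map linear in b = [weight_true, weight_false]."""
    return [b[0] * t + b[1] * e for t, e in zip(then_vec, else_vec)]

def iterate(n, W, x):
    """Iteration as the linear recurrence h_{k+1} = W h_k.

    n is a one-hot (or soft) encoding of a natural number in
    0 .. len(n) - 1; the output is the n-weighted sum of the states h_k,
    so the result is linear in n and in x.
    """
    h = list(x)
    out = [0.0] * len(x)
    for weight in n:
        out = [o + weight * hi for o, hi in zip(out, h)]
        h = mat_vec(W, h)  # one step of the recurrence
    return out

TRUE, FALSE = [1.0, 0.0], [0.0, 1.0]

# Hypothetical step function: a cyclic shift of a 3-element vector.
SHIFT = [[0.0, 0.0, 1.0],
         [1.0, 0.0, 0.0],
         [0.0, 1.0, 0.0]]

two = [0.0, 0.0, 1.0, 0.0]  # one-hot encoding of n = 2, with bound 3
shifted_twice = iterate(two, SHIFT, [1.0, 0.0, 0.0])
blended = cond([0.5, 0.5], [1.0, 2.0], [3.0, 4.0])
```

With exact one-hot inputs these functions behave like ordinary `if` and `for`. With soft inputs such as `[0.5, 0.5]` they interpolate between branches or iteration counts, which is what lets gradients flow through the discrete argument.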

Paper Structure

This paper contains 61 sections, 48 equations, 29 figures.

Figures (29)

  • Figure 1: Iterative image transform
  • Figure 2: Programming an iterative image transform
  • Figure 3: Compiling iteration into differentiable form
  • Figure 5: Linear recurrent neurons
  • Figure 6: Unfolding linear recurrent neurons for $n = 2$
  • ...and 24 more figures