Survival of the Fittest Representation: A Case Study with Modular Addition

Xiaoman Delores Ding, Zifan Carl Guo, Eric J. Michaud, Ziming Liu, Max Tegmark

TL;DR

The paper asks how neural networks choose among competing representations. It proposes a Survival of the Fittest framework and studies modular addition as a tractable testbed. In the Fourier basis, the embeddings form multiple circular representations at different frequencies, of which only a few survive training. Survival correlates with high initial signal and gradient, and the number of survivors is limited by a resource constraint: the embedding dimension. The dynamics of the circles are well captured by a set of linear differential equations, which allows complex representations to be decomposed into simpler interacting components and highlights cooperative interactions among circles. The findings connect to broader theoretical ideas such as the Neural Tangent Kernel and the Lottery Ticket Hypothesis, offering a principled lens for analyzing, and potentially controlling, representation formation in neural networks. Overall, the work provides a minimal, interpretable setup in which training dynamics reduce to simple, predictive laws that show how representations emerge and persist.

Abstract

When a neural network can learn multiple distinct algorithms to solve a task, how does it "choose" between them during training? To approach this question, we take inspiration from ecology: when multiple species coexist, they eventually reach an equilibrium where some survive while others die out. Analogously, we suggest that a neural network at initialization contains many solutions (representations and algorithms), which compete with each other under pressure from resource constraints, with the "fittest" ultimately prevailing. To investigate this Survival of the Fittest hypothesis, we conduct a case study on neural networks performing modular addition, and find that these networks' multiple circular representations at different Fourier frequencies undergo such competitive dynamics, with only a few circles surviving at the end. We find that the frequencies with high initial signals and gradients, the "fittest," are more likely to survive. By increasing the embedding dimension, we also observe more surviving frequencies. Inspired by the Lotka-Volterra equations describing the dynamics between species, we find that the dynamics of the circles can be nicely characterized by a set of linear differential equations. Our results with modular addition show that it is possible to decompose complicated representations into simpler components, along with their basic interactions, to offer insight on the training dynamics of representations.
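The abstract states that the circles' dynamics follow a set of linear differential equations, but this section does not reproduce the equations themselves. As a rough illustration only, the sketch below assumes the generic form ds/dt ≈ A s, where s is a vector of per-frequency signals and A is an interaction matrix, and recovers A from a trajectory by least squares. The function name `fit_linear_dynamics`, the matrix `A_true`, and the synthetic trajectory are all hypothetical; the paper's actual parameterization may differ.

```python
import numpy as np

def fit_linear_dynamics(signals, dt):
    """Least-squares estimate of A in the assumed model ds/dt ≈ A s.

    signals has shape (T, n): n signal values recorded at T time steps.
    """
    # Forward differences approximate the time derivative, shape (T-1, n).
    ds_dt = (signals[1:] - signals[:-1]) / dt
    s = signals[:-1]
    # Solve s @ A.T ≈ ds_dt for A.T in the least-squares sense.
    A_T, *_ = np.linalg.lstsq(s, ds_dt, rcond=None)
    return A_T.T

# Synthetic two-signal trajectory generated by Euler steps of a known
# (hypothetical) interaction matrix, so the fit can be checked.
A_true = np.array([[-0.5, 0.2],
                   [0.1, -0.3]])
dt, steps = 0.01, 200
trajectory = np.empty((steps, 2))
trajectory[0] = [1.0, 0.5]
for t in range(steps - 1):
    trajectory[t + 1] = trajectory[t] + dt * (A_true @ trajectory[t])

A_fit = fit_linear_dynamics(trajectory, dt)
print(np.round(A_fit, 3))
```

Under this assumed form, off-diagonal entries of the fitted matrix describe how one signal influences the growth of another: positive entries would indicate cooperation and negative entries competition.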

Paper Structure (34 sections, 5 equations, 16 figures)


Figures (16)

  • Figure 1: Our experimental method: perform a Fourier transform on the embedding, then analyze the initialization, the signal evolution, and their effects on the final learned circles over different training runs.
  • Figure 2: (Left) The signals (the magnitudes of the Fourier coefficients) of each embedding frequency over time, shown on a logarithmic scale. The surviving frequencies clearly separate themselves from the rest of the frequencies, which quickly go to 0. (Right) Snapshots of the embedding projected onto a dead frequency (top) and a surviving frequency (bottom) at different timesteps during training.
  • Figure 3: Top: The model embedding projected onto the first 10 principal components in pairs, the only components with significant singular values. Bottom: The model embedding projected onto the Fourier basis of frequencies in descending order of signal magnitude. The $\Delta$ between adjacent tokens shows a correspondence between PCA and FFT. This indicates that PCA is a loose approximation of the circles associated with Fourier frequencies.
  • Figure 4: Test loss (zoomed in on the bottom to $< 1.0$) as a function of embedding dimension $d$ when the initial embedding is frozen and only the MLP is trained. The loss dips at $d=p$.
  • Figure 5: (Left): The number of circles the model chooses for its final representation in relation to the number of tokens, $p$, over 100 random trials. (Right): The number of circles as the embedding dimension for representing each token increases from 16 to 128.
  • ...and 11 more figures
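
The caption of Figure 2 defines a frequency's signal as the magnitude of its Fourier coefficients in the embedding. A minimal sketch of that computation follows, assuming the embedding is a (p, d) matrix with one row per token, that the transform runs along the token axis, and that the coefficients are combined across embedding dimensions with an L2 norm. The normalization, the helper name `fourier_signals`, and the planted circle at frequency 7 are assumptions made for illustration, not the authors' code.

```python
import numpy as np

def fourier_signals(embedding):
    """Signal per nonzero frequency for an embedding of shape (p, d)."""
    # Real FFT along the token axis: row k holds the frequency-k coefficients.
    coeffs = np.fft.rfft(embedding, axis=0)
    # Combine the d complex coefficients of each frequency into one magnitude.
    magnitudes = np.linalg.norm(coeffs, axis=1)
    # Drop the constant (frequency 0) component.
    return magnitudes[1:]

# Hypothetical setup: p tokens, d embedding dimensions, small random init.
p, d, k = 59, 128, 7
rng = np.random.default_rng(0)
tokens = np.arange(p)
embedding = 0.1 * rng.standard_normal((p, d))

# Plant one circle at frequency k in the first two embedding dimensions.
embedding[:, 0] += np.cos(2 * np.pi * k * tokens / p)
embedding[:, 1] += np.sin(2 * np.pi * k * tokens / p)

signals = fourier_signals(embedding)
strongest = int(np.argmax(signals)) + 1  # index 0 corresponds to frequency 1
print(strongest)
```

Recording `signals` at successive training checkpoints would give the kind of per-frequency curves described for the left panel of Figure 2.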