Relational Composition in Neural Networks: A Survey and Call to Action

Martin Wattenberg; Fernanda B. Viégas

TL;DR

The paper addresses how neural networks might encode structured relations beyond simple feature sums, formalizing the linear representation hypothesis with $x = \sum_{i=1}^m a_i v_i$ and surveying relational composition mechanisms. It reviews additive matrix binding, multi-token relational embeddings, and vector symbolic architectures, highlighting phenomena such as feature multiplicity and 'dark matter' features that challenge traditional interpretability. The authors propose concrete research directions, including toy models to learn composition, dictionary-learning on synthetic mechanisms, token-difference feature extraction, and empirical measurements in real networks, to determine how relational composition shapes feature discovery and intervention. Overall, the work aims to catalyze empirical validation of relational composition in neural nets and to improve interpretability by uncovering structure-aware, compositional representations.

Abstract

Many neural nets appear to represent data as linear combinations of "feature vectors." Algorithms for discovering these vectors have seen impressive recent success. However, we argue that this success is incomplete without an understanding of relational composition: how (or whether) neural nets combine feature vectors to represent more complicated relationships. To facilitate research in this area, this paper offers a guided tour of various relational mechanisms that have been proposed, along with preliminary analysis of how such mechanisms might affect the search for interpretable features. We end with a series of promising areas for empirical research, which may help determine how neural networks represent structured data.
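As a minimal sketch of why feature sums alone may not capture relations, the snippet below builds vectors of the form $x = \sum_{i=1}^m a_i v_i$ and shows that two sentences with different relational structure produce the same vector. The feature names, the three-dimensional orthonormal vectors, and the helper functions `combine` and `dot` are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of x = sum_i a_i * v_i with hypothetical, hand-picked features.
# The feature names and vectors are illustrative assumptions, not from the paper.

def combine(coeffs, features):
    """Return x = sum_i a_i * v_i for a coefficient dict and a feature-vector dict."""
    dim = len(next(iter(features.values())))
    x = [0.0] * dim
    for name, a in coeffs.items():
        for k, component in enumerate(features[name]):
            x[k] += a * component
    return x


def dot(u, w):
    """Plain dot product of two equal-length lists."""
    return sum(p * q for p, q in zip(u, w))


# Orthonormal feature vectors, so a dot product reads each coefficient exactly.
FEATURES = {
    "dog": [1.0, 0.0, 0.0],
    "man": [0.0, 1.0, 0.0],
    "bites": [0.0, 0.0, 1.0],
}

# "dog bites man" and "man bites dog" activate the same three features...
dog_bites_man = combine({"dog": 1.0, "bites": 1.0, "man": 1.0}, FEATURES)
man_bites_dog = combine({"man": 1.0, "bites": 1.0, "dog": 1.0}, FEATURES)

# ...so a plain sum yields identical vectors: who bit whom is not encoded.
same_vector = dog_bites_man == man_bites_dog

# Reading a coefficient back out of a linear combination with a dot product.
readout = dot(combine({"dog": 2.0, "man": 0.5}, FEATURES), FEATURES["dog"])
```

Because addition is commutative, any representation built only by summing feature vectors is blind to argument order. The relational mechanisms listed in the outline below are candidate ways a network could encode that missing structure.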

Paper Structure (18 sections, 27 equations)

Introduction
Definitions, notation, and background
Additive matrix binding
Matrix binding as writing to "slots" in superposition
Tree representations using matrices
Feature multiplicity in additive models
Feature multiplicity and predict/control discrepancies
Multi-token mechanisms
Syntactic relations and tree embeddings
Reference mechanisms: pointers and identifiers
Vector symbolic architecture
Tensor constructions
Binary vectors
Is VSA a plausible mechanism for real neural nets?
Implications for the search for features
...and 3 more sections
