Relational Composition in Neural Networks: A Survey and Call to Action

Martin Wattenberg, Fernanda B. Viégas

TL;DR

The paper addresses how neural networks might encode structured relations beyond simple feature sums, formalizing the linear representation hypothesis with $x = \sum_{i=1}^m a_i v_i$ and surveying relational composition mechanisms. It reviews additive matrix binding, multi-token relational embeddings, and vector symbolic architectures, highlighting phenomena such as feature multiplicity and 'dark matter' features that challenge traditional interpretability. The authors propose concrete research directions, including toy models to learn composition, dictionary-learning on synthetic mechanisms, token-difference feature extraction, and empirical measurements in real networks, to determine how relational composition shapes feature discovery and intervention. Overall, the work aims to catalyze empirical validation of relational composition in neural nets and to improve interpretability by uncovering structure-aware, compositional representations.

Abstract

Many neural nets appear to represent data as linear combinations of "feature vectors." Algorithms for discovering these vectors have seen impressive recent success. However, we argue that this success is incomplete without an understanding of relational composition: how (or whether) neural nets combine feature vectors to represent more complicated relationships. To facilitate research in this area, this paper offers a guided tour of various relational mechanisms that have been proposed, along with preliminary analysis of how such mechanisms might affect the search for interpretable features. We end with a series of promising areas for empirical research, which may help determine how neural networks represent structured data.
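
The "linear combinations of feature vectors" picture can be made concrete with a small sketch. It is an illustration only, not code from the paper: the helper names `compose` and `read_out`, the dimensions, and the choice of exactly orthonormal feature vectors are all assumptions made here for simplicity. It builds $x = \sum_i a_i v_i$, reads the coefficients back by projection, and shows that a plain sum does not record which feature was added first, which is the kind of gap relational composition is meant to address.

```python
import numpy as np


def compose(coeffs, features):
    """Return x = sum_i a_i v_i, where row i of `features` is v_i."""
    return coeffs @ features


def read_out(x, features):
    """Project x onto each v_i; this recovers a_i exactly only when the v_i are orthonormal."""
    return features @ x


rng = np.random.default_rng(0)
d, m = 8, 3  # embedding dimension and number of features (arbitrary choices)

# Assumption: orthonormal feature vectors, taken from a QR decomposition.
q, _ = np.linalg.qr(rng.normal(size=(d, d)))
features = q[:, :m].T  # shape (m, d); rows are orthonormal

a = np.array([1.0, 0.0, 2.0])
x = compose(a, features)
recovered = read_out(x, features)

# A plain sum is order-invariant: v_0 + v_1 equals v_1 + v_0, so the
# representation alone cannot say which feature played which role.
first_then_second = features[0] + features[1]
second_then_first = features[1] + features[0]
order_is_lost = np.allclose(first_then_second, second_then_first)
```

In real networks feature vectors are generally not exactly orthogonal, so the projection in `read_out` would return only approximate coefficients.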

Paper Structure (18 sections, 27 equations)