The Relational Bottleneck as an Inductive Bias for Efficient Abstraction

Taylor W. Webb; Steven M. Frankland; Awni Altabaa; Simon Segert; Kamesh Krishnamurthy; Declan Campbell; Jacob Russin; Tyler Giallanza; Zack Dulberg; Randall O'Reilly; John Lafferty; Jonathan D. Cohen

TL;DR

This paper reviews a family of models that employ an inductive bias to induce abstractions in a data-efficient manner, emphasizing their potential as candidate models for the acquisition of abstract concepts in the human mind and brain.

Abstract

A central challenge for cognitive science is to explain how abstract concepts are acquired from limited experience. This has often been framed in terms of a dichotomy between connectionist and symbolic cognitive models. Here, we highlight a recently emerging line of work that suggests a novel reconciliation of these approaches, by exploiting an inductive bias that we term the relational bottleneck. In this approach, neural networks are constrained via their architecture to focus on relations between perceptual inputs, rather than the attributes of individual inputs. We review a family of models that employ this approach to induce abstractions in a data-efficient manner, emphasizing their potential as candidate models for the acquisition of abstract concepts in the human mind and brain.

Paper Structure (29 sections, 3 equations, 3 figures)

Highlights
Modeling the efficient induction of abstractions
The relational bottleneck
The relational bottleneck in neural architectures
The relational bottleneck in the mind and brain
Concluding remarks and future directions
Outstanding Questions
Glossary
Declaration of interests
Acknowledgements

Figures (3)

Figure 1: The relational bottleneck. An inductive bias that prioritizes the representation of relations (e.g., 'same' vs. 'different'), and discourages the representation of the features of individual objects (e.g., the shape or color of the objects in the images above). The result is that downstream processing is driven primarily, or even exclusively, by patterns of relations, and can therefore systematically generalize those patterns across distinct instances (e.g., the common ABA pattern displayed on both left and right), even for completely novel objects. The approach is illustrated here with same/different relations, but other relations can also be accommodated. Note that this example is intended only to illustrate the overall goal of the relational bottleneck framework. Figure 2 depicts neural architectures that implement the approach.
Figure 2: Implementing the relational bottleneck. Three neural architectures that implement the relational bottleneck. (a) Emergent Symbol Binding Network (ESBN) [webb2020emergent]. (b) Compositional Relation Network (CoRelNet) [kerg2022neural]. (c) Abstractor [altabaa2023abstractors]. In all cases, high-dimensional inputs (e.g., images) are processed by a neural encoder (e.g., a convolutional network), yielding a set of object embeddings $\mathbf{O}$. These are projected to a set of keys $\mathbf{K}$ and queries $\mathbf{Q}$, which are then compared, yielding a relation matrix $\mathbf{R}$ in which each entry is an inner product between a query and a key. Abstract values $\mathbf{V}$ are isolated from perceptual inputs (the core feature of the relational bottleneck), and depend only on the relations between them.
Figure 3: The relational bottleneck encourages data-efficient and generalizable relation learning. (a) Results for the ESBN and baseline architectures (Transformer, Neural Turing Machine (NTM), Metalearned Neural Memory (MNM), Long Short-Term Memory (LSTM), PrediNet, and the Relation Network (RN)) on the identity rules task, reproduced from [webb2020emergent]. The X axis represents the number of potential objects (out of 100 possible objects) withheld during training. When all objects are observed during training (0 withheld), most baselines perform well on the task. When most objects are withheld (95 withheld; test set includes only objects withheld during training), only the ESBN generalizes well to new objects. (b) Results for an object sorting task involving an asymmetric relation (greater-than/less-than), reproduced from [altabaa2023abstractors]. The Abstractor learns this task significantly faster than both the Transformer and an ablation model in which relational cross-attention is replaced by standard cross-attention. (c) Results from the give-N task, reproduced from [dulberg2021modelling]. The X axis represents the target number N (desired number of objects). The Y axis represents the episode at which the model reaches a particular criterion for the ability to count to each value of N. The ESBN learns the task significantly faster than the LSTM or Transformer baselines. The ESBN also displays an inductive transition (rapid learning for $N>5$) similar to that observed in human development.
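
The relation matrix described in the Figure 2 caption can be illustrated with a small sketch. This is not an implementation of the ESBN, CoRelNet, or the Abstractor: it assumes toy one-hot object embeddings in place of encoder outputs and identity matrices in place of learned key and query projections, and the function names `project` and `relation_matrix` are hypothetical. It only shows that two ABA sequences built from entirely different objects can yield the same matrix of inner products, which is the property the relational bottleneck exploits.

```python
def project(weights, vector):
    """Apply a linear map (given as a list of rows) to one embedding."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def relation_matrix(objects, w_q, w_k):
    """Return R where R[i][j] is the inner product of query i and key j."""
    queries = [project(w_q, o) for o in objects]
    keys = [project(w_k, o) for o in objects]
    return [[sum(q * k for q, k in zip(qi, kj)) for kj in keys]
            for qi in queries]


# Toy object embeddings (assumption: stand-ins for encoder outputs O).
a, b, c, d = [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]

# Identity projections (assumption: real models learn these weights).
identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# Two ABA sequences that share no objects.
r_left = relation_matrix([a, b, a], identity, identity)
r_right = relation_matrix([c, d, c], identity, identity)

print(r_left)             # same/different pattern for the first sequence
print(r_left == r_right)  # whether both sequences share that pattern
```

Because downstream processing in such a scheme would see only the relation matrix, nothing about the identity of the individual objects is available to it, so a rule learned on one set of objects applies unchanged to another.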
