A Relational Inductive Bias for Dimensional Abstraction in Neural Networks

Declan Campbell; Jonathan D. Cohen

TL;DR

The paper tackles the gap between neural networks and human abstract reasoning by introducing a relational bottleneck that computes task relations between inputs. This bias encourages factorized, low-dimensional representations, improving sample efficiency and generalization, and yielding human-like biases in geometric regularity tasks without pre-specified symbolic primitives. Empirical results show faster learning, orthogonal embeddings, and alignment with human error patterns, suggesting that simple relational processing can approximate symbolic-like processing. The approach offers neuroscience-relevant insights and practical benefits for building flexible, data-efficient models capable of domain-general relational reasoning.

Abstract

The human cognitive system exhibits remarkable flexibility and generalization capabilities, partly due to its ability to form low-dimensional, compositional representations of the environment. In contrast, standard neural network architectures often struggle with abstract reasoning tasks, tend to overfit, and require extensive training data. This paper investigates the impact of the relational bottleneck -- a mechanism that focuses processing on relations among inputs -- on the learning of factorized representations conducive to compositional coding and the attendant flexibility of processing. We demonstrate that such a bottleneck not only improves generalization and learning efficiency, but also aligns network performance with human-like behavioral biases. Networks trained with the relational bottleneck developed orthogonal representations of feature dimensions latent in the dataset, reflecting the factorized structure thought to underlie human cognitive flexibility. Moreover, the relational network mimics human biases towards regularity without pre-specified symbolic primitives, suggesting that the bottleneck fosters the emergence of abstract representations that confer flexibility akin to symbols.
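To make the idea of a relational bottleneck concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that both inputs pass through a shared encoder and that only a scalar relation between the two embeddings (cosine similarity here) is passed downstream. The random linear encoder, the choice of cosine similarity, and the names `make_encoder` and `relational_bottleneck` are illustrative assumptions and do not come from the paper.

```python
import math
import random


def make_encoder(in_dim, out_dim, seed=0):
    """Return a random linear encoder (a hypothetical stand-in for a trained one)."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

    def encode(x):
        # Plain matrix-vector product: one output unit per weight row.
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    return encode


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def relational_bottleneck(encode, x1, x2):
    """Encode both inputs with the SAME encoder and expose only their relation.

    Assumption: downstream layers see this scalar, not the embeddings themselves.
    """
    return cosine_similarity(encode(x1), encode(x2))


encode = make_encoder(in_dim=4, out_dim=3)
same = relational_bottleneck(encode, [1, 0, 0, 0], [1, 0, 0, 0])
diff = relational_bottleneck(encode, [1, 0, 0, 0], [0, 1, 0, 0])
```

Because the downstream computation receives only the relation, the encoder is under pressure to place inputs so that their similarity is informative for the task, which is one way to read the abstract's claim about factorized representations.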

Paper Structure (12 sections, 5 figures)

This paper contains 12 sections, 5 figures.

Introduction
Relational Bottleneck and Dimensional Representations
Methods
Networks
Task and training data
Results
Dimensional Representations Align with Human Behavior
Methods
Results
Discussion
Conclusion
Supplementary Information

Figures (5)

Figure 1: Network architecture. Feedforward networks used to perform identity and similarity judgment tasks over two input stimuli (see text).
Figure 2: (a) Learning curves for training loss and generalization performance indicate that the relational architecture learns more rapidly than the feedforward architecture. (b) Two-dimensional PCA projection of the embeddings learned by each network. Note that the relational network learns orthogonal representations for each dimension, whereas the feedforward network learns a non-linear manifold.
Figure 3: Oddball construction and trial structure. Stimuli consisted of quadrilateral forms varying in their regularity/symmetry. Each trial comprised five variants of the same stimulus, varying only in size and rotation, and one "oddball" stimulus constructed by perturbing the bottom right vertex of the reference stimulus to violate its regularity. Participants and networks were evaluated on their accuracy in identifying the oddballs.
Figure 4: Relational network and SimCLR performance on the oddball task. (a) Error rates for the relational network and SimCLR at representative points during training. Note that the relational network's error rates exhibit a positive slope as a function of decreasing geometric regularity, consistent with human performance on this task, while the SimCLR network displays no sensitivity to geometric regularity. (b) Correlation coefficients between the model error rates and the human and baboon error rates. Note that SimCLR most closely resembles baboon performance on this task, while the relational network corresponds most closely to the human error rates. (c) t-SNE plot of reference shape embeddings from the relational network, colored according to stimulus regularity, calculated as the sum of the binary symbolic properties for each stimulus.
Figure 5: Performance on the category task. We trained ten relational and feedforward neural networks, which differed solely in their initial weight configurations, on a categorical same/different task. The task involved simple binary stimuli, each represented by two one-hot encoded features. Each feature could take one of 30 possible values, resulting in a total of 900 unique stimuli. For training, we randomly selected a set of 30 stimuli, which constituted $3.\overline{3}\%$ of the entire stimulus set. The objective for the networks was to learn to make similarity judgments on this training set. During the testing phase, we evaluated the networks on the remaining $96.\overline{6}\%$ of the stimuli, which they had not seen during training, measuring their ability to accurately judge the similarity of previously unseen stimuli. Performance was computed as accuracy after binarizing the networks' similarity judgments by thresholding. Both the relational and feedforward networks achieved perfect accuracy on the training set. Notably, the relational network learned to perform the task more rapidly than the feedforward network. In terms of generalization, the relational network's performance on the holdout set was nearly perfect, while the feedforward network's performance was significantly lower. Accuracy is higher initially in the relational network due to the relational bottleneck's sensitivity to the trivially separated features present in the input data.
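The stimulus set and train/holdout split described in the Figure 5 caption can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the random seed, the 0.5 threshold, and the names `encode_stimulus`, `build_split`, and `thresholded_accuracy` are hypothetical, and the caption does not specify how training pairs or similarity targets were formed, so those are left out.

```python
import itertools
import random

N_VALUES = 30  # possible values per feature (from the caption)
N_TRAIN = 30   # stimuli sampled for training (from the caption)


def one_hot(index, size):
    """One-hot vector of length `size` with a 1 at position `index`."""
    return [1 if i == index else 0 for i in range(size)]


def encode_stimulus(stimulus):
    """Concatenate the one-hot codes of the two features of a stimulus."""
    first, second = stimulus
    return one_hot(first, N_VALUES) + one_hot(second, N_VALUES)


def build_split(seed=0):
    """Enumerate all stimuli and sample a small training set (seed is an assumption)."""
    rng = random.Random(seed)
    all_stimuli = list(itertools.product(range(N_VALUES), repeat=2))  # 30 * 30 = 900
    train = rng.sample(all_stimuli, N_TRAIN)
    train_set = set(train)
    holdout = [s for s in all_stimuli if s not in train_set]
    return all_stimuli, train, holdout


def thresholded_accuracy(scores, labels, threshold=0.5):
    """Binarize similarity scores at `threshold` (assumed value) and score them."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


all_stimuli, train, holdout = build_split()
train_fraction = len(train) / len(all_stimuli)      # 30 / 900
holdout_fraction = len(holdout) / len(all_stimuli)  # 870 / 900
```

With these counts, the training fraction is 30/900 and the holdout fraction is 870/900, matching the repeating decimals quoted in the caption.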
