Human-Like Geometric Abstraction in Large Pre-trained Neural Networks

Declan Campbell, Sreejan Kumar, Tyler Giallanza, Thomas L. Griffiths, Jonathan D. Cohen

TL;DR

This work investigates whether large pre-trained neural networks can exhibit human-like geometric abstraction, challenging the view that symbolic primitives are required. It tests three biases (sensitivity to geometric complexity, sensitivity to regularity, and decomposition into parts and relations) on three tasks with three models (DINOv2, CLIP, ResNet-50). The large pre-trained transformers (DINOv2, CLIP) encode geometric complexity and reproduce the human regularity bias more closely than ResNet-50 does, though performance on the Geoclidean parts-and-relations task remains below human levels. The findings suggest that geometric abstractions can emerge from scale and distributional learning, offering a connectionist alternative to symbol-based theories and pointing to architectural and training changes that might improve relational reasoning in AI.

Abstract

Humans possess a remarkable capacity to recognize and manipulate abstract structure, which is especially apparent in the domain of geometry. Recent research in cognitive science suggests neural networks do not share this capacity, concluding that human geometric abilities come from discrete symbolic structure in human mental representations. However, progress in artificial intelligence (AI) suggests that neural networks begin to demonstrate more human-like reasoning after scaling up standard architectures in both model size and amount of training data. In this study, we revisit empirical results in cognitive science on geometric visual processing and identify three key biases in geometric visual processing: a sensitivity towards complexity, regularity, and the perception of parts and relations. We test tasks from the literature that probe these biases in humans and find that large pre-trained neural network models used in AI demonstrate more human-like abstract geometric processing.

Paper Structure (13 sections, 3 figures, 2 tables)

Figures (3)

  • Figure 1: Tasks. (A) Delayed match-to-sample (DMTS) task involving hierarchically structured shapes generated using the DreamCoder DSL. (B) Quadrilateral oddball task in which humans and machines are evaluated on their sensitivity to geometric regularity and symmetry. (C) Category judgment task involving geometric figures with hierarchical shapes.
  • Figure 2: Model embeddings predict human performance on the DMTS task. a) t-SNE plot of model embeddings of 1000 stimuli generated from randomly sampled programs from the language-of-thought (LoT) model used in Sable-Meyer et al. (2022), colored according to stimulus minimum description length (MDL). Stimulus embeddings implicitly encode variation in MDL, which was also evaluated with regression analysis. b) Human choice and encoding reaction times (RTs) on the task, plotted with a linear regression between MDL and RT. $R^2$ values and $p$-values are from fitting a GLM with the same confounds as originally used in Sable-Meyer et al. (2022). c) Embedding metrics for all three models plotted against choice (top) and encoding (bottom) RT. $R^2$ values and $p$-values are from a GLM with the same confounds as originally used in Sable-Meyer et al. (2022).
  • Figure 3: Comparison of model biases to human/baboon performance on the quadrilateral oddball task. A) Human, baboon, and symbolic model error rates on the quadrilateral oddball task reported by Sable-Meyer et al. (2021), sorted by shape geometric regularity (most regular on the left, least regular on the right). B) Neural network error rates as evaluated on the same trials shown to human and baboon participants in the original task. C) Bar plot displaying r-values for statistical tests evaluating the correspondence of model error rates with human and baboon error rates.
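
The oddball evaluation summarized in Figure 3 can be illustrated with a small sketch. It assumes a trial is scored by picking the item whose embedding is, on average, farthest from the other items, and that model and human error rates are compared with a Pearson correlation. Both choices are illustrative assumptions rather than the authors' confirmed procedure, and the function names `pick_oddball`, `error_rate`, and `pearson_r` are hypothetical. The embeddings here are synthetic stand-ins for the outputs of a model such as DINOv2.

```python
import numpy as np


def pick_oddball(embeddings: np.ndarray) -> int:
    """Return the index of the item farthest, on average, from the others.

    `embeddings` has shape (n_items, dim). The decision rule is an
    assumption made for illustration, not the paper's stated method.
    """
    # Pairwise Euclidean distances between all items in the trial.
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Mean distance to the other n - 1 items (the diagonal is zero).
    mean_dist = dists.sum(axis=1) / (len(embeddings) - 1)
    return int(np.argmax(mean_dist))


def error_rate(trials, true_indices) -> float:
    """Fraction of trials on which the chosen item is not the true oddball."""
    wrong = sum(pick_oddball(e) != t for e, t in zip(trials, true_indices))
    return wrong / len(trials)


def pearson_r(x, y) -> float:
    """Pearson correlation between two equal-length sequences."""
    return float(np.corrcoef(x, y)[0, 1])


if __name__ == "__main__":
    # Synthetic trial: five near-identical items plus one shifted outlier.
    rng = np.random.default_rng(0)
    trial = rng.normal(0.0, 0.01, size=(6, 8))
    trial[3] += 1.0  # item 3 plays the role of the deviant shape
    print("chosen oddball:", pick_oddball(trial))
    print("error rate:", error_rate([trial], [3]))
```

In practice, per-shape error rates computed this way for each model would be passed to `pearson_r` together with the corresponding human or baboon error rates, which is the kind of comparison panel C reports.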