Computer Vision Modeling of the Development of Geometric and Numerical Concepts in Humans
Zekun Wang, Sashank Varma
TL;DR
This study investigates whether computer vision (CV) models exhibit developmental alignment with human mathematical thinking as they learn image classification. Using a standard ImageNet-trained ResNet-50, it assesses (i) the trajectory of sensitivity to geometric and topological (GT) concepts and (ii) the emergence of a human-like mental number line (MNL) in the model's numerosity representations. Results show partial developmental alignment for GT concepts: four classes show human-like improvement over training while three classes lag, suggesting that some GT concepts may arise from perceptual learning while others require core knowledge or instruction. For numerosity, the model gradually develops distance and ratio effects and a progressively sharper MNL-like representation, supporting the view that visual experience can yield human-like numerical structure in a learned system. Overall, the work highlights both the promise and the current limits of CV models as models of developmental mathematical cognition, and points to broader architectures and data as avenues for further illuminating these trajectories.
Abstract
Mathematical thinking is a fundamental aspect of human cognition. Cognitive scientists have investigated the mechanisms that underlie our ability to think geometrically and numerically, to take two prominent examples, and developmental scientists have documented the trajectories of these abilities over the lifespan. Prior research has shown that computer vision (CV) models trained on the unrelated task of image classification nevertheless learn latent representations of geometric and numerical concepts similar to those of adults. Building on this demonstrated cognitive alignment, the current study investigates whether CV models also show developmental alignment: whether their performance improves across training in ways that match the developmental progressions observed in children. In a detailed case study of the ResNet-50 model, we show that this is the case. For geometry and topology, we find developmental alignment for some classes of concepts (Euclidean Geometry, Geometrical Figures, Metric Properties, Topology) but not others (Chiral Figures, Geometric Transformations, Symmetrical Figures). For number, we find developmental alignment in the emergence of a human-like "mental number line" representation with experience. These findings show the promise of computer vision models for studying the development of mathematical understanding in humans. They point the way to future research exploring additional model architectures and building larger benchmarks.
