A Unified Geometric Space Bridging AI Models and the Human Brain

Silin Chen, Yuzhong Chen, Zifan Wang, Junhao Wang, Zifeng Jia, Keith M Kendrick, Tuo Zhang, Lin Zhao, Dezhong Yao, Tianming Liu, Xi Jiang

TL;DR

This work introduces Brain-like Space, a seven-dimensional geometric framework that maps a model's spatial attention topology onto canonical human brain networks to enable cross-domain comparisons of intelligence. By analyzing 151 Transformer-based models spanning vision, language, and multimodal domains, the authors show an arc-shaped distribution of brain-likeness in this space, with the pretraining paradigm and positional encoding scheme shaping topological organization beyond modality alone. They define a brain-likeness score as the sum of attention-head projections onto the first principal component and demonstrate that brain-likeness only partially tracks downstream accuracy, suggesting brain-like organization is an independent design objective. The study presents a graph-based methodology for quantifying brain-AI similarity, offering a unified framework to guide architecture and training choices toward brain-inspired organizational principles with potential implications for AI interpretability and cross-domain alignment.
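The score definition in the TL;DR can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes each attention head is already summarized as a 7-dimensional vector of similarities to the seven functional brain networks, that the first principal component is fit on the pooled, mean-centered heads, and that the axis is oriented so larger overall similarity gives a larger projection. The function names `fit_pc1` and `brain_likeness_score` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_pc1(head_profiles):
    """Return the mean and unit-length first principal axis of (n_heads, 7) profiles."""
    head_profiles = np.asarray(head_profiles, dtype=float)
    mean = head_profiles.mean(axis=0)
    # Rows of vt are principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(head_profiles - mean, full_matrices=False)
    pc1 = vt[0]
    # The sign of a principal axis is arbitrary. Assumption: orient it so that
    # higher similarity to the brain networks means a larger projection.
    if pc1.sum() < 0:
        pc1 = -pc1
    return mean, pc1

def brain_likeness_score(model_profiles, mean, pc1):
    """Sum of one model's head projections onto PC1 (centering is an assumption)."""
    model_profiles = np.asarray(model_profiles, dtype=float)
    return float(((model_profiles - mean) @ pc1).sum())

# Synthetic stand-in data: 200 heads whose seven similarities rise and fall together.
rng = np.random.default_rng(0)
overall = rng.random((200, 1))
all_heads = overall + 0.01 * rng.standard_normal((200, 7))

mean, pc1 = fit_pc1(all_heads)
high_model = all_heads[overall[:, 0] > 0.5]   # hypothetical "more brain-like" model
low_model = all_heads[overall[:, 0] <= 0.5]   # hypothetical "less brain-like" model
score_high = brain_likeness_score(high_model, mean, pc1)
score_low = brain_likeness_score(low_model, mean, pc1)
```

Because the score is a sum rather than a mean, it grows with the number of heads; whether and how the paper normalizes for model size is not stated in this excerpt.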

Abstract

For decades, neuroscientists and computer scientists have pursued a shared ambition: to understand intelligence and build it. Modern artificial neural networks now rival humans in language, perception, and reasoning, yet it is still largely unknown whether these artificial systems organize information as the brain does. Existing brain-AI alignment studies have shown a striking correspondence between the two systems, but such comparisons remain bound to specific inputs and tasks, offering no common ground for comparing how AI models of different modalities (vision, language, or multimodal) are intrinsically organized. Here we introduce the concept of Brain-like Space: a unified geometric space in which every AI model can be precisely situated and compared by mapping its intrinsic spatial attention topological organization onto canonical human functional brain networks, regardless of input modality, task, or sensory domain. Our extensive analysis of 151 Transformer-based models spanning state-of-the-art large vision models, large language models, and large multimodal models uncovers a continuous arc-shaped geometry within this space, reflecting a gradual increase of brain-likeness; different models exhibit distinct distribution patterns within this geometry associated with different degrees of brain-likeness, shaped not merely by their modality but by whether the pretraining paradigm emphasizes global semantic abstraction and whether the positional encoding scheme facilitates deep fusion across different modalities. Moreover, the degree of brain-likeness for a model and its downstream task performance are not "identical twins". The Brain-like Space provides the first unified framework for situating, quantifying, and comparing intelligence across domains, revealing the deep organizational principles that bridge machines and the brain.

Paper Structure

This paper contains 17 sections, 14 equations, 7 figures.

Figures (7)

  • Figure 1: Distribution of Transformer-based models in the Brain-like Space. a, Visualization of attention heads across nine model categories in the PCA-projected two-dimensional Brain-like Space (PC1: 82.77% variance explained; PC2: 13.30% variance explained). Points representing attention heads are colored by model category. Dashed lines indicate the boundaries of four clusters (C1-C4) of all points according to their similarity to the seven functional brain networks. b, The Silhouette coefficients as a function of the number of clusters, supporting the choice of k=4. c, Distribution of the four clusters in the two-dimensional Brain-like Space. Points are colored by the cluster index (C1-C4). d, Radar plot of the centroid similarity profiles for each cluster, demonstrating a gradual increase in overall similarity to the seven functional brain networks from C1 to C4. e, Model category-specific visualizations. Each subplot shows the distribution of one model category in the Brain-like Space and the proportion of attention heads belonging to clusters C1-C4.
  • Figure 2: Influence of different pretraining paradigms on the brain-like distribution of ViT series and its variants.
  • Figure 3: (1) Data augmentation strategies. a, Visualization of attention heads from ViT-orig, ViT-augreg, and DeiT3 in the Brain-like Space. Colored pentagrams denote the centroid position of each model class. b, Donut charts of matched-head type distribution with the seven canonical functional brain networks for representative base models from ViT-orig, ViT-augreg, and DeiT3. The concentric rings correspond to different models (inner to outer), with labels indicating the number and percentage of matched heads. (2) Training objectives. c, Visualization of attention heads from BEiT, BEiTv2, DINO, DINOv2, DINOv3, and MAE in the Brain-like Space. d, Donut charts of matched-head type distribution with the seven canonical functional brain networks for representative base models from BEiT, BEiTv2, DINO, DINOv3. e, Layer-wise bar plots of matched-head types for representative models of DINOv2 and MAE. (3) Distillation of CNN teachers. f, Visualization of attention heads from DeiT and ViT-orig in the Brain-like Space. g, Line plot showing the proportion of matched heads relative to total heads in DeiT models across different scales. h, Layer-wise bar plots of matched-head types for DeiT models at varying scales.
  • Figure 4: Influence of positional encoding scheme on the brain-like distribution of LLMs and LMMs.
  • Figure 5: a, Visualization of attention heads from ten representative single-modality LLM series, including GPT-2, BERT, DeepSeek, Gemma, LLaMA, Mistral, Moonlight, OLMo, Qwen, and SmolLM in the Brain-like Space. A zoomed view highlights detailed local distributions. b, Donut chart of matched-head type distribution with the seven canonical functional brain networks for representative models from the ten LLM series. c, Visualization of attention heads from six representative multimodal LMM series, including CLIP, BLIP, DeepSeek, Gemma, Kimi, and Qwen in the Brain-like Space. A zoomed view highlights detailed local distributions. d, Donut chart of matched-head type distribution with the seven canonical functional brain networks for representative models from the four categories: LMM-vision, LMM-language, LMM-vision-RoPE, and LMM-language-RoPE.
  • ...and 2 more figures
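
The cluster-count selection described in the Figure 1b caption (Silhouette coefficient as a function of the number of clusters, supporting k=4) could be reproduced along the following lines. This is a hedged sketch only: the excerpt does not name the clustering algorithm, so k-means from scikit-learn is assumed here, and the four synthetic groups of 7-dimensional head profiles are invented purely for illustration. The helper name `silhouette_by_k` is hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def silhouette_by_k(profiles, k_values):
    """Map each candidate cluster count k to its mean Silhouette coefficient."""
    scores = {}
    for k in k_values:
        # Assumption: k-means; the source does not specify the clustering method.
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(profiles)
        scores[k] = silhouette_score(profiles, labels)
    return scores

# Synthetic stand-in for head profiles: four tight groups with increasing
# overall similarity to the seven networks (illustrative values only).
rng = np.random.default_rng(0)
levels = [0.1, 0.3, 0.5, 0.7]
profiles = np.vstack(
    [level + 0.01 * rng.standard_normal((50, 7)) for level in levels]
)

scores = silhouette_by_k(profiles, range(2, 9))
best_k = max(scores, key=scores.get)  # cluster count with the highest coefficient
```

On real attention-head profiles the curve is usually less clear-cut than on this toy data, so the coefficient is best read alongside the cluster centroids rather than as the sole criterion.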