From Cognition to Computation: A Comparative Review of Human Attention and Transformer Architectures

Minglu Zhao; Dehong Xu; Tao Gao

TL;DR

The paper examines where human attention and Transformer attention align and where they diverge, focusing on capacity constraints, attentional pathways, and intentional control. It describes the Transformer's self-attention mechanism and its multi-head extension, with explicit formulas $\mathrm{Attention}(Q,K,V)=\mathrm{softmax}\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V$ and $\mathrm{MultiHead}(Q,K,V)=\mathrm{Concat}(\mathrm{head}_1,\ldots,\mathrm{head}_h)W^O$, where each head is $\mathrm{head}_i=\mathrm{Attention}(QW^Q_i,KW^K_i,VW^V_i)$. The authors present a structured comparative analysis across vision, language, and agency, identifying similarities in selective attention and contextual integration while highlighting important differences in resource limits and agency. They argue for interdisciplinary exploration to derive resource-aware, interpretable representations and potentially explicit agency mechanisms in AI.
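
The two formulas in the TL;DR can be made concrete with a small NumPy sketch. This is an illustrative sketch, not the authors' code: the function names (`softmax`, `attention`, `multi_head`), the toy dimensions, and the random projection matrices are all assumptions chosen for the example. It omits masking, batching, and learned parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the per-row maximum before exponentiating for numerical stability.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

def attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V, weights

def multi_head(Q, K, V, W_Q, W_K, W_V, W_O):
    # head_i = Attention(Q W^Q_i, K W^K_i, V W^V_i); heads are concatenated, then projected by W^O.
    heads = [
        attention(Q @ wq, K @ wk, V @ wv)[0]
        for wq, wk, wv in zip(W_Q, W_K, W_V)
    ]
    return np.concatenate(heads, axis=-1) @ W_O

# Toy setup (hypothetical sizes): 4 tokens, model width 8, 2 heads of width 4.
rng = np.random.default_rng(0)
n_tokens, d_model, n_heads = 4, 8, 2
d_head = d_model // n_heads
X = rng.normal(size=(n_tokens, d_model))
W_Q = rng.normal(size=(n_heads, d_model, d_head))
W_K = rng.normal(size=(n_heads, d_model, d_head))
W_V = rng.normal(size=(n_heads, d_model, d_head))
W_O = rng.normal(size=(n_heads * d_head, d_model))

# Self-attention: queries, keys, and values all come from the same sequence X.
out = multi_head(X, X, X, W_Q, W_K, W_V, W_O)
_, weights = attention(X, X, X)
```

Each row of `weights` is a probability distribution over all tokens, so every token receives some weight. That property bears on the capacity-constraint comparison the review draws with human attention.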

Abstract

Attention is a cornerstone of human cognition that facilitates the efficient extraction of information in everyday life. Recent developments in artificial intelligence like the Transformer architecture also incorporate the idea of attention in model designs. However, despite the shared fundamental principle of selectively attending to information, human attention and the Transformer model display notable differences, particularly in their capacity constraints, attention pathways, and intentional mechanisms. Our review aims to provide a comparative analysis of these mechanisms from a cognitive-functional perspective, thereby shedding light on several open research questions. The exploration encourages interdisciplinary efforts to derive insights from human attention mechanisms in the pursuit of developing more generalized artificial intelligence.

Paper Structure (18 sections, 3 equations)

Introduction
Attention modeling and Transformer architecture
Self-attention mechanism
Multi-head attention
Comparative analysis of human attention and attention in Transformers
Similarities
Selective attention
Contextual understanding
Differences
Capacity constraints
Attention Pathways
Intentional nature
Potential Directions
Is emulating human capacity constraints beneficial?
How can models adopt a resource-rational approach from human attention?
...and 3 more sections
