Mixture of Cognitive Reasoners: Modular Reasoning with Brain-Like Specialization
Badr AlKhamissi, C. Nicolò De Sabbata, Greta Tuckute, Zeming Chen, Martin Schrimpf, Antoine Bosselut
TL;DR
MiCRo introduces a brain-inspired modular transformer that partitions every layer into four expert modules corresponding to language, logic, social reasoning, and world knowledge. A three-stage curriculum (expert-focused pretraining, then router calibration, then end-to-end instruction finetuning) induces specialization, which enables both interpretability and inference-time steering. Empirical results show interpretable routing patterns, causal ablations that confirm each expert's functional contribution, and strong alignment with human behavior on CogBench, alongside competitive performance on reasoning benchmarks. Overall, MiCRo demonstrates that cognitively grounded modularity yields more transparent, steerable, and human-aligned language models without sacrificing performance, suggesting a scalable path toward brain-aligned AI systems.
Abstract
Human cognitive behavior arises from the interaction of specialized brain networks dedicated to distinct functions, such as language, logic, and social reasoning. Inspired by this organization, we propose Mixture of Cognitive Reasoners (MiCRo): a modular, transformer-based architecture post-trained with a curriculum that induces functional specialization across experts. Concretely, we partition the layers of a pretrained language model into four expert modules aligned with well-studied cognitive networks in the human brain. MiCRo offers three key advantages over standard language models. (1) The specialized experts are interpretable and causally meaningful -- ablating a module causes substantial drops on benchmarks requiring its specialized domain. (2) MiCRo's behavior can be dynamically steered at inference time by routing tokens to particular experts (e.g., favoring social over logical reasoning), enabling fine-grained control over outputs. (3) MiCRo outperforms or matches comparable baselines on both machine-learning reasoning benchmarks (e.g., GSM8K, BBH) and alignment to human behavior (CogBench), while maintaining interpretability. Taken together, cognitively grounded functional specialization yields models that are both more human-like and more human-interpretable.
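To make the ablation and steering ideas in points (1) and (2) concrete, the following is a minimal sketch of per-token expert selection, not the authors' implementation. It assumes top-1 routing over four experts, models ablation by masking an expert's router logit, and models steering as an additive bias on the logits. The names `EXPERTS`, `route`, `ablated`, and `steer` are hypothetical and chosen only for illustration.

```python
import math

# The four expert domains named in the text; the short labels are illustrative.
EXPERTS = ("language", "logic", "social", "world")


def softmax(logits):
    """Numerically stable softmax over a list of floats (may contain -inf)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, ablated=(), steer=None):
    """Pick one expert for a token from its router logits (assumed top-1 routing).

    ablated: expert names to exclude, simulating removal of a module.
    steer:   optional dict mapping expert name -> additive logit bias
             (a hypothetical inference-time steering knob).
    Returns (chosen_expert_name, routing_probabilities).
    """
    steer = steer or {}
    if set(EXPERTS) <= set(ablated):
        raise ValueError("at least one expert must remain active")
    adjusted = []
    for name, logit in zip(EXPERTS, router_logits):
        if name in ablated:
            adjusted.append(float("-inf"))  # masked expert gets probability 0
        else:
            adjusted.append(logit + steer.get(name, 0.0))
    probs = softmax(adjusted)
    best = max(range(len(EXPERTS)), key=probs.__getitem__)
    return EXPERTS[best], probs


# Made-up router logits for a single token, for illustration only.
logits = [0.2, 1.5, 0.9, -0.3]
default_choice, _ = route(logits)
ablated_choice, ablated_probs = route(logits, ablated=("logic",))
steered_choice, _ = route(logits, steer={"social": 1.0})
```

With these made-up logits, the token goes to the logic expert by default, falls back to the social expert once logic is ablated, and is also sent to the social expert when that expert's logit is biased upward. In a real model the chosen expert would then process the token's hidden state; that step is omitted here.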
