Brain-Like Processing Pathways Form in Models With Heterogeneous Experts

Jack Cook; Danyal Akarca; Rui Ponte Costa; Jascha Achterberg

TL;DR

This work asks how brain-like processing pathways emerge among heterogeneous neural regions. It extends the Heterogeneous Mixture-of-Experts (HMoE) architecture with three inductive biases: a routing cost that penalizes the use of large experts, a scaling of that cost by task performance, and random expert dropout. These biases yield a Mixture-of-Pathways (MoP) model whose pathways are stable, self-sufficient, and task-discriminative, and whose dynamics resemble cortical-subcortical interactions observed in the brain, including engagement of the multiple-demand system during difficult tasks. Evaluated on 82 Mod-Cog time-series cognitive tasks, the MoP model shows difficulty-dependent pathway usage, dynamic reallocation of pathways during learning, and brain-like learning trajectories, which distinguish it from the baseline HMoE. The findings offer a framework for neuroscience investigations into pathway formation and a pathway-aware, resource-efficient approach for future MoE-based machine learning systems.

Abstract

The brain is made up of a vast set of heterogeneous regions that dynamically organize into pathways as a function of task demands. Examples of such pathways can be found in the interactions between cortical and subcortical networks during learning, or in sub-networks specializing for task characteristics such as difficulty or modality. Despite the large role these pathways play in cognition, the mechanisms through which brain regions organize into pathways remain unclear. In this work, we use an extension of the Heterogeneous Mixture-of-Experts architecture to show that heterogeneous regions do not form processing pathways by themselves, implying that the brain likely implements specific constraints which result in the reliable formation of pathways. We identify three biologically relevant inductive biases that encourage pathway formation: a routing cost imposed on the use of more complex regions, a scaling factor that reduces this cost when task performance is low, and randomized expert dropout. When comparing our resulting Mixture-of-Pathways model with the brain, we observe that the artificial pathways in our model match how the brain uses cortical and subcortical systems to learn and solve tasks of varying difficulty. In summary, we introduce a novel framework for investigating how the brain forms task-specific pathways through inductive biases, and the effects these biases have on the behavior of Mixture-of-Experts models.
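
The three inductive biases named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (routing_weights, routing_cost), the use of expert size as a proxy for complexity, and the specific scaling form cost / (1 + task_loss) are all assumptions made here for illustration, since the abstract does not give the exact formulas.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax; entries of -inf receive weight 0."""
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def routing_weights(router_logits, rng, dropout_p=0.0):
    """Softmax routing over experts with random expert dropout.

    Hypothetical form: each expert is dropped independently with
    probability dropout_p; at least one expert is always kept.
    """
    n_experts = router_logits.shape[-1]
    keep = rng.random(n_experts) >= dropout_p
    if not keep.any():
        keep[rng.integers(n_experts)] = True
    masked = np.where(keep, router_logits, -np.inf)
    return softmax(masked, axis=-1)


def routing_cost(weights, expert_sizes, task_loss):
    """Penalty on routing through larger experts.

    Assumed form: the expected expert size under the routing weights,
    divided by (1 + task_loss) so the penalty shrinks when task
    performance is low (i.e. when the loss is high).
    """
    expected_size = (weights * expert_sizes).sum(axis=-1).mean()
    return expected_size / (1.0 + task_loss)


rng = np.random.default_rng(0)
expert_sizes = np.array([1.0, 2.0, 3.0])   # hypothetical relative sizes
logits = np.zeros((4, 3))                  # batch of 4, uniform routing

w = routing_weights(logits, rng, dropout_p=0.0)
cost_good = routing_cost(w, expert_sizes, task_loss=0.0)  # task solved
cost_poor = routing_cost(w, expert_sizes, task_loss=1.0)  # task unsolved
```

In this sketch the penalty would be added to the task loss during training, so a router is pushed toward small experts once a task is solved but remains free to recruit large experts while performance is still poor.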
