A Neuroscience-Inspired Dual-Process Model of Compositional Generalization
Alex Noviello, Claas Beger, Jacob Groner, Kevin Ellis, Weinan Sun
TL;DR
This work addresses systematic compositional generalization in neural networks. It introduces Mirage, a dual-process architecture that pairs a fast, meta-trained Neural Decomposer (System 1) with an explicit Schema Engine (System 2), which extracts and applies prioritized schemas in an iterative refinement loop. On SCAN, Mirage achieves $>$99\% accuracy across all splits, significantly surpassing monolithic transformers and highlighting the importance of separating procedural decomposition from declarative knowledge. The study offers a concrete, interpretable processing account of how compositional reasoning can arise from a modular architecture, and the approach may extend to other flexible reasoning tasks.
Abstract
Deep learning models struggle with systematic compositional generalization, a hallmark of human cognition. We propose \textsc{Mirage}, a neuroscience-inspired dual-process model that offers a processing account for this ability. It combines a fast, intuitive ``System~1'' (a meta-trained Transformer) with a deliberate, rule-based ``System~2'' (a Schema Engine), mirroring the brain's neocortical and hippocampal--prefrontal circuits. Trained to perform general, single-step decomposition on a stream of random grammars, Mirage achieves $>$99\% accuracy on all splits of the SCAN benchmark in a task-agnostic setting. Ablations confirm that the model's systematic behavior emerges from the architectural interplay of its two systems, particularly its use of explicit, prioritized schemas and iterative refinement. In line with recent progress on recursive and recurrent Transformer approaches, Mirage preserves an iterative neural update while externalizing declarative control into an interpretable schema module. Our work provides a concrete computational model for interpreting how compositional reasoning can arise from a modular cognitive architecture.
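
To make the division of labor concrete, the following toy Python sketch illustrates the general idea of single-step decomposition driven by explicit, prioritized schemas on a few SCAN-style commands. It is not the authors' implementation: the names `SCHEMAS`, `PRIMITIVES`, `decompose_step`, and `interpret` are hypothetical, the hand-written `decompose_step` merely stands in for the meta-trained Transformer, recursion stands in for the iterative refinement loop, and the action names (`JUMP`, `LTURN`, ...) are simplified relative to the SCAN output vocabulary.

```python
from typing import Callable, Dict, List, Optional, Tuple

Actions = List[str]

# Hypothetical declarative store ("System 2"): keyword -> (priority, rule).
# Lower priority numbers are split off first, so conjunctions are resolved
# before repetitions, and repetitions before directions.
SCHEMAS: Dict[str, Tuple[int, Callable[[List[Actions]], Actions]]] = {
    "and":    (0, lambda parts: parts[0] + parts[1]),
    "after":  (0, lambda parts: parts[1] + parts[0]),
    "twice":  (1, lambda parts: parts[0] * 2),
    "thrice": (1, lambda parts: parts[0] * 3),
    "left":   (2, lambda parts: ["LTURN"] + parts[0]),
    "right":  (2, lambda parts: ["RTURN"] + parts[0]),
}

# Primitive commands; "turn" contributes no action of its own.
PRIMITIVES: Dict[str, Actions] = {
    "jump": ["JUMP"], "walk": ["WALK"], "run": ["RUN"],
    "look": ["LOOK"], "turn": [],
}


def decompose_step(tokens: List[str]) -> Optional[Tuple[str, List[List[str]]]]:
    """Stand-in for "System 1": perform ONE decomposition step.

    Picks the keyword with the lowest priority number (leftmost on ties)
    and returns it with the non-empty sub-phrases on either side.
    Returns None when no schema keyword is present.
    """
    candidates = [(SCHEMAS[t][0], i) for i, t in enumerate(tokens) if t in SCHEMAS]
    if not candidates:
        return None
    _, i = min(candidates)
    parts = [p for p in (tokens[:i], tokens[i + 1:]) if p]
    return tokens[i], parts


def _solve(tokens: List[str]) -> Actions:
    step = decompose_step(tokens)
    if step is None:
        # Base case: the phrase must be a single known primitive.
        if len(tokens) != 1 or tokens[0] not in PRIMITIVES:
            raise ValueError(f"cannot interpret: {' '.join(tokens)!r}")
        return list(PRIMITIVES[tokens[0]])
    name, parts = step
    _, compose = SCHEMAS[name]
    # Solve each sub-phrase, then let the schema combine the results.
    return compose([_solve(p) for p in parts])


def interpret(command: str) -> Actions:
    """Map a SCAN-style command string to a list of action tokens."""
    return _solve(command.split())
```

For example, `interpret("jump twice after walk left")` first splits on `after`, then resolves `twice` and `left` inside the two sub-phrases independently. Keeping the rules in a plain table rather than in network weights is what makes the declarative side inspectable in a sketch like this; how Mirage actually extracts and prioritizes its schemas is described in the paper, not here.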
