A Small Math Model: Recasting Strategy Choice Theory in an LLM-Inspired Architecture
Roussel Rahman, Jeff Shrager
TL;DR
The paper reframes Strategy Choice Theory (SCT) as a Small Math Model (SMM), an LLM-inspired neural architecture for studying how children adaptively select arithmetic strategies. It introduces number embeddings, a gated attention mechanism, and a Gaussian curriculum to train counting and addition, with confidence-based switching between strategies. The key findings are that counting can scaffold the learning of addition, that the timing of introduction causes transient interference that dissipates, and that biases decrease with experience, all in line with SCT predictions. The work offers a mechanistic, interpretable platform for exploring number sense and symbolic reasoning in AI, with the potential to scale to broader numerical tasks and to support adaptive strategy discovery.
Abstract
Strategy Choice Theory (SCT; Siegler and Shrager, 1984; Siegler, 2000) explains important aspects of children's arithmetic learning based upon principles including learning from developmentally naturalistic data, probabilistic representation, confidence-based retrieval, and the phase-like importance of scaffolding strategies, such as finger-counting. Here we recast SCT as a "Small Math Model" (SMM), employing a neural-network-based architecture analogous to LLMs. The SMM extends SCT to include counting practice, symbol (number) embedding, and gated attention. Similar to earlier work, the SMM demonstrates constructive and destructive interference between counting and addition, and the "wave-like" use of finger-counting as sum recall improves. We plan to extend the SMM to later aspects of the decades-long SCT program, including adaptive strategy choice and eventually strategy discovery, providing a unified platform to investigate the understanding of numerical characteristics and relationships essential for mathematical reasoning, as it can emerge in LLM-based agents.
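To make the idea of confidence-based retrieval with a counting fallback concrete, the following is a minimal, table-based sketch in the spirit of SCT. It is not the SMM's neural architecture and is not taken from the paper: the function names (`choose_strategy`, `count_up`, `solve`), the `threshold` and `increment` parameters, and the deterministic "strongest answer" rule are all illustrative assumptions. In particular, a fuller SCT-style model would sample answers probabilistically rather than always taking the strongest one.

```python
# Illustrative sketch only: a table-based stand-in for confidence-based
# strategy choice. All names and parameter values are assumptions.

def choose_strategy(associations, threshold):
    """Pick retrieval if the strongest answer is confident enough.

    associations: dict mapping candidate answer -> nonnegative strength.
    Returns ("retrieval", answer) or ("counting", None).
    """
    total = sum(associations.values())
    if total > 0:
        best = max(associations, key=associations.get)
        confidence = associations[best] / total  # share of total strength
        if confidence >= threshold:
            return "retrieval", best
    return "counting", None


def count_up(a, b):
    """Backup strategy: start at a and count up b times (like finger-counting)."""
    total = a
    for _ in range(b):
        total += 1
    return total


def solve(a, b, memory, threshold=0.6, increment=1.0):
    """Solve a + b, then strengthen the association for the answer given."""
    associations = memory.setdefault((a, b), {})
    strategy, answer = choose_strategy(associations, threshold)
    if strategy == "counting":
        answer = count_up(a, b)
    # Every produced answer, right or wrong, reinforces its association.
    associations[answer] = associations.get(answer, 0.0) + increment
    return strategy, answer


if __name__ == "__main__":
    memory = {}
    print(solve(2, 3, memory))  # no associations yet, so counting is used
    print(solve(2, 3, memory))  # the stored association now supports retrieval
```

In this sketch, repeated counting gradually concentrates association strength on the correct sum, so retrieval takes over once confidence passes the threshold. That is one simple way to picture counting acting as a scaffold for sum recall.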
