Understanding the Countably Infinite: Neural Network Models of the Successor Function and its Acquisition

Vima Gupta, Sashank Varma

TL;DR

Problem: understanding how learners acquire the successor function and the countably infinite. Approach: compare two neural encodings, a count-list one-hot encoding and a place-value encoding aligned with number naming, each trained on $D:[0,98] \rightarrow [1,99]$ to learn $S(N)=N+1$. Key findings: the place-value model shows partial generalization ($\approx$24% test accuracy) and systematic latent structure that is sensitive to tens boundaries, while the count-list model fails to generalize; curriculum learning sharpens representations of smaller numbers as the training data expand. Significance: supports a mechanism by which place-value structure and boundary-aware representations facilitate understanding of the countably infinite, and motivates recurrence-based architectures to simulate counting processes.

Abstract

As children enter elementary school, their understanding of the ordinal structure of numbers transitions from a memorized count list of the first 50-100 numbers to knowing the successor function and understanding the countably infinite. We investigate this developmental change in two neural network models that learn the successor function on the pairs (N, N+1) for N in [0, 98]. The first uses a one-hot encoding of the input and output values and corresponds to children memorizing a count list, while the second model uses a place-value encoding and corresponds to children learning the language rules for naming numbers. The place-value model showed a predicted drop in representational similarity across tens boundaries. Counting across a tens boundary can be understood as a vector operation in 2D space, where the numbers with the same tens place are organized in a linearly separable manner, whereas those with the same ones place are grouped together. A curriculum learning simulation shows that, in the expanding numerical environment of the developing child, representations of smaller numbers continue to be sharpened even as larger numbers begin to be learned. These models set the stage for future work using recurrent architectures to move beyond learning the successor function to simulating the counting process more generally, and point towards a deeper understanding of what it means to understand the countably infinite.
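
To make the two input encodings concrete, the following is a minimal sketch, not the authors' implementation. It assumes the place-value encoding is a concatenation of two 10-dimensional one-hot vectors (tens digit, ones digit); the abstract does not specify the exact layout, and the helper names `one_hot`, `place_value`, and `cosine` are hypothetical. The paper's similarity analysis is on hidden-layer representations, whereas this sketch only compares the raw input vectors, which already shows why a tens boundary is special under a place-value code.

```python
import numpy as np

def one_hot(n, size=100):
    """Count-list style code: one unit per number (size is an assumption)."""
    v = np.zeros(size)
    v[n] = 1.0
    return v

def place_value(n):
    """Assumed place-value code: one-hot tens digit followed by one-hot ones digit."""
    tens, ones = divmod(n, 10)
    return np.concatenate([one_hot(tens, 10), one_hot(ones, 10)])

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Successor-function training pairs (N, N+1) for N = 0..98.
pairs = [(n, n + 1) for n in range(0, 99)]

# Input-level similarity between each number and its successor.
successive_similarity = {
    n: cosine(place_value(n), place_value(m)) for n, m in pairs
}

# Numbers where the similarity falls to zero: exactly the *9 -> *0 transitions.
boundary_drops = [n for n, s in successive_similarity.items() if s == 0.0]
print(boundary_drops)
```

Under these assumptions, successive numbers within a decade share their tens digit, so their input vectors overlap, while a pair such as 19 and 20 shares no active unit. Under the one-hot count-list code every pair of distinct numbers is orthogonal, so no such structure is available at the input.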

Paper Structure (9 sections, 9 figures)

Figures (9)

  • Figure 1: Accuracy plot for the count-list model.
  • Figure 2: Successive cosine similarities for the count-list model.
  • Figure 3: Accuracy plot for the place-value model.
  • Figure 4: Successive cosine similarity plot for the place-value model, showing a recurring pattern at the tens boundaries.
  • Figure 5: Hidden-layer representations of the numbers *9 and *0, reduced to 2 dimensions by MDS, for the count-list model, showing overlapping representations.
  • ...and 4 more figures