Do Mice Grok? Glimpses of Hidden Progress During Overtraining in Sensory Cortex

Tanishq Kumar, Blake Bordelon, Cengiz Pehlevan, Venkatesh N. Murthy, Samuel J. Gershman

TL;DR

The study investigates whether task-specific cortical representations continue to refine after behavior saturates in a mouse odor-discrimination task. By reanalyzing posterior piriform cortex data, it shows continued separation of target and non-target representations and increasing classifier margins during overtraining, predicting improved generalization to held-out odors. A synthetic model and a biologically plausible variant reproduce grokking-like late-time learning and link margin maximization to observed neural dynamics, offering a mechanistic explanation for overtraining reversal. The work suggests late-time feature learning in sensory cortex and draws parallels to deep networks, with implications for understanding generalization and robustness under distribution shifts.

Abstract

Does learning of task-relevant representations stop when behavior stops changing? Motivated by recent theoretical advances in machine learning and the intuitive observation that human experts continue to learn from practice even after mastery, we hypothesize that task-specific representation learning can continue, even when behavior plateaus. In a novel reanalysis of recently published neural data, we find evidence for such learning in posterior piriform cortex of mice following continued training on a task, long after behavior saturates at near-ceiling performance ("overtraining"). This learning is marked by an increase in decoding accuracy from piriform neural populations and improved performance on held-out generalization tests. We demonstrate that class representations in cortex continue to separate during overtraining, so that examples that were incorrectly classified at the beginning of overtraining can abruptly be correctly classified later on, despite no changes in behavior during that time. We hypothesize this hidden yet rich learning takes the form of approximate margin maximization; we validate this and other predictions in the neural data, as well as build and interpret a simple synthetic model that recapitulates these phenomena. We conclude by showing how this model of late-time feature learning implies an explanation for the empirical puzzle of overtraining reversal in animal learning, where task-specific representations are more robust to particular task changes because the learned features can be reused.

Paper Structure

This paper contains 21 sections, 11 equations, 10 figures.

Figures (10)

  • Figure 1: (Left) Behavior of mice on a binary odor-discrimination task, where mice indicate their choice by licking left or right. The y-axis shows the fraction of left versus right licks on a given day. Day 8 is the first day of overtraining. The correct lick is left for non-target odors and right for target odors. The mice can discriminate near-perfectly as overtraining begins. Reproduced with modification from Berners-Lee et al. (2023). (Right) Increase in decoding accuracy from 10-fold linear discriminant analysis for each session. Shaded colored lines in the background show standard errors for each mouse.
  • Figure 2: Representational similarity matrices plotting average pairwise correlation between target and nontarget population responses from piriform cortex on the first (top) and last (bottom) days of overtraining. We drop the last few days of data for Mouse T due to inconsistencies in the data; methodological details are given in the methods appendix. We compare the separations (anticorrelation) above to those of random/ablated baselines in the ablations appendix, finding them significantly larger.
  • Figure 3: Neural representations (projected onto the top 2 principal components) of target and nontarget odors on the first (top) and last (bottom) day of overtraining for each mouse, showing qualitatively the separation during overtraining that is measured quantitatively in the representational similarity matrices in Figure 2. Cluster centers are plotted for ease of visual comparison.
  • Figure 4: The average distance from the decision boundary among the smallest 1% and 5% of distances, extracted from a linear support vector machine trained on population data from posterior piriform cortex (PPC). We normalize margins for each mouse so that data are comparable across mice. Shaded bands show standard errors.
  • Figure 5: Recapitulating and interpreting mouse piriform cortex dynamics in a simple model. Top left: training and test loss over the course of training. Top right: margin over the course of training. Bottom: projections onto the top 2 principal components for several training epochs. Notice how the target and probes (test trials) separate during overtraining (Epochs 1000-9000) despite the model never being trained on a probe example.
  • ...and 5 more figures
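
The margin statistic described in the Figure 4 caption (the mean distance to the decision boundary among the closest 1% and 5% of points) can be illustrated with a minimal Python sketch. This is not the authors' code. As a simplifying assumption, the boundary here is the perpendicular bisector of the two class means rather than a trained linear support vector machine, and the "sessions" are synthetic Gaussian clouds rather than piriform recordings. All function names and parameters (`distances_to_boundary`, `small_margin_mean`, `mean_difference_boundary`, `synthetic_session`, `session_margins`, `separation`) are hypothetical. Distances are unsigned, and the per-mouse normalization mentioned in the caption is omitted.

```python
import math
import random


def distances_to_boundary(points, w, b):
    """Unsigned Euclidean distance of each point to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm for x in points]


def small_margin_mean(distances, fraction):
    """Mean of the smallest `fraction` of the distances (always at least one point)."""
    k = max(1, int(len(distances) * fraction))
    return sum(sorted(distances)[:k]) / k


def mean_difference_boundary(class_a, class_b):
    """Stand-in linear boundary (NOT an SVM): perpendicular bisector of the class means."""
    dim = len(class_a[0])
    mu_a = [sum(x[d] for x in class_a) / len(class_a) for d in range(dim)]
    mu_b = [sum(x[d] for x in class_b) / len(class_b) for d in range(dim)]
    w = [a - c for a, c in zip(mu_a, mu_b)]
    midpoint = [(a + c) / 2 for a, c in zip(mu_a, mu_b)]
    b = -sum(wi * mi for wi, mi in zip(w, midpoint))
    return w, b


def synthetic_session(separation, n_per_class=200, dim=10, seed=0):
    """Two unit-variance Gaussian clouds whose centers differ along the first axis."""
    rng = random.Random(seed)

    def draw(center):
        return [
            [rng.gauss(center if d == 0 else 0.0, 1.0) for d in range(dim)]
            for _ in range(n_per_class)
        ]

    return draw(+separation / 2), draw(-separation / 2)


def session_margins(separation, seed=0):
    """Return (mean of closest 1%, mean of closest 5%) for one synthetic session."""
    target, nontarget = synthetic_session(separation, seed=seed)
    w, b = mean_difference_boundary(target, nontarget)
    dists = distances_to_boundary(target + nontarget, w, b)
    return small_margin_mean(dists, 0.01), small_margin_mean(dists, 0.05)


# Hypothetical "early" and "late" overtraining sessions: the late one has
# class clouds that sit farther apart, so its small-margin statistics grow.
early = session_margins(separation=2.0)
late = session_margins(separation=8.0)
print("early (1%, 5%):", early)
print("late  (1%, 5%):", late)
```

In this toy setting the statistic increases as the two clouds move apart, which is the qualitative pattern the caption reports across overtraining days. Reproducing the caption's procedure more closely would require replacing `mean_difference_boundary` with a fitted linear SVM and normalizing the margins within each mouse.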