Manifolds and Modules: How Function Develops in a Neural Foundation Model
Johannes Bertram, Luciano Dyballa, T. Anderson Keller, Savik Kinger, Steven W. Zucker
TL;DR
The paper probes a state-of-the-art foundation model of neural activity by constructing decoding and encoding manifolds and tracking joint temporal trajectories across its encoder, recurrent, and readout modules. By comparing these internal representations to mouse visual cortex data, it finds that the recurrent module most strongly supports temporal discrimination, while the encoder shows limited temporal dynamics and the readout introduces rich variability via many feature maps. The authors demonstrate how manifold-based analyses can reveal brain-like structure and show where foundation models diverge from biology, suggesting architectural changes that could improve interpretability and biological plausibility without sacrificing predictive power. This approach advances the interpretability of complex foundation models and informs design choices for future neuro-inspired AI systems.
Abstract
Foundation models have shown remarkable success in fitting biological visual systems; however, their black-box nature inherently limits their utility for understanding brain function. Here, we peek inside a state-of-the-art (SOTA) foundation model of neural activity (Wang et al., 2025) as a physiologist might, characterizing each 'neuron' based on its temporal response properties to parametric stimuli. We analyze how different stimuli are represented in neural activity space by building decoding manifolds, and we analyze how different neurons are represented in stimulus-response space by building neural encoding manifolds. We find that the different processing stages of the model (i.e., the feedforward encoder, recurrent, and readout modules) each exhibit qualitatively different representational structures in these manifolds. The recurrent module shows a jump in capabilities over the encoder module by 'pushing apart' the representations of different temporal stimulus patterns, while the readout module achieves biological fidelity by using numerous specialized feature maps rather than biologically plausible mechanisms. Overall, we present this work as a study of the inner workings of a prominent neural foundation model, gaining insights into the biological relevance of its internals through the novel analysis of its neurons' joint temporal response patterns.
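
The distinction between the two manifold types can be illustrated with a minimal sketch. It is not the authors' pipeline: it uses plain PCA as a stand-in embedding, random synthetic data in place of model activations, and hypothetical names and sizes (`responses`, `pca_embed`, 50 neurons, 6 stimuli, 20 time bins). The only point is that the same response tensor is read in two ways: in the decoding view each point is a stimulus condition and the coordinates are neurons, while in the encoding view each point is a neuron and the coordinates are its responses across stimuli and time.

```python
import numpy as np


def pca_embed(X, n_components=2):
    """Project the rows of X onto their top principal components (via SVD)."""
    Xc = X - X.mean(axis=0, keepdims=True)  # center each feature column
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


# Synthetic stand-in for recorded activations (assumption: not real data).
rng = np.random.default_rng(0)
n_neurons, n_stimuli, n_time = 50, 6, 20
responses = rng.normal(size=(n_neurons, n_stimuli, n_time))

# Decoding view: one point per (stimulus, time bin); coordinates are neurons.
decoding_input = responses.reshape(n_neurons, n_stimuli * n_time).T
decoding_manifold = pca_embed(decoding_input)  # shape (120, 2)

# Encoding view: one point per neuron; coordinates are its responses
# across all stimuli and time bins.
encoding_input = responses.reshape(n_neurons, n_stimuli * n_time)
encoding_manifold = pca_embed(encoding_input)  # shape (50, 2)
```

In the decoding embedding, consecutive time bins of one stimulus trace out a trajectory, so well-separated trajectories indicate that the population distinguishes the stimuli. In the encoding embedding, nearby points are neurons with similar response profiles. A nonlinear embedding could be substituted for `pca_embed` without changing how the two input matrices are built.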
