Spectral Introspection Identifies Group Training Dynamics in Deep Neural Networks for Neuroimaging
Bradley T. Baker, Vince D. Calhoun, Sergey M. Plis
TL;DR
The paper addresses the challenge of interpreting training dynamics in deep neural networks for neuroimaging by introducing AutoSpec, a gradient-spectrum introspection framework that analyzes gradients during training via singular value decomposition. By leveraging reverse-mode automatic differentiation, AutoSpec enables on-the-fly, group-specific analyses of gradient spectra, facilitating comparisons across sample groups without disrupting training. The authors demonstrate the method on numerical datasets and neuroimaging data (including COBRE), showing that gradient spectra are task- and architecture-dependent and that group differences can manifest in specific layers and activations. While the approach reveals rich insights, it incurs substantial computational overhead, motivating future work on analytic updates and scalability to larger architectures. Overall, AutoSpec provides a practical tool for probing learning dynamics and group-specific effects in neuroimaging pipelines, with potential implications for bias detection and interpretability in clinical contexts.
Abstract
Neural networks, which have had a profound effect on how researchers study complex phenomena, do so through a complex, nonlinear mathematical structure that can be difficult for human researchers to interpret. This obstacle can be especially salient when researchers want to better understand the emergence of particular model behaviors such as bias, overfitting, overparametrization, and more. In neuroimaging, understanding how such phenomena emerge is fundamental to preventing the potential risks involved in practice and to informing users of them. In this work, we present a novel introspection framework for deep learning on neuroimaging data, which exploits the natural structure of gradient computations via the singular value decomposition of gradient components during reverse-mode auto-differentiation. Unlike post-hoc introspection techniques, which require fully trained models for evaluation, our method allows for the study of training dynamics on the fly and, even more interestingly, allows for the decomposition of gradients based on which samples belong to particular groups of interest. We demonstrate how the gradient spectra for several common deep learning models differ between schizophrenia and control participants from the COBRE study, and illustrate how these trajectories may reveal specific training dynamics helpful for further analysis.
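
As a rough illustration of the idea described in the abstract, the sketch below assumes a single linear layer with a squared-error loss, for which the weight gradient factors as the product of the layer inputs and the backpropagated errors. Restricting that product to the samples of one group gives a group-specific gradient whose singular values can be compared across groups. The function name `group_gradient_spectra`, the synthetic data, and the two group labels are hypothetical and chosen only for illustration; this is not the authors' AutoSpec implementation.

```python
import numpy as np


def group_gradient_spectra(acts, deltas, groups):
    """Singular values of the per-group weight gradient of one linear layer.

    acts:   (n_samples, n_in)  inputs to the layer
    deltas: (n_samples, n_out) backpropagated errors at the layer output
    groups: (n_samples,)       group label of each sample
    """
    spectra = {}
    for g in np.unique(groups):
        mask = groups == g
        # Gradient contribution from this group's samples only.
        grad_g = acts[mask].T @ deltas[mask]
        # Only the singular values are needed for the spectrum.
        spectra[g.item()] = np.linalg.svd(grad_g, compute_uv=False)
    return spectra


# Synthetic stand-in data (assumption: two groups, labelled 0 and 1).
rng = np.random.default_rng(0)
n, n_in, n_out = 64, 10, 4
X = rng.normal(size=(n, n_in))
Y = rng.normal(size=(n, n_out))
W = 0.1 * rng.normal(size=(n_in, n_out))
groups = rng.integers(0, 2, size=n)

# For L = 0.5 * ||XW - Y||^2, the error at the layer output is XW - Y.
deltas = X @ W - Y
full_grad = X.T @ deltas  # gradient over all samples
spectra = group_gradient_spectra(X, deltas, groups)

for g, s in spectra.items():
    print(f"group {g}: singular values {np.round(s, 3)}")
```

Because the full gradient is the sum of the per-group gradients, the spectra can be computed from quantities that backpropagation already produces, without a separate forward and backward pass per group. In a real training loop this computation would be repeated at each step and for each layer, which is one source of the computational overhead the TL;DR mentions.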
