DCL-SE: Dynamic Curriculum Learning for Spatiotemporal Encoding of Brain Imaging
Meihua Zhou, Xinyu Tong, Jiarui Zhao, Min Cheng, Li Yang, Lei Tian, Nan Wan
TL;DR
This work tackles the challenge of extracting clinically actionable insights from high-dimensional brain imaging data when labeled samples are limited. It introduces DCL-SE, a two-stage encoding-decoding framework built around data-driven spatiotemporal encoding (DaSE). The encoding stage uses Approximate Rank Pooling (ARP) to convert 3D MRI volumes into compact 2D dynamic representations. The decoding stage applies Dynamic Curriculum Learning (DCL), guided by a Dynamic Group Mechanism (DGM), to progressively refine features from global anatomy to subtle pathology. The approach yields strong accuracy, robustness, and interpretability across classification, segmentation, and brain-age prediction tasks, outperforming many 2D/3D baselines and large foundation models in data-limited clinical settings. By bridging 3D volumetric data with efficient 2D networks and providing interpretable progressive decoding, DCL-SE shows practical potential for scalable, privacy-conscious neuroimaging analysis and points toward integrating lightweight, task-specific models with large-scale pretrained systems. The findings highlight the value of compact, adaptive architectures in the era of massive pretrained models and suggest broader applicability to other medical imaging domains.
Abstract
High-dimensional neuroimaging analyses for clinical diagnosis are often constrained by compromises in spatiotemporal fidelity and by the limited adaptability of large-scale, general-purpose models. To address these challenges, we introduce Dynamic Curriculum Learning for Spatiotemporal Encoding (DCL-SE), an end-to-end framework centered on data-driven spatiotemporal encoding (DaSE). We use Approximate Rank Pooling (ARP) to efficiently encode three-dimensional volumetric brain data into information-rich, two-dimensional dynamic representations. We then employ a dynamic curriculum learning strategy, guided by a Dynamic Group Mechanism (DGM), to progressively train the decoder, refining feature extraction from global anatomical structures to fine pathological details. Evaluated on six publicly available datasets spanning Alzheimer's disease and brain tumor classification, cerebral artery segmentation, and brain age prediction, DCL-SE consistently outperforms existing methods in accuracy, robustness, and interpretability. These findings underscore the importance of compact, task-specific architectures in the era of large-scale pretrained networks.
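To make the encoding step concrete, the following is a minimal sketch of approximate rank pooling applied to a volume, not the authors' implementation. It assumes the simplest linear-weight form of ARP, in which slice t of T receives the coefficient 2t - T - 1, and it assumes that the slices along one anatomical axis play the role of the ordered sequence. The function names `arp_coefficients` and `approximate_rank_pool`, the list-of-slices input layout, and the final min-max rescaling are illustrative choices that do not come from the abstract.

```python
def arp_coefficients(num_slices):
    """Linear ARP weights alpha_t = 2t - T - 1 for t = 1..T (assumed form).

    The weights are antisymmetric around the middle slice and sum to zero.
    """
    return [2 * t - num_slices - 1 for t in range(1, num_slices + 1)]


def approximate_rank_pool(volume):
    """Collapse a volume into one 2D map by a weighted sum over its slices.

    `volume` is a list of equally sized 2D slices (lists of rows), ordered
    along the axis treated as the sequence dimension (an assumption here).
    The result is min-max rescaled to [0, 1] for use as a 2D network input.
    """
    alphas = arp_coefficients(len(volume))
    rows, cols = len(volume[0]), len(volume[0][0])

    # Weighted sum of the slices, computed pixel by pixel.
    pooled = [
        [sum(a * s[r][c] for a, s in zip(alphas, volume)) for c in range(cols)]
        for r in range(rows)
    ]

    # Rescale to [0, 1]; a constant map is returned as all zeros.
    flat = [v for row in pooled for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0] * cols for _ in range(rows)]
    return [[(v - lo) / (hi - lo) for v in row] for row in pooled]


if __name__ == "__main__":
    # Toy volume: three 1x2 slices whose intensity grows along the slice axis.
    toy_volume = [[[0, 0]], [[1, 2]], [[2, 4]]]
    print(arp_coefficients(3))
    print(approximate_rank_pool(toy_volume))
```

The sketch is kept dependency-free for readability. Because the weights sum to zero, intensity that is constant across slices cancels out, so the pooled map emphasizes how the anatomy changes along the chosen axis rather than its average appearance.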
