Reducing self-supervised learning complexity improves weakly-supervised classification performance in computational pathology
Tim Lenz, Omar S. M. El Nahhas, Marta Ligero, Jakob Nikolas Kather
TL;DR
The paper addresses the heavy resource demands of self-supervised learning (SSL) in computational pathology. It systematically reduces SSL complexity through data-volume reductions, encoder-stage modifications, and new sampling strategies within MoCo-v3 on a Swin Transformer backbone. Key contributions include showing that 50% of the SSL data suffices for downstream gene mutation tasks, demonstrating gains from multi-stage feature fusion, and introducing dynamic and negative sampling strategies that outperform semantically relevant sampling, all while reducing training time by up to 90%. These findings enable effective foundation-model-style SSL for breast cancer histopathology on consumer hardware, broadening access and reducing costs for medical centers.
Abstract
Deep learning models have been successfully used to extract clinically actionable insights from routinely available histology data. Generally, these models require annotations performed by clinicians, which are scarce and costly to generate. The emergence of self-supervised learning (SSL) methods removes this barrier, allowing for large-scale analyses on non-annotated data. However, recent SSL approaches apply increasingly expansive model architectures and larger datasets, causing a rapid escalation of data volumes, hardware requirements, and overall expenses, and restricting access to these resources to a few institutions. Therefore, we investigated the complexity of contrastive SSL in computational pathology in relation to classification performance using consumer-grade hardware. Specifically, we analyzed the effects of adaptations in data volume, architecture, and algorithms on downstream classification tasks, emphasizing their impact on computational resources. We trained breast cancer foundation models on a large public patient cohort and validated them on various downstream classification tasks in a weakly supervised manner on two external public patient cohorts. Our experiments demonstrate that we can improve downstream classification performance whilst reducing SSL training duration by 90%. In summary, we propose a set of adaptations that enable the use of SSL in computational pathology in resource-constrained environments.
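
For readers unfamiliar with contrastive SSL, the following is a minimal sketch of the InfoNCE objective that contrastive methods such as MoCo-v3 are built around. It is not the authors' implementation: the function name `info_nce_loss`, the NumPy formulation, and the temperature value are illustrative assumptions, and the sketch omits the momentum encoder, loss symmetrization, and the sampling strategies studied in the paper.

```python
import numpy as np


def info_nce_loss(queries, keys, temperature=0.2):
    """Illustrative InfoNCE loss (hypothetical helper, not the paper's code).

    Row i of `queries` and row i of `keys` are embeddings of two augmented
    views of the same image tile (a positive pair); every other row in
    `keys` acts as a negative for query i.
    """
    # L2-normalize so that dot products are cosine similarities.
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)

    # Similarity of every query to every key, scaled by the temperature.
    logits = q @ k.T / temperature

    # Row-wise log-softmax, shifted by the row maximum for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive for query i sits on the diagonal (column i).
    return float(-np.mean(np.diag(log_probs)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.normal(size=(8, 16))          # 8 toy tile embeddings of dimension 16
    view = z + 0.01 * rng.normal(size=z.shape)  # slightly perturbed second view
    print("matched pairs:   ", info_nce_loss(z, view))
    print("mismatched pairs:", info_nce_loss(z, np.roll(view, 1, axis=0)))
```

The loss is small when each query is most similar to its own key and grows when positives are mismatched, which is the signal the encoder is trained on without any clinician annotations.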
