Revisiting Data Scaling Law for Medical Segmentation

Yuetan Chu, Zhongyi Han, Gongning Luo, Xin Gao

TL;DR

The paper investigates how segmentation performance scales with training data in medical imaging across 15 tasks and 4 modalities, validating a power-law relationship using BCE loss with Res-UNet and Swin-UNet backbones. It leverages deformation-based augmentation rooted in topological principles to enhance data efficiency, comparing random elastic deformation (RED), registration-based augmentation (RegDA), and a generated deformation augmentation (GenDA). RegDA and GenDA accelerate convergence and reduce data requirements, with GenDA providing the strongest gains, even without added external data. The findings suggest topology-aware augmentations can break conventional scaling laws, enabling more efficient, lower-cost medical segmentation models, though limitations include 2D experiments and the need for validation in 3D and more diverse pathologies.

Abstract

The population loss of trained deep neural networks often exhibits power law scaling with the size of the training dataset, guiding significant performance advancements in deep learning applications. In this study, we focus on the scaling relationship with data size in the context of medical anatomical segmentation, a domain that remains underexplored. We analyze scaling laws for anatomical segmentation across 15 semantic tasks and 4 imaging modalities, demonstrating that larger datasets significantly improve segmentation performance, following similar scaling trends. Motivated by the topological isomorphism in images sharing anatomical structures, we evaluate the impact of deformation-guided augmentation strategies on data scaling laws, specifically random elastic deformation and registration-guided deformation. We also propose a novel, scalable image augmentation approach that generates diffeomorphic mappings from geodesic subspace based on image registration to introduce realistic deformation. Our experimental results demonstrate that both registered and generated deformation-based augmentation considerably enhance data utilization efficiency. The proposed generated deformation method notably achieves superior performance and accelerated convergence, surpassing standard power law scaling trends without requiring additional data. Overall, this work provides insights into the understanding of segmentation scalability and topological variation impact in medical imaging, thereby leading to more efficient model development with reduced annotation and computational costs.
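
The power-law relationship described in the abstract can be illustrated with a small fitting sketch. It assumes a pure power law, loss ≈ a · N^(-b), with no irreducible-loss offset, and it runs on synthetic numbers generated inside the snippet, not on results from the paper. The function name `fit_power_law` and the values of `true_a` and `true_b` are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_power_law(n_samples, losses):
    """Fit loss ~ a * n^(-b) by ordinary least squares in log-log space.

    Assumes a pure power law (no additive offset); returns (a, b).
    """
    log_n = np.log(np.asarray(n_samples, dtype=float))
    log_l = np.log(np.asarray(losses, dtype=float))
    # In log space the model is linear: log(loss) = log(a) - b * log(n).
    slope, intercept = np.polyfit(log_n, log_l, deg=1)
    return float(np.exp(intercept)), float(-slope)


# Synthetic example (hypothetical values, not measurements from the paper).
rng = np.random.default_rng(0)
sizes = np.array([50, 100, 200, 400, 800, 1600], dtype=float)
true_a, true_b = 2.0, 0.35
noise = np.exp(rng.normal(0.0, 0.02, sizes.size))  # small multiplicative noise
noisy_losses = true_a * sizes ** (-true_b) * noise

a_hat, b_hat = fit_power_law(sizes, noisy_losses)
print(f"fitted a = {a_hat:.3f}, fitted exponent b = {b_hat:.3f}")
```

On a log-log plot such a fit is a straight line with slope -b. An augmentation that "surpasses" the scaling trend would appear as measured losses falling below the line fitted to the baseline.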

Paper Structure

This paper contains 8 sections, 6 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Validation of data scaling law in medical segmentation tasks. We use the BCE loss as the predicted error, and employ both Res-UNet and Swin-UNet as the segmentation backbone.
  • Figure 2: The overview of RegDA and GenDA. (a) Workflow of deformation-based augmentation using registration. Deformation fields are generated by registering raw data with an external dataset, sampled from geodesic subspaces, and applied to transform the data and segmentation annotations. (b) Proposed transformation method based on generated deformation: geodesic subspaces are used to conditionally train a GAN to generate deformation fields, which are then applied to raw data and segmentation annotations for augmentation. This approach reduces reliance on external datasets.
  • Figure 3: Data scaling law comparison of different transformation techniques with the Res-UNet segmentation backbone. Comparison of BCE loss across various medical segmentation tasks for three methods: random elastic deformation (RED), registration-based deformation augmentation (RegDA), and the proposed generated deformation augmentation (GenDA).
  • Figure 4: Data scaling law comparison of different transformation techniques with the Swin-UNet segmentation backbone. Comparison of BCE loss across various medical segmentation tasks for three methods: random elastic deformation (RED), registration-based deformation augmentation (RegDA), and the proposed generated deformation augmentation (GenDA).
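
The captions above describe deformation fields being applied to both the raw data and the segmentation annotations. A minimal sketch of that shared step, using the random elastic deformation (RED) baseline as it is commonly implemented (Gaussian-smoothed random displacements), might look as follows. The function name `random_elastic_deform` and the parameters `alpha` and `sigma` are assumptions for illustration and are not taken from the paper. RegDA and GenDA would obtain the displacement field from registration or from a generator instead of from random noise.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates


def random_elastic_deform(image, mask, alpha=30.0, sigma=4.0, seed=None):
    """Warp a 2D image and its label mask with one shared random smooth field.

    alpha scales the displacement magnitude; sigma controls its smoothness.
    Both defaults are illustrative, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape
    # Smooth uniform noise into a spatially coherent displacement field.
    dy = gaussian_filter(rng.uniform(-1.0, 1.0, (h, w)), sigma) * alpha
    dx = gaussian_filter(rng.uniform(-1.0, 1.0, (h, w)), sigma) * alpha
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.array([yy + dy, xx + dx])
    # Linear interpolation for intensities; nearest neighbour for labels,
    # so the warped mask contains only the original label values.
    warped_image = map_coordinates(image, coords, order=1, mode="reflect")
    warped_mask = map_coordinates(mask, coords, order=0, mode="reflect")
    return warped_image, warped_mask


# Toy example: a random image with a square "organ" mask.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
mask = np.zeros((64, 64), dtype=np.uint8)
mask[16:48, 16:48] = 1
warped_image, warped_mask = random_elastic_deform(image, mask, seed=1)
```

Using the same coordinate grid for the image and the mask keeps the annotation aligned with the deformed anatomy. Nearest-neighbour sampling on the mask avoids creating fractional label values.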