Segmenting Fetal Head with Efficient Fine-tuning Strategies in Low-resource Settings: an empirical study with U-Net
Fangyijie Wang, Guénolé Silvestre, Kathleen M. Curran
TL;DR
This study addresses the challenge of accurately segmenting the fetal head in ultrasound images for head circumference estimation in low-resource settings. It evaluates a U-Net model with a lightweight MobileNet v2 encoder and a range of fine-tuning strategies, showing that decoder-focused fine-tuning (FT_Decoder) yields superior segmentation with far fewer trainable parameters. Across diverse datasets from high- and low-resource contexts, FT_Decoder demonstrates strong generalization and transferability, outperforming training from scratch and encoder-focused approaches. The work provides practical guidance for efficient fetal head segmentation and releases code and fine-tuned weights to facilitate adoption in resource-constrained environments.
Abstract
Accurate measurement of fetal head circumference is crucial for estimating fetal growth during routine prenatal screening. Prior to measurement, the region of interest, specifically the fetal head, must be accurately identified and segmented in ultrasound images. Recent deep learning techniques have made significant progress in segmenting the fetal head using encoder-decoder models. Among these models, U-Net has become a standard approach for accurate segmentation. However, training an encoder-decoder model can be time-consuming and demands substantial computational resources. Moreover, fine-tuning these models is particularly challenging when only a limited amount of data is available. There are still no "best-practice" guidelines for optimal fine-tuning of U-Net for fetal ultrasound image segmentation. This work summarizes existing fine-tuning strategies across various backbone architectures and model components, using ultrasound data from the Netherlands, Spain, Malawi, Egypt, and Algeria. Our study shows that (1) fine-tuning U-Net leads to better performance than training from scratch, (2) fine-tuning the decoder is superior to other strategies, and (3) network architectures with fewer parameters can achieve similar or better performance. We also demonstrate the effectiveness of fine-tuning strategies in low-resource settings and further extend our experiments to few-shot learning. Lastly, we have publicly released our code and specific fine-tuned weights.
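To make the idea of decoder-focused fine-tuning concrete, the following minimal, framework-free sketch shows how a strategy might decide which parameter groups are updated and how that changes the trainable parameter count. It is an illustration under stated assumptions, not the released implementation: the `ParamGroup` class, the `toy_unet` layout, and all sizes are made up for the example, and the strategy labels `full` and `FT_Encoder` are hypothetical names (only `FT_Decoder` comes from the text).

```python
from dataclasses import dataclass


@dataclass
class ParamGroup:
    name: str               # dotted name, prefixed with "encoder." or "decoder."
    size: int               # number of scalar weights (toy values, not from the paper)
    trainable: bool = True  # whether the optimizer would update this group


def apply_strategy(groups, strategy):
    """Mark which parameter groups are updated under a fine-tuning strategy."""
    # Hypothetical strategy labels mapped to the name prefixes that stay trainable.
    prefixes = {
        "full": ("encoder.", "decoder."),
        "FT_Decoder": ("decoder.",),   # encoder frozen, decoder updated
        "FT_Encoder": ("encoder.",),   # decoder frozen, encoder updated
    }[strategy]
    for g in groups:
        g.trainable = g.name.startswith(prefixes)
    return groups


def trainable_count(groups):
    """Total number of weights that would receive gradient updates."""
    return sum(g.size for g in groups if g.trainable)


def toy_unet():
    """A made-up encoder-decoder layout used only for illustration."""
    return [
        ParamGroup("encoder.block1", 1000),
        ParamGroup("encoder.block2", 4000),
        ParamGroup("decoder.up1", 1500),
        ParamGroup("decoder.head", 50),
    ]


if __name__ == "__main__":
    for strategy in ("full", "FT_Decoder", "FT_Encoder"):
        print(strategy, trainable_count(apply_strategy(toy_unet(), strategy)))
```

In a PyTorch model, the same effect is typically obtained by setting `requires_grad = False` on the encoder's parameters and passing only the remaining parameters to the optimizer.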
