
Geometric Transformation Uncertainty for Improving 3D Fetal Brain Pose Prediction from Freehand 2D Ultrasound Videos

Jayroop Ramesh, Nicola K Dinsdale, the INTERGROWTH-21st Consortium, Pak-Hei Yeung, Ana IL Namburete

TL;DR

This work tackles the problem of localizing 2D fetal brain ultrasound planes within a 3D atlas under limited computational resources. It introduces QAERTS, an uncertainty-aware multi-head network that regresses 3D pose via multiple geometric transformations and jointly predicts per-head variances, optimized with a heteroscedastic Gaussian negative log-likelihood $\mathcal{L}_{GNLL}$. Empirically, QAERTS improves pose localization and sampled image quality (a 9% improvement in plane angle, PA, and an 8% improvement in normalized cross-correlation, NCC) while using roughly 5× fewer parameters than ensemble baselines, and it is more robust to noise in freehand scanning. The method holds practical value for equitable obstetric care in low- and middle-income country (LMIC) settings, where it can improve the reliability of automated US analysis with limited computational resources.

Abstract

Accurately localizing two-dimensional (2D) ultrasound (US) fetal brain images in the 3D brain, using minimal computational resources, is an important task for automated US analysis of fetal growth and development. We propose an uncertainty-aware deep learning model for automated 3D plane localization in 2D fetal brain images. Specifically, a multi-head network is trained to jointly regress 3D plane pose from 2D images in terms of different geometric transformations. The model explicitly learns to predict uncertainty, allocating higher weight to inputs with low variances across the different transformations to improve performance. Our proposed method, QAERTS, achieves higher pose estimation accuracy than the state of the art and most uncertainty-based approaches, with a 9% improvement in plane angle (PA) for localization accuracy and an 8% improvement in normalized cross-correlation (NCC) for sampled image quality. QAERTS is also efficient, containing 5$\times$ fewer parameters than an ensemble-based approach, which makes it advantageous in resource-constrained settings. In addition, QAERTS proves more robust to the noise effects observed in freehand US scanning by leveraging rotational discontinuities and explicit output uncertainties.
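
To make the role of the predicted variances concrete, the following is a minimal sketch of a heteroscedastic Gaussian negative log-likelihood in its standard textbook form, where each coordinate contributes $\frac{1}{2}\left(\log \sigma^2 + (y - \mu)^2 / \sigma^2\right)$. The paper's own equation is not reproduced on this page, so the exact form of $\mathcal{L}_{GNLL}$ used by QAERTS may differ; the function name `gaussian_nll`, the `eps` clamp, and the averaging over coordinates are assumptions made for illustration.

```python
import math


def gaussian_nll(pred_mean, pred_var, target, eps=1e-6):
    """Mean per-coordinate Gaussian NLL with predicted variances.

    Illustrative only: standard heteroscedastic form with the constant
    0.5*log(2*pi) dropped. Not taken from the QAERTS code base.
    """
    if not (len(pred_mean) == len(pred_var) == len(target)):
        raise ValueError("all inputs must have the same length")
    if not target:
        raise ValueError("inputs must be non-empty")
    total = 0.0
    for mu, var, y in zip(pred_mean, pred_var, target):
        var = max(var, eps)  # clamp to keep log() and the division stable
        total += 0.5 * (math.log(var) + (y - mu) ** 2 / var)
    return total / len(target)


if __name__ == "__main__":
    target = [1.0]
    wrong_prediction = [0.0]  # squared error of 1.0
    # A confident (low-variance) wrong prediction is penalized far more
    # than an uncertain (high-variance) one with the same error.
    print(gaussian_nll(wrong_prediction, [0.01], target))
    print(gaussian_nll(wrong_prediction, [1.0], target))
```

Under this form, a head that reports a small variance has its squared error amplified by $1/\sigma^2$, while the $\log \sigma^2$ term stops the network from inflating every variance. This is one way to read the abstract's statement that low-variance outputs receive higher weight.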

Paper Structure

This paper contains 7 sections, 1 equation, 3 figures, 1 table.

Figures (3)

  • Figure 1: Pipeline of our proposed work. During training, 2D slices sampled from aligned 3D volumes are augmented and used to train our proposed uncertainty-aware multi-head model with diverse parameterizations. The feature extractor (cyan) is composed of ten pairs of consecutive 2D convolutional blocks with instance normalization and rectified linear unit (ReLU) activations, followed by a max-pooling operation. The generated $\bm{z}$ embedding is flattened with an adaptive pooling operation and propagated through two fully connected layers, each followed by a ReLU activation, to a multi-head predictor (black). The trained network can be used to regress the parameters of five different geometric transformations (pink) and their resulting coordinate-wise predictive variances (blue), obtained through a set of independent fully connected layers with no activations, along with shared translation and scaling parameters (yellow). The averaged 3D poses and variances obtained from an arbitrary number of 2D images are then used to compute the loss function (green).
  • Figure 2: Examples from the test set for "high quality" and "low quality" predictions are provided in a)-b) and c)-d), respectively. Input frames (yellow) to the models, extracted from 3D volumes, are shown in the first column of a) and c). The first column of a) and c) shows ground-truth planes (green), and predicted planes (red) are visualized in 3D atlas space along the second to fifth columns of a) and c). Slices sampled (indicated by frame color) from the 3D atlas using the predicted and ground-truth plane locations for each model are shown along the second to fifth columns of b) and d).
  • Figure 3: Examples from freehand 2D US videos across three patients. The first column in a), b) and c) shows the frame from each video that was input to each model. The second to fourth columns in a), b) and c) show the corresponding slices sampled from the 3D atlas using the predicted plane locations. The pink, yellow and cyan arrows indicate the lateral ventricles, choroid plexus and Sylvian fissure, respectively.
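
The abstract reports normalized cross-correlation (NCC) as the measure of sampled image quality, that is, how closely a slice sampled from the atlas at a predicted plane matches a reference slice such as those shown in Figures 2 and 3. The sketch below computes the standard zero-mean NCC between two equally sized 2D images stored as lists of rows. The page does not state which NCC variant or preprocessing the paper uses, so the function name `ncc`, the input layout, and the choice to return 0 for a constant image are assumptions made for illustration.

```python
import math


def ncc(image_a, image_b):
    """Zero-mean normalized cross-correlation of two 2D images.

    Illustrative only: images are lists of rows of numbers. The result
    lies in [-1, 1]; 1 means identical up to brightness and contrast.
    """
    a = [float(v) for row in image_a for v in row]
    b = [float(v) for row in image_b for v in row]
    if not a or len(a) != len(b):
        raise ValueError("images must be non-empty and the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        # A constant image has no variance, so correlation is undefined;
        # returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


if __name__ == "__main__":
    reference = [[0, 1], [2, 3]]
    brighter = [[10, 12], [14, 16]]  # same pattern, different gain and offset
    inverted = [[3, 2], [1, 0]]
    print(ncc(reference, brighter))
    print(ncc(reference, inverted))
```

Because the means are subtracted and the result is normalized, a global brightness or contrast change between the input frame and the atlas slice leaves the score unchanged, which is a useful property when comparing ultrasound frames to atlas slices.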