Experimental Validation of Ultrasound Beamforming with End-to-End Deep Learning for Single Plane Wave Imaging

Ryan A. L. Schoop, Gijs Hendriks, Tristan van Leeuwen, Chris L. de Korte, Felix Lucka

TL;DR

This work tackles the quality gap in ultrafast plane-wave ultrasound by embedding a differentiable f-k migration image-formation layer into an end-to-end neural network. Using experimental data from breast-mimicking and calibration phantoms, the authors compare a complete data-to-image model against image-only and data-only variants, demonstrating robust improvements in global and local image quality with surprisingly little training data. The complete model delivers smoother, higher-contrast images and better lesion delineation across metrics, though axial resolution improvements are moderate and some artifact generation can occur in certain variants. Overall, the study provides a practical, physics-informed DL framework for high-frame-rate ultrasound with realistic benchmarking data and clear guidance for future enhancements.

Abstract

Ultrafast ultrasound imaging insonifies a medium with one or a combination of a few plane waves at different beam-steered angles instead of many focused waves. It can achieve much higher frame rates, but often at the cost of reduced image quality. Deep learning approaches have been proposed to mitigate this disadvantage, in particular for single plane wave imaging. Predominantly, image-to-image post-processing networks or fully learned data-to-image neural networks are used. Both construct their mapping in a purely data-driven way and require expressive networks and large amounts of training data to perform well. In contrast, we consider data-to-image networks that incorporate conventional image formation techniques as differentiable layers in the network architecture. This allows for end-to-end training with small amounts of training data. In this work, the use of f-k migration as an image formation layer is evaluated in depth with experimental data. We acquired a data collection designed for benchmarking data-driven plane wave imaging approaches using a realistic breast-mimicking phantom and an ultrasound calibration phantom. The evaluation considers global and local image similarity measures as well as contrast, resolution, and lesion detectability analyses. The results show that the proposed network architecture is capable of improving the image quality of single plane wave images on all evaluation metrics. Furthermore, these image quality improvements can be achieved with surprisingly small amounts of training data.
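The global image similarity measures mentioned in the abstract can be made concrete with a short example. The figure captions of the paper name four of them: the $\ell_{1}$ loss, the $\ell_{2}$ loss, the peak signal-to-noise ratio (PSNR) and the normalized cross correlation (NCC). The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: images are assumed to be scaled to $[0, 1]$, both losses are taken as means over all pixels, and NCC is computed in its zero-mean form. The function names are hypothetical, and the exact definitions used in the paper may differ.

```python
import numpy as np


def l1_loss(pred, target):
    """Mean absolute error over all pixels (assumed mean reduction)."""
    return float(np.mean(np.abs(pred - target)))


def l2_loss(pred, target):
    """Mean squared error over all pixels (assumed mean reduction)."""
    return float(np.mean((pred - target) ** 2))


def psnr(pred, target, data_range=1.0):
    """Peak signal-to-noise ratio in dB, assuming images in [0, data_range]."""
    mse = l2_loss(pred, target)
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))


def ncc(pred, target):
    """Zero-mean normalized cross correlation, a value in [-1, 1]."""
    p = pred - pred.mean()
    t = target - target.mean()
    denom = np.sqrt(np.sum(p ** 2) * np.sum(t ** 2))
    if denom == 0.0:
        return 0.0  # Convention chosen here for constant images.
    return float(np.sum(p * t) / denom)


# Usage with synthetic stand-ins for a model output and a compounded target.
rng = np.random.default_rng(0)
target_img = rng.random((128, 96))
output_img = np.clip(target_img + 0.05 * rng.standard_normal((128, 96)), 0.0, 1.0)
metrics = {
    "l1": l1_loss(output_img, target_img),
    "l2": l2_loss(output_img, target_img),
    "psnr_db": psnr(output_img, target_img),
    "ncc": ncc(output_img, target_img),
}
```

Lower $\ell_{1}$ and $\ell_{2}$ values and higher PSNR and NCC values indicate that an output is closer to the target. Note that the zero-mean NCC is insensitive to a constant brightness offset or a positive contrast scaling, which is why it is usually reported alongside the pixel-wise losses rather than in place of them.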

Paper Structure (17 sections, 2 equations, 9 figures)

Figures (9)

  • Figure 1: Input and target image of a test sample: (a) input data directly reconstructed by $f$-$k$ migration; (b) target image reconstructed by $f$-$k$ migration with 75 angles of compounding.
  • Figure 2: Overview of the proposed network architecture for the complete model. Leftmost is the raw channel data which serves as the input for the network. The first part of the network is a 2D residual network (ResNet) which acts as a pre-processing network on the raw channel data. The next step is the $f$-$k$ migration layer and the ultrasound image processing layer. The ultrasound processing layer consists of envelope detection, log compression and clipping. The final part of the network is another 2D ResNet, which acts as a post-processing layer on the image data. Finally the model output is obtained.
  • Figure 3: Graphical display of the ResNet used in the complete model depicted in Fig. 2. The ResNet consists of three residual blocks, which can be identified by the positioning of the skip connections. These residual blocks consist of 2D convolutional layers with weight standardization (WS) and group normalization (GN). The 2D convolutional layers use a kernel of size $5 \times 5$ with 64 channels per layer. Each layer except the last one uses a Rectified Linear Unit (ReLU) activation function. For the final activation, the pre-processing ResNet uses Tanh and the post-processing ResNet uses Sigmoid.
  • Figure 4: Results on the same test sample as Figure 1: (a) output complete model; (b) output post-processing model; (c) output pre-processing model. All networks were trained with 176 training samples.
  • Figure 5: Global image metrics calculated for the sample shown in Figure 4 as a function of the number of training samples used. Shown are mean values $\pm$ standard deviation for the 16 random repetitions of the training: (a) $\ell_{1}$ loss, (b) $\ell_{2}$ loss, (c) peak signal-to-noise ratio (PSNR), (d) normalized cross correlation (NCC).
  • ...and 4 more figures
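
The ultrasound image processing layer described in the Figure 2 caption (envelope detection, log compression and clipping) can be illustrated with a short example. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: the migrated image is assumed to be real-valued with depth along the first axis, the envelope is taken as the magnitude of the analytic signal, the 60 dB dynamic range is an illustrative choice, and the final rescaling to $[0, 1]$ is added for convenience. The function names are hypothetical.

```python
import numpy as np


def analytic_signal(rf, axis=0):
    """Analytic signal along `axis`, built with an FFT-based Hilbert transform."""
    n = rf.shape[axis]
    spectrum = np.fft.fft(rf, axis=axis)
    # One-sided filter: keep DC (and Nyquist), double positive frequencies,
    # and zero out negative frequencies.
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    shape = [1] * rf.ndim
    shape[axis] = n
    return np.fft.ifft(spectrum * h.reshape(shape), axis=axis)


def to_bmode(rf_image, dynamic_range_db=60.0, eps=1e-12):
    """Envelope detection, log compression and clipping of a migrated image.

    Assumes a real-valued image of shape (depth, lateral). The dynamic range
    and the rescaling to [0, 1] are illustrative assumptions.
    """
    # Envelope detection: magnitude of the analytic signal along depth.
    envelope = np.abs(analytic_signal(rf_image, axis=0))
    # Normalize so that the brightest pixel sits at 0 dB.
    envelope = envelope / (envelope.max() + eps)
    # Log compression to decibels.
    log_image = 20.0 * np.log10(envelope + eps)
    # Clipping to the displayed dynamic range, then rescaling to [0, 1].
    clipped = np.clip(log_image, -dynamic_range_db, 0.0)
    return (clipped + dynamic_range_db) / dynamic_range_db


# Usage with a synthetic image: three columns with decreasing echo amplitude.
depth = np.arange(256)
carrier = np.cos(2.0 * np.pi * 16.0 * depth / 256.0)
rf_demo = carrier[:, None] * np.array([1.0, 0.1, 1e-5])[None, :]
bmode_demo = to_bmode(rf_demo)
```

Every step in this sketch (FFT, magnitude, logarithm, clipping) has a counterpart in common automatic differentiation frameworks, which is what makes it possible to place such a layer between a pre-processing and a post-processing network and train the whole chain end to end. If the image formation step already returns a complex-valued image, the envelope is simply its magnitude and the Hilbert transform can be skipped.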