HyperSpace: Hypernetworks for spacing-adaptive image segmentation

Samuel Joutard, Maximilian Pietsch, Raphael Prevost

TL;DR

HyperSpace introduces spacing-adaptive hypernetworks to medical image segmentation, enabling models to process images at native voxel spacing without resampling. By predicting the U-Net parameters with a hypernetwork conditioned on spacing, the method achieves competitive accuracy across three 3D MRI datasets while reducing compute and memory demands. Activation-alignment analyses reveal a structured latent space linking networks across spacings, supporting the approach's robustness and adaptability. The work offers practical deployment benefits, including hardware-aware inference and potential extensions to more flexible architectural components.

Abstract

Medical images are often acquired in different settings, requiring harmonization to adapt to the operating point of algorithms. Specifically, to standardize the physical spacing of imaging voxels in heterogeneous inference settings, images are typically resampled before being processed by deep learning models. However, downsampling results in loss of information, whereas upsampling introduces redundant information leading to inefficient resource utilization. To overcome these issues, we propose to condition segmentation models on the voxel spacing using hypernetworks. Our approach allows processing images at their native resolutions or at resolutions adjusted to the hardware and time constraints at inference time. Our experiments across multiple datasets demonstrate that our approach achieves competitive performance compared to resolution-specific models, while offering greater flexibility for the end user. This also simplifies model development, deployment and maintenance. Our code is available at https://github.com/ImFusionGmbH/HyperSpace.
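To make the conditioning idea concrete, the following is a minimal, dependency-free sketch of a hypernetwork that maps a voxel spacing to the parameters of a small primary network. It is not the authors' implementation: the 1D two-layer primary network, the randomly initialised MLP, and all names (`make_hypernetwork`, `predict_params`, `dispatch`, `primary_forward`) are hypothetical and chosen only for illustration. The actual method uses a 3D U-Net and a trained hypernetwork.

```python
import math
import random

# Hypothetical primary network: two 1D conv layers, each with a 3-tap kernel
# and one bias. This stands in for the U-Net and is an assumption.
LAYERS = [("conv1", 3), ("conv2", 3)]
N_PARAMS = sum(k + 1 for _, k in LAYERS)


def make_hypernetwork(n_hidden=8, seed=0):
    """Randomly initialised 2-layer MLP: spacing (3 values) -> N_PARAMS values."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0, 0.5) for _ in range(3)] for _ in range(n_hidden)]
    w2 = [[rng.gauss(0, 0.5) for _ in range(n_hidden)] for _ in range(N_PARAMS)]
    return w1, w2


def predict_params(hyper, spacing):
    """Map a voxel spacing (mm per axis) to a flat parameter vector."""
    w1, w2 = hyper
    hidden = [math.tanh(sum(w * s for w, s in zip(row, spacing))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def dispatch(flat):
    """Split the flat vector into per-layer (kernel, bias) pairs."""
    params, i = {}, 0
    for name, k in LAYERS:
        params[name] = (flat[i:i + k], flat[i + k])
        i += k + 1
    return params


def conv1d(signal, kernel, bias):
    """Zero-padded 'same' 1D convolution (cross-correlation form)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel)) + bias
            for i in range(len(signal))]


def primary_forward(signal, params):
    """Apply conv1 -> ReLU -> conv2 using the predicted parameters."""
    x = conv1d(signal, *params["conv1"])
    x = [max(0.0, v) for v in x]
    return conv1d(x, *params["conv2"])


hyper = make_hypernetwork()
signal = [0.0, 1.0, 2.0, 1.0, 0.0]

# The same hypernetwork yields different primary-network weights per spacing.
fine = dispatch(predict_params(hyper, (1.0, 1.0, 1.0)))
coarse = dispatch(predict_params(hyper, (2.0, 2.0, 2.0)))
out_fine = primary_forward(signal, fine)
out_coarse = primary_forward(signal, coarse)
```

In this sketch only the hypernetwork holds learnable parameters; the primary network is re-instantiated for each spacing, which is what lets a single trained model serve images at different resolutions without resampling.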

Paper Structure

This paper contains 10 sections, 1 equation, 8 figures, and 2 tables.

Figures (8)

  • Figure 1: Illustration of the proposed framework. The hypernetwork $H_{\beta}$ predicts the primary network's weights $\eta$ from the image spacing. These weights (and biases) are then dispatched to their corresponding layers in the U-Net that performs the segmentation.
  • Figure 2: Mean Dice score for all 3 datasets. On the bottom right, the inference runtime and peak GPU memory usage on the MM-WHS dataset are reported. The blue-shaded region corresponds to the expected resolution range; the pink-shaded region corresponds to resolutions not seen during training. The purple vertical line indicates the resolution at which FS and FSNR were trained.
  • Figure 3: CKA analysis across convolution and nonlinearity layers, within resolution-specific U-Net networks (a,b) and between a 1 mm network and a coarser-resolution network generated from the same hypernetwork (e,f). Plots (c,g) show the rate of change of CKA with spacing for a linear model $\mathrm{CKA} \sim \mathrm{spacing}$, for the within- (c) and inter-resolution-specific U-Net scores (g). For comparison, (h) shows the CKA for two U-Nets with identical spacing but from different hypernetworks. The distance between layers is shown in the lower triangle of (d), with lines indicating skip connections between U-Net branches; the upper triangle shows areas of identical spatial feature dimensions.
  • Figure 4: Inference runtime and peak GPU memory usage on all 3 datasets.
  • Figure 5: Mean Dice score for all 3 datasets on other resolution segments.
  • ...and 3 more figures
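
The CKA (centered kernel alignment) scores referenced in Figure 3 compare activations of two layers or two networks on the same inputs. As a rough illustration, the sketch below computes the linear variant of CKA on small activation matrices (rows are samples, columns are features). Which CKA variant the paper uses is not stated in this excerpt, so the linear form, the function names, and the toy matrices are all assumptions.

```python
import math


def center_columns(m):
    """Subtract the per-feature (column) mean from a row-major matrix."""
    n = len(m)
    means = [sum(col) / n for col in zip(*m)]
    return [[v - mu for v, mu in zip(row, means)] for row in m]


def cross_sq_norm(a, b):
    """Squared Frobenius norm of a^T b for matrices with equal row counts."""
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            dot = sum(x * y for x, y in zip(col_a, col_b))
            total += dot * dot
    return total


def linear_cka(x, y):
    """Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on centered data."""
    x, y = center_columns(x), center_columns(y)
    denom = math.sqrt(cross_sq_norm(x, x) * cross_sq_norm(y, y))
    return cross_sq_norm(x, y) / denom


# Toy activations: 4 samples, 2 features (made-up values for illustration).
acts = [[1.0, 2.0], [3.0, 1.0], [0.0, 4.0], [2.0, 2.0]]
scaled = [[3.0 * v for v in row] for row in acts]   # isotropic rescaling
rotated = [[b, -a] for a, b in acts]                # orthogonal transform
other = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

cka_self = linear_cka(acts, acts)
cka_scaled = linear_cka(acts, scaled)
cka_rotated = linear_cka(acts, rotated)
cka_other = linear_cka(acts, other)
```

Linear CKA is invariant to isotropic scaling and orthogonal transformations of the features, so `cka_scaled` and `cka_rotated` match `cka_self`, while unrelated activations such as `other` score lower.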