
DDU-Net: A Domain Decomposition-Based CNN for High-Resolution Image Segmentation on Multiple GPUs

Corné Verburg, Alexander Heinlein, Eric C. Cyr

TL;DR

The model provides an effective solution for segmenting ultra-high-resolution images while preserving spatial context and achieves a 2-3% higher intersection over union (IoU) score compared to the same network without inter-patch communication.

Abstract

The segmentation of ultra-high-resolution images poses challenges such as loss of spatial information or computational inefficiency. In this work, a novel approach that combines encoder-decoder architectures with domain decomposition strategies to address these challenges is proposed. Specifically, a domain decomposition-based U-Net (DDU-Net) architecture is introduced, which partitions input images into non-overlapping patches that can be processed independently on separate devices. A communication network is added to facilitate inter-patch information exchange and thereby enhance the understanding of spatial context. Experimental validation is performed on a synthetic dataset that is designed to measure the effectiveness of the communication network. Then, the performance is tested on the DeepGlobe land cover classification dataset as a real-world benchmark. The results demonstrate that the approach, which includes inter-patch communication for images divided into $16\times16$ non-overlapping subimages, achieves a $2-3\,\%$ higher intersection over union (IoU) score compared to the same network without inter-patch communication. The performance of the network that includes communication is equivalent to that of a baseline U-Net trained on the full image, showing that our model provides an effective solution for segmenting ultra-high-resolution images while preserving spatial context. The code is available at https://github.com/corne00/DDU-Net.
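
The two ingredients the abstract relies on, a non-overlapping partition of the image and the IoU score, can be illustrated with a minimal pure-Python sketch. This is not taken from the DDU-Net repository: the helper names `split_into_patches` and `mean_iou` are hypothetical, a 2 x 2 grid stands in for the 16 x 16 grid for brevity, and the IoU is averaged over the classes that occur in either mask, which may differ from the averaging used in the paper.

```python
def split_into_patches(image, rows, cols):
    """Split a 2D list into rows x cols non-overlapping patches.

    Illustrative helper (not from the DDU-Net code). Assumes the image
    height and width are divisible by `rows` and `cols`.
    Returns a dict mapping (patch_row, patch_col) to a 2D list.
    """
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image size must be divisible by the patch grid")
    ph, pw = height // rows, width // cols
    patches = {}
    for r in range(rows):
        for c in range(cols):
            patches[(r, c)] = [
                line[c * pw:(c + 1) * pw]
                for line in image[r * ph:(r + 1) * ph]
            ]
    return patches


def mean_iou(pred, target, num_classes):
    """Mean intersection over union over classes present in pred or target."""
    ious = []
    for k in range(num_classes):
        inter = union = 0
        for pred_row, target_row in zip(pred, target):
            for p, t in zip(pred_row, target_row):
                inter += (p == k) and (t == k)
                union += (p == k) or (t == k)
        if union:  # skip classes absent from both masks
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy 4 x 4 label mask with three classes, and a prediction with one wrong pixel.
mask = [[0, 0, 1, 1],
        [0, 0, 1, 1],
        [2, 2, 1, 1],
        [2, 2, 1, 1]]
pred = [[0, 0, 1, 1],
        [0, 1, 1, 1],
        [2, 2, 1, 1],
        [2, 2, 1, 1]]

patches = split_into_patches(mask, rows=2, cols=2)  # 2 x 2 grid for brevity
score = mean_iou(pred, mask, num_classes=3)
```

Because the patches do not overlap, each pixel belongs to exactly one patch, so per-patch predictions can be stitched back together and scored against the full mask without double counting.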

Paper Structure (32 sections, 3 equations, 15 figures, 10 tables)


Figures (15)

  • Figure 1: U-Net architecture for 32$\times$32 pixel input images and corresponding masks. Each blue block represents a multi-channel feature map, with resolutions indicated at the lower left edge of each box. White boxes show copied feature maps from the skip connections (gray arrows). The colored arrows denote different operations. This figure is based on the architecture described in Ronneberger et al. (2015). Image is best viewed online.
  • Figure 2: Schematic of the proposed network architecture. Input images are partitioned into subimages that are processed independently in the encoder paths. After encoding, a number of encoded feature maps are communicated to the device containing the communication network and then processed by it. The output of this network replaces the input feature maps. The decoding is also done in parallel without communication between the computational devices. Dashed arrows indicate skip connections. Detailed architectures of the encoder-decoder network and communication network are shown in Figures 3 and 4, respectively.
  • Figure 3: The proposed encoder-decoder architecture. The architecture of the encoder-decoder is nearly identical to the architecture of U-Net (Ronneberger et al., 2015). The only difference is located in the latent space vector, where a number of the feature maps are modified by the communication network. For optimal detail resolution, view this figure on a digital device.
  • Figure 4: The proposed communication network for four subimages.
  • Figure 5: Two example images (left) and masks (right) from the synthetic dataset. The subimage boundaries used for these images are shown by the red vertical lines.
  • ...and 10 more figures
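
The data flow described in the caption of Figure 2 (encode each subimage separately, send some latent feature maps to a communication step, write the result back, decode separately) can be sketched as follows. This is a pure-Python illustration under several assumptions, not the authors' implementation: the subimages are assumed to sit side by side (as with the vertical boundaries in Figure 5), the first `k` channels are assumed to be the ones that are sent, and `horizontal_blur` is only a placeholder for the learned communication network. The names `communicate` and `horizontal_blur` are hypothetical.

```python
def communicate(latents, k, comm_fn):
    """Exchange information between side-by-side subimages.

    latents: list (one entry per subimage, left to right) of feature maps,
             each a list of channels, each channel a 2D list (h x w).
    k:       number of leading channels sent to the communication step
             (an assumption made for this sketch).
    comm_fn: stand-in for the learned communication network; maps one
             2D list to a 2D list of the same shape.
    """
    width = len(latents[0][0][0])
    updated = [list(channels) for channels in latents]  # inputs stay untouched
    for ch in range(k):
        height = len(latents[0][ch])
        # Gather: stitch channel `ch` of all subimages into one wide map.
        stitched = [
            [value for sub in latents for value in sub[ch][row]]
            for row in range(height)
        ]
        # Process: the communication step sees all subimages at once.
        processed = comm_fn(stitched)
        # Scatter: cut the result apart again and replace the sent channels.
        for i in range(len(latents)):
            updated[i][ch] = [
                row[i * width:(i + 1) * width] for row in processed
            ]
    return updated


def horizontal_blur(feature_map):
    """Placeholder communication step: 3-tap mean along the width."""
    out = []
    for row in feature_map:
        n = len(row)
        out.append([
            (row[max(j - 1, 0)] + row[j] + row[min(j + 1, n - 1)]) / 3
            for j in range(n)
        ])
    return out


# Two subimages, each with two latent channels of size 1 x 2.
left = [[[0.0, 0.0]], [[5.0, 5.0]]]
right = [[[3.0, 3.0]], [[7.0, 7.0]]]
new_left, new_right = communicate([left, right], k=1, comm_fn=horizontal_blur)
```

In this toy run, channel 0 of each subimage changes near the shared boundary because the placeholder step mixes values across it, while channel 1 is returned as it was. That mirrors the caption's statement that only the communicated feature maps are replaced.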