Reliable Multi-modal Medical Image-to-image Translation Independent of Pixel-wise Aligned Data

Langrui Zhou, Guang Li

TL;DR

MITIA addresses the challenge of reliable multi-modal medical image-to-image translation without pixel-wise aligned data by introducing a prior extraction network that combines a coarse-to-fine multi-modal registration (MReg) with misalignment detection (MDet) to harvest pixel-level priors from misaligned data. These priors are integrated as a weighted regularization term (L_Prior) within a cycle-consistent GAN, constraining the solution space and mitigating the risk of erroneous mappings. The framework demonstrates strong performance on both misaligned and well-aligned datasets, outperforming supervised and unsupervised baselines and showing transferability of the prior loss across models. The work highlights the practical potential to improve reliability and fidelity in clinical image translation when perfect alignment is difficult to obtain, with implications for broader adoption of multi-modal imaging pipelines.

Abstract

Current mainstream multi-modal medical image-to-image translation methods face a dilemma. Supervised methods perform well but rely on pixel-wise aligned training data to constrain model optimization, and such datasets are difficult to obtain. Unsupervised methods can be trained without paired data, but their reliability cannot be guaranteed. At present, no multi-modal medical image-to-image translation method generates reliable results without pixel-wise aligned data. This work develops a medical image-to-image translation model that is independent of pixel-wise aligned data (MITIA), enabling reliable multi-modal translation when the training data are misaligned. MITIA uses a prior extraction network, composed of a multi-modal medical image registration module and a multi-modal misalignment error detection module, to extract as much pixel-level prior information as possible from training data that contain misalignment errors. The extracted prior information is then used to construct a regularization term that constrains the optimization of the unsupervised cycle-consistent GAN, restricting its solution space and thereby improving the performance and reliability of the generator. We trained MITIA on six datasets containing different misalignment errors and on two well-aligned datasets, and compared it with six other state-of-the-art image-to-image translation methods. Both quantitative analysis and qualitative visual inspection indicate that MITIA outperforms the competing methods on misaligned and aligned data alike.
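
As a rough sketch of how such a regularization term could enter the training objective, consider the form below. The notation is assumed for illustration and is not taken from the paper's own equations: $G$ is the generator, $\tilde{y}$ is the target image after registration, $W$ is a per-pixel confidence map over $N$ pixels, and $\lambda_{cyc}$ and $\lambda_{Prior}$ are hypothetical weighting coefficients.

```latex
\mathcal{L}_{total} = \mathcal{L}_{adv} + \lambda_{cyc}\,\mathcal{L}_{cyc} + \lambda_{Prior}\,\mathcal{L}_{Prior},
\qquad
\mathcal{L}_{Prior} = \frac{1}{N}\sum_{i=1}^{N} W_i \,\bigl| G(x)_i - \tilde{y}_i \bigr|
```

Under this assumed form, pixels with low confidence contribute little to $\mathcal{L}_{Prior}$, so the adversarial and cycle-consistency terms dominate there, while well-aligned pixels receive a direct pixel-level constraint.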

Paper Structure

This paper contains 20 sections, 14 equations, 13 figures, 6 tables.

Figures (13)

  • Figure 1: (a) Unsupervised cycle-consistent methods may produce multiple solutions. (b) We want to utilize the abundant pixel-level prior information in the training data to construct a regularization term to constrain the model optimization, aiming to exclude erroneous mappings as much as possible.
  • Figure 2: (a) T1 and T2 MR images of the same brain slice with affine deformation. (b) CT image and digital pathological image after H&E staining of the same human cheek tissue sample.
  • Figure 3: A general overview of MITIA. MITIA consists of three modules: MDet, MReg, and Cycle. MReg is a multi-modal registration module. MDet is a multi-modal misalignment error detection module. Cycle is a cycle-consistent GAN-based image-to-image translation module.
  • Figure 4: (a) Domain generalization method for simulating multiple different modalities. (b) Training process of the multi-modal misalignment error detector $D$.
  • Figure 5: Convert the error map output by $D$ into a confidence matrix $W$ using the activation function $Act$.
  • ...and 8 more figures
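
The confidence weighting described in Figure 5 can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration: the source does not specify the activation function $Act$, so an exponential decay stands in for it, and the function names `confidence_from_error` and `weighted_prior_loss`, the `sharpness` parameter, and the weighted mean-absolute-error form of the loss are hypothetical rather than the authors' implementation.

```python
import numpy as np


def confidence_from_error(error_map, sharpness=5.0):
    """Map a per-pixel misalignment error to a confidence in (0, 1].

    Assumption: exponential decay stands in for the paper's Act;
    zero error gives confidence 1, large error gives confidence near 0.
    """
    return np.exp(-sharpness * np.clip(error_map, 0.0, None))


def weighted_prior_loss(generated, registered_target, confidence):
    """Confidence-weighted mean absolute error (an assumed form of L_Prior)."""
    return float(np.mean(confidence * np.abs(generated - registered_target)))


# Toy 2x2 example: the bottom-right pixel is flagged as badly misaligned.
generated = np.array([[0.2, 0.8], [0.5, 0.1]])
registered_target = np.array([[0.2, 0.4], [0.5, 0.9]])
error_map = np.array([[0.0, 0.0], [0.0, 10.0]])

W = confidence_from_error(error_map)
loss = weighted_prior_loss(generated, registered_target, W)
unweighted = float(np.mean(np.abs(generated - registered_target)))

print(f"weighted prior loss: {loss:.4f}")
print(f"unweighted L1 loss:  {unweighted:.4f}")
```

In this toy case the flagged pixel is effectively excluded from the weighted loss, so a large intensity mismatch caused by misalignment does not penalize the generator, whereas the unweighted loss would.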