Blaze3DM: Marry Triplane Representation with Diffusion for 3D Medical Inverse Problem Solving

Jia He, Bonan Li, Ge Yang, Ziwen Liu

TL;DR

Blaze3DM tackles the challenge of high-dimensional 3D medical inverse problems by modeling volumes with a compact triplane neural field and learning a diffusion-based generator on latent triplane embeddings. A shared decoder with a 3D-aware module reconstructs volumes from triplane features, while a guided diffusion process produces high-fidelity triplane representations that can be decoded to volumes of arbitrary size. Key contributions include the first integration of triplane neural fields for 3D medical volumes, a 3D-aware fusion mechanism that enforces inter-plane consistency, and a guidance-based sampling strategy for zero-shot inverse problem solving. Across SV-CT, LA-CT, CS-MRI, and ZSR-MRI, Blaze3DM achieves state-of-the-art reconstruction quality while running 22× to 40× faster than prior methods, which makes 3D medical imaging more practical and scalable at a reduced computational cost.
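To make the idea of guidance-based sampling concrete, the following is a minimal sketch of a generic data-consistency guidance step for a linear inverse problem $y = Ax$. It is not the paper's sampler: the names `guidance_step` and `guided_sampling`, the identity denoiser, the step size, and the toy operator `A` are all assumptions for illustration, and the unknown `x` is updated directly rather than through a triplane decoder.

```python
def matvec(A, x):
    """Compute A x for a dense matrix stored as a list of rows."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def residual_norm(A, x, y):
    """Euclidean norm of the measurement residual A x - y."""
    return sum((p - q) ** 2 for p, q in zip(matvec(A, x), y)) ** 0.5


def guidance_step(x, A, y, step):
    """One gradient step on 0.5 * ||A x - y||^2, i.e. x - step * A^T (A x - y)."""
    r = [p - q for p, q in zip(matvec(A, x), y)]
    g = rmatvec(A, r)
    return [xi - step * gi for xi, gi in zip(x, g)]


def guided_sampling(x, A, y, denoise, num_steps, step):
    """Alternate a prior update (hypothetical `denoise`) with measurement guidance."""
    for t in reversed(range(num_steps)):
        x = denoise(x, t)                # placeholder for a learned reverse-diffusion step
        x = guidance_step(x, A, y, step)  # pull the sample toward the measurements
    return x


# Toy problem (assumption): only the first two of three unknowns are observed.
A = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
y = [1.0, 2.0]
x0 = [0.0, 0.0, 5.0]

identity_denoiser = lambda x, t: x  # stand-in; a real model would denoise here
x_hat = guided_sampling(x0, A, y, identity_denoiser, num_steps=20, step=0.5)
print(residual_norm(A, x0, y), residual_norm(A, x_hat, y))
```

In this toy setting the guidance term only changes the observed entries; the unobserved entry is left entirely to the prior, which is the role the diffusion model plays in a zero-shot solver.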

Abstract

Solving 3D medical inverse problems such as image restoration and reconstruction is crucial in modern medicine. However, the curse of dimensionality in 3D medical data makes mainstream volume-wise methods resource-intensive and makes it hard for models to capture the natural data distribution, resulting in volume inconsistency and artifacts. Some recent works attempt to simplify generation by working in a latent space, but they lack the capability to efficiently model intricate image details. To address these limitations, we present Blaze3DM, a novel approach that enables fast and high-fidelity generation by integrating a compact triplane neural field with a powerful diffusion model. Technically, Blaze3DM begins by jointly optimizing data-dependent triplane embeddings and a shared decoder, so that each triplane is reconstructed back to its corresponding 3D volume. To further enhance 3D consistency, we introduce a lightweight 3D-aware module that models the correlation among the three mutually perpendicular planes. A diffusion model is then trained on the latent triplane embeddings and achieves both unconditional and conditional triplane generation; the generated triplane is finally decoded to a volume of arbitrary size. Extensive experiments on zero-shot 3D medical inverse problem solving, including sparse-view CT, limited-angle CT, compressed-sensing MRI, and MRI isotropic super-resolution, demonstrate that Blaze3DM not only achieves state-of-the-art performance but also markedly improves computational efficiency over existing methods (22× to 40× faster than previous work).
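The triplane idea described in the abstract can be illustrated with a small sketch: a 3D point is projected onto three feature planes, the sampled features are combined, and a decoder maps them to an intensity. Everything below is an assumption for illustration rather than the authors' implementation. In particular, the features are combined by a plain sum instead of the paper's 3D-aware module, the decoder is a hypothetical single linear layer with a sigmoid, and the planes are random rather than fitted to data.

```python
import math
import random


def bilinear(plane, u, v):
    """Bilinearly sample plane[i][j] (a feature vector) at u, v in [0, 1]."""
    res = len(plane)
    x = min(max(u, 0.0), 1.0) * (res - 1)
    y = min(max(v, 0.0), 1.0) * (res - 1)
    i0, j0 = int(x), int(y)
    i1, j1 = min(i0 + 1, res - 1), min(j0 + 1, res - 1)
    tx, ty = x - i0, y - j0
    return [
        (1 - tx) * (1 - ty) * plane[i0][j0][c]
        + tx * (1 - ty) * plane[i1][j0][c]
        + (1 - tx) * ty * plane[i0][j1][c]
        + tx * ty * plane[i1][j1][c]
        for c in range(len(plane[0][0]))
    ]


def query_triplane(planes, point):
    """Sum the features sampled from the xy, xz and yz planes (assumed fusion)."""
    x, y, z = point
    f_xy = bilinear(planes["xy"], x, y)
    f_xz = bilinear(planes["xz"], x, z)
    f_yz = bilinear(planes["yz"], y, z)
    return [a + b + c for a, b, c in zip(f_xy, f_xz, f_yz)]


def decode(feature, weights, bias):
    """Hypothetical decoder: one linear layer followed by a sigmoid."""
    s = sum(w * f for w, f in zip(weights, feature)) + bias
    return 1.0 / (1.0 + math.exp(-s))


def axis(n):
    """n evenly spaced coordinates in [0, 1]."""
    return [k / (n - 1) for k in range(n)] if n > 1 else [0.5]


def decode_volume(planes, weights, bias, shape):
    """Decode the same triplane on a grid of any shape; volume[i][j][k] is at (x_i, y_j, z_k)."""
    xs, ys, zs = (axis(n) for n in shape)
    return [[[decode(query_triplane(planes, (x, y, z)), weights, bias)
              for z in zs] for y in ys] for x in xs]


def random_plane(res, channels, rng):
    """Random stand-in for a fitted feature plane."""
    return [[[rng.uniform(-1, 1) for _ in range(channels)]
             for _ in range(res)] for _ in range(res)]


rng = random.Random(0)
planes = {name: random_plane(8, 4, rng) for name in ("xy", "xz", "yz")}
weights, bias = [rng.uniform(-1, 1) for _ in range(4)], 0.0

coarse = decode_volume(planes, weights, bias, (4, 4, 4))
fine = decode_volume(planes, weights, bias, (6, 8, 10))
print(len(fine), len(fine[0]), len(fine[0][0]))
```

Because the representation is queried at continuous coordinates, the same three planes can be decoded at any output resolution, which is what decoding to a volume of arbitrary size refers to.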

Paper Structure

This paper contains 35 sections, 16 equations, 9 figures, 4 tables, 1 algorithm.

Figures (9)

  • Figure 1: (a) and (b): Volume generation on MRI and CT by the triplane-based diffusion model of Blaze3DM; (c) Blaze3DM speeds up inference by up to $22\times$ compared to DiffuseMBIR [chung2023solving] and TPDM [lee2023improving] while achieving state-of-the-art performance on four tasks.
  • Figure 2: The diagram of Blaze3DM. (a) The top shows the Triplane Decoder Network, which decodes the triplane representation $f$ to the volume intensity. The decoder includes a 3D-aware module and a lightweight MLP decoder. (b) The bottom shows the Triplane Diffusion Model, which uses a diffusion model to generate triplane embeddings under unconditional and conditional settings.
  • Figure 3: 36-view SV-CT reconstruction results on the test volume of the AAPM CT dataset. (First row: axial plane; second row: coronal plane; third row: sagittal plane)
  • Figure 4: $90^\circ$ LA-CT reconstruction results on the test volume of the AAPM CT dataset. (First row: axial plane; second row: coronal plane; third row: sagittal plane)
  • Figure 5: Ablation studies of triplane fitting on the AAPM-CT dataset, using the triplane of L333 as an example. (First row: axial plane; second row: coronal plane; third row: sagittal plane)
  • ...and 4 more figures