SSDD-GAN: Single-Step Denoising Diffusion GAN for Cochlear Implant Surgical Scene Completion

Yike Zhang, Eduardo Davalos, Jack Noble

TL;DR

This work tackles surgical scene completion for cochlear implant procedures by predicting complete microscopic views from partial data. The authors introduce SSDD-GAN, a self-supervised single-step denoising diffusion-GAN that combines diffusion-based generation with adversarial refinement to produce high-fidelity, semantically coherent reconstructions, trained on real surgical data and applied zero-shot to a synthetic postmastoidectomy dataset. Results show superior performance across multiple metrics compared with existing methods and demonstrate robustness across varying mask sizes, enabling realistic synthetic surgical scenes with accurate camera poses. By providing full surgical field visualization and navigation capability, the approach holds potential to improve preoperative planning, intraoperative guidance, and tool tracking in image-guided cochlear implant surgery.

Abstract

Recent deep learning-based image completion methods, covering both inpainting and outpainting, have shown promising results in restoring corrupted images by filling in missing regions of various shapes and sizes. Among these, Generative Adversarial Networks (GANs) and Denoising Diffusion Probabilistic Models (DDPMs) are the key generative approaches, excelling at high-quality restorations with reduced artifacts and improved fine details. In previous work, we developed a method for synthesizing views from novel microscope positions for mastoidectomy surgeries; however, that approach could not restore the surrounding surgical scene. In this paper, we propose an efficient method to complete the surgical scene of the synthetic postmastoidectomy dataset. Our approach uses self-supervised learning on real surgical datasets to train a Single-Step Denoising Diffusion-GAN (SSDD-GAN), which combines the advantages of diffusion models with the adversarial optimization of GANs and improves Structural Similarity by 6%. The trained model is then applied directly to the synthetic postmastoidectomy dataset in a zero-shot manner, generating realistic and complete surgical scenes without explicit ground-truth labels from that dataset. This method addresses key limitations of the previous work, offering a novel pathway for full surgical microscopy scene completion and making the synthetic postmastoidectomy dataset more useful for surgical preoperative planning and intraoperative navigation.

Paper Structure

This paper contains 6 sections, 4 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: Forward Diffusion Process. We preserve the masked region of the original sample data while applying Gaussian noise exclusively to the non-masked region.
  • Figure 2: Single-Step Denoising Diffusion Process. We incorporate a discriminator in this process to further improve the realism of synthetic samples.
  • Figure 3: Performance Comparisons. The experiments evaluate overall performance (top row) as well as performance across varying mask ratios (bottom row).
  • Figure 4: Qualitative Comparisons. Visualizations of completing missing regions using various methods. Certain details are highlighted in cyan bounding boxes.
  • Figure 5: Ablation Study. Analyzing the impact of varying the value of $T$.
  • ...and 3 more figures
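
The forward process described in the Figure 1 caption can be illustrated with a minimal sketch. It assumes the standard DDPM closed form $x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon$ with a linear noise schedule; the paper's actual schedule, image sizes, and implementation are not given here, so the function names (`linear_alpha_bar`, `masked_forward_diffusion`), the `keep_mask` convention, and all parameter values below are hypothetical. Following the caption's wording, the "masked" region is the one that is preserved, and noise is applied only outside it.

```python
import numpy as np


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def masked_forward_diffusion(x0, keep_mask, t, alpha_bar, rng):
    """Noise x0 at step t, leaving pixels where keep_mask == 1 untouched.

    keep_mask uses the caption's convention: 1 marks the preserved
    ("masked") region, 0 marks the region that receives Gaussian noise.
    """
    noise = rng.standard_normal(x0.shape)
    # Standard DDPM closed-form sample of q(x_t | x_0).
    noisy = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    # Blend: original pixels inside the preserved region, noisy pixels outside.
    return keep_mask * x0 + (1.0 - keep_mask) * noisy


# Toy usage on a random 8x8 "image" with a preserved 4x4 centre patch.
rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))
keep_mask = np.zeros((8, 8))
keep_mask[2:6, 2:6] = 1.0
alpha_bar = linear_alpha_bar(1000)
xt = masked_forward_diffusion(x0, keep_mask, 999, alpha_bar, rng)
```

In this toy example the centre patch of `xt` is identical to `x0`, while the border is close to pure Gaussian noise at the final step. A completion model would then be trained to recover the noised region conditioned on the preserved one.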