$TrIND$: Representing Anatomical Trees by Denoising Diffusion of Implicit Neural Fields

Ashish Sinha, Ghassan Hamarneh

TL;DR

This work tackles the challenge of representing complex anatomical trees with varying topology and geometry by introducing TrIND, a two-stage framework that first learns per-sample implicit neural representations (INRs) of trees and then learns a distribution over these INR weights via a transformer-based diffusion model. The approach enables high-fidelity, arbitrary-resolution reconstructions with compact storage and supports synthesis of novel, plausible tree structures across vascular and airway domains. Key contributions include accurate INR-based representations for segmentation, diffusion in INR space for tree generation, and demonstrated versatility across modalities and anatomical sites, along with quantitative validation of compression and reconstruction performance. The method holds potential for integration into clinical imaging pipelines and downstream tasks such as CFD and surgical planning, offering scalable, resolution-agnostic representations of complex tree topologies.

Abstract

Anatomical trees play a central role in clinical diagnosis and treatment planning. However, accurately representing anatomical trees is challenging due to their varying and complex topology and geometry. Traditional methods for representing tree structures captured using medical imaging, while invaluable for visualizing vascular and bronchial networks, exhibit drawbacks in terms of limited resolution, flexibility, and efficiency. Recently, implicit neural representations (INRs) have emerged as a powerful tool for representing shapes accurately and efficiently. We propose a novel approach, $TrIND$, for representing anatomical trees using INRs, while also capturing the distribution of a set of trees via denoising diffusion in the space of INRs. We accurately capture the intricate geometries and topologies of anatomical trees at any desired resolution. Through extensive qualitative and quantitative evaluation, we demonstrate high-fidelity tree reconstruction with arbitrary resolution yet compact storage, and versatility across anatomical sites and tree complexities. The code is available at: https://github.com/sinashish/TreeDiffusion.
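To make the idea of "diffusion in the space of INRs" more concrete, the following is a minimal sketch of how a small coordinate network could be flattened into a single weight vector and later restored for querying at any point. It is an illustration only, not the authors' implementation: the layer sizes in `SIZES`, the sine activation, the random initialization, and the helper names `init_mlp`, `flatten`, `unflatten`, and `occupancy` are all assumptions introduced here.

```python
import math
import random

SIZES = [3, 16, 16, 1]  # hypothetical layer widths: (x, y, z) in, one occupancy value out


def init_mlp(sizes, seed=0):
    """Build a tiny MLP as a list of (weights, biases) pairs, one per layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        w = [[rng.uniform(-1.0, 1.0) / math.sqrt(n_in) for _ in range(n_in)]
             for _ in range(n_out)]
        b = [0.0] * n_out
        layers.append((w, b))
    return layers


def flatten(layers):
    """Concatenate every weight row and bias into one 1D list."""
    flat = []
    for w, b in layers:
        for row in w:
            flat.extend(row)
        flat.extend(b)
    return flat


def unflatten(flat, sizes):
    """Inverse of flatten: rebuild the per-layer (weights, biases) pairs."""
    layers, i = [], 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        w = [flat[i + r * n_in: i + (r + 1) * n_in] for r in range(n_out)]
        i += n_in * n_out
        b = flat[i: i + n_out]
        i += n_out
        layers.append((w, b))
    return layers


def occupancy(layers, xyz):
    """Evaluate the field at one 3D point; returns a value in (0, 1)."""
    h = list(xyz)
    for k, (w, b) in enumerate(layers):
        h = [sum(wi * hi for wi, hi in zip(row, h)) + bi for row, bi in zip(w, b)]
        if k < len(layers) - 1:
            h = [math.sin(v) for v in h]  # sine activation is an assumption
    return 1.0 / (1.0 + math.exp(-h[0]))


inr = init_mlp(SIZES)
theta = flatten(inr)                 # the 1D vector a generative model would see
restored = unflatten(theta, SIZES)   # recover the network from the vector
point = (0.25, -0.5, 0.1)
print(len(theta), occupancy(inr, point) == occupancy(restored, point))
```

Because the shape is stored as a fixed-length vector of weights rather than a voxel grid, the storage cost in this sketch does not depend on the resolution at which `occupancy` is later sampled; a surface could be extracted by evaluating it on a grid of any density.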

Paper Structure (21 sections, 2 equations, 8 figures, 3 tables)

Figures (8)

  • Figure 1: TrIND overview: (a) An INR is optimized for each sample in the training dataset and then flattened to a 1D vector. (b) During inference, the INR is recovered from the flattened vector, and marching cubes (MC) is applied to extract the underlying signal. (c) The diffusion transformer takes the flattened vectors as input to model the diffusion process. After training, novel INRs can be sampled and used for downstream tasks via (b). (d) Novel tree structures visualized as meshes.
  • Figure 2: Pipeline: Top: Given a 3D mesh, sampled points, and GT occupancies (inside/outside), an INR $(\theta)$ is optimized to fit the shape. Bottom: Optimized INRs are flattened to 1D vectors, fed to the transformer-based diffusion model $\mathcal{D}(\phi)$, which is optimized to predict noise. Novel INRs can then be sampled from the trained $\mathcal{D}(\phi)$ in the reverse process.
  • Figure 3: Compression vs Reconstruction Accuracy: (a) We decimate/downscale the mesh/volume to occupy approximately the same memory as the INR, and report the reconstruction error using Chamfer distance (CD) and edge length loss (Edge) for meshes and INRs, and $L_1$ and $L_2$ for volumes. Note the higher error for meshes and volumes relative to INRs at the same storage. (b) Illustration of GT and downsampled volumes. Notice the disconnected components.
  • Figure 4: Versatility: Visualization of different synthetic and real anatomical trees represented as an INR from various medical imaging modalities and organs. We normalize all shapes to $[-1,1]$ and images to $[0,1]$ to report the reconstruction error using MSE and CD ($\times 10^{-3}$) between ground truth and the underlying signal extracted from INR.
  • Figure 5: Arbitrary Resolution: (a) Comparison of 2x, 4x, and 8x zoom on an IntRA sample represented as a volumetric grid (top) and INR (bottom). The resolution for each sub-figure is shown on top as volume$^3$/INR$^3$. Notice the smoothness of the surface even at 8x zoom. (b) Zoomed-in regions of a VascuSynth mesh reconstructed from INRs and ground truth at different mesh resolutions displayed using faces and edges.
  • ...and 3 more figures
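
The captions for Figures 3 and 4 report reconstruction error as Chamfer distance (CD) between a ground-truth shape and the shape extracted from an INR. As a rough illustration of what that metric measures, here is a minimal sketch of one common variant: the symmetric mean of squared nearest-neighbor distances between two point sets. The source does not state which CD variant (squared or unsquared, summed or averaged) is used, so this choice and the function name `chamfer_distance` are assumptions.

```python
def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two non-empty point sets.

    Uses the mean of squared nearest-neighbor distances in each direction;
    this is one common convention, assumed here for illustration.
    """
    def sq_dist(p, q):
        return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

    def one_way(src, dst):
        # For every source point, find its closest destination point.
        return sum(min(sq_dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)


# Two points on a line, and the same points shifted by 0.1 along x.
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
recon = [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)]
print(chamfer_distance(gt, gt), chamfer_distance(gt, recon))
```

This brute-force version compares every pair of points, so it is only practical for small point sets; larger meshes would normally be evaluated with a spatial index for the nearest-neighbor search.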