DeNVeR: Deformable Neural Vessel Representations for Unsupervised Video Vessel Segmentation

Chun-Hung Wu, Shih-Hong Chen, Chih-Yao Hu, Hsin-Yu Wu, Kai-Hsin Chen, Yu-You Chen, Chih-Hai Su, Chih-Kuo Lee, Yu-Lun Liu

TL;DR

DeNVeR addresses unsupervised vessel segmentation in X-ray videos by combining Hessian-based preprocessing, layer separation bootstrapping, and test-time optimization that leverages optical flow and Eulerian motion fields to model vessel dynamics. The two-stage approach first learns a background canonical image and then refines a foreground vessel representation with a fixed latent code, guided by a set of losses that enforce temporal coherence and faithful reconstruction. A new XACV dataset with high-quality ground truth enables rigorous evaluation, where DeNVeR outperforms state-of-the-art self-supervised methods and generalizes well to the CADICA dataset, which provides no segmentation ground truth. The work highlights the practicality of unsupervised, test-time vessel segmentation in clinical workflows, offering accurate, temporally coherent delineations without manual labels, at the cost of added computation time and sensitivity to preprocessing. Overall, DeNVeR advances unsupervised video vessel segmentation by integrating implicit representations, motion modeling, and robust losses tailored to X-ray angiography dynamics.

Abstract

This paper presents Deformable Neural Vessel Representations (DeNVeR), an unsupervised approach for vessel segmentation in X-ray angiography videos without annotated ground truth. DeNVeR utilizes optical flow and layer separation techniques, enhancing segmentation accuracy and adaptability through test-time training. Key contributions include a novel layer separation bootstrapping technique, a parallel vessel motion loss, and the integration of Eulerian motion fields for modeling complex vessel dynamics. A significant component of this research is the introduction of the XACV dataset, the first X-ray angiography coronary video dataset with high-quality, manually labeled segmentation ground truth. Extensive evaluations on both XACV and CADICA datasets demonstrate that DeNVeR outperforms current state-of-the-art methods in vessel segmentation accuracy and generalization capability while maintaining temporal coherency.

Paper Structure (22 sections, 7 equations, 15 figures, 2 tables)

Figures (15)

  • Figure 1: Vessel segmentation method comparison. Unlike SSVS ma2021self, DARL kim2022diffusion, and FreeCOS shi2023freecos, whose reliance on extensive training data limits their ability to generalize to new data, our method uses unsupervised test-time training on testing videos. This approach achieves superior accuracy with finer, more consistent vessel contours, demonstrating robust generalization with minimal data.
  • Figure 2: Pipeline for unsupervised vessel segmentation from X-ray videos. (a) Preprocessing: Hessian-based technique with region growing for initial segmentation. (b) Stage 1: MLPs model background deformation and canonical image using bootstrapping loss. (c) Stage 2: Refine foreground vessel image, masks, and motions using B-spline parameters and warping. Reconstruction loss ensures fidelity to input frames. The pipeline trains directly on test videos without ground truth masks.
  • Figure 3: Eulerian motion field modeling. Background heartbeat uses a low-degree B-spline; foreground vessel flow uses a stationary Eulerian field. Final vessel flow combines warped Eulerian motion with background flow, capturing both factors observed in X-ray videos.
  • Figure 4: Parallel vessel motion loss. Aligns flow direction with vessel mask direction. Uses skeletonization and distance transform to determine gradient directions. The predicted vessel motion should be perpendicular to these gradients (blue arrows).
  • Figure 5: Comparisons between XCAD ma2021self, CADICA jimenez2024cadica, and our XACV dataset. (a) Images from XCAD with their corresponding GTs. (b) The CADICA dataset provides video frames without corresponding ground truth. (c) Our XACV dataset, with GTs meticulously labeled by experienced radiologists. Our dataset not only provides GTs with greater accuracy and detail, evident in the more nuanced vessel delineations, but also features frames of superior quality, facilitating finer and more precise segmentation results.
  • ...and 10 more figures
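
The idea behind the parallel vessel motion loss in the Figure 4 caption (predicted vessel motion should be perpendicular to the gradients of a distance map derived from the vessel skeleton) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `parallel_vessel_motion_loss`, the array layouts, the absolute-cosine penalty, and the mask-weighted averaging are all hypothetical choices. The distance map is assumed to be precomputed, for example by skeletonizing the vessel mask and applying a distance transform.

```python
import numpy as np


def parallel_vessel_motion_loss(flow, dist_to_skeleton, mask, eps=1e-8):
    """Penalize flow components that point across the vessel.

    Assumed (hypothetical) layout:
      flow:             (H, W, 2) array of (dx, dy) motion vectors
      dist_to_skeleton: (H, W) distance of each pixel to the vessel centerline
      mask:             (H, W) vessel mask with values in [0, 1]
    """
    # Gradient of the distance map; np.gradient returns d/drow, then d/dcol.
    gy, gx = np.gradient(dist_to_skeleton.astype(np.float64))
    grad = np.stack([gx, gy], axis=-1)

    # Normalize both fields so that only direction matters.
    grad_unit = grad / (np.linalg.norm(grad, axis=-1, keepdims=True) + eps)
    flow_unit = flow / (np.linalg.norm(flow, axis=-1, keepdims=True) + eps)

    # |cos(angle)| is 0 when flow is perpendicular to the gradient,
    # i.e. when the flow runs parallel to the vessel.
    abs_cos = np.abs(np.sum(flow_unit * grad_unit, axis=-1))

    # Average the penalty over vessel pixels only.
    weight = mask.astype(np.float64)
    return float(np.sum(weight * abs_cos) / (np.sum(weight) + eps))


if __name__ == "__main__":
    # Synthetic horizontal vessel centered on row 4 of a 9x9 image.
    rows = np.arange(9, dtype=np.float64)[:, None]
    dist = np.abs(rows - 4.0) * np.ones((1, 9))
    vessel_mask = (dist <= 2.0).astype(np.float64)

    flow_along = np.zeros((9, 9, 2))
    flow_along[..., 0] = 1.0   # motion along the vessel (x direction)
    flow_across = np.zeros((9, 9, 2))
    flow_across[..., 1] = 1.0  # motion across the vessel (y direction)

    print(parallel_vessel_motion_loss(flow_along, dist, vessel_mask))
    print(parallel_vessel_motion_loss(flow_across, dist, vessel_mask))
```

In this toy setup the penalty vanishes for flow running along the vessel and is clearly positive for flow crossing it, which is the behavior the caption describes. A differentiable version for test-time optimization would need the same operations in an autodiff framework.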