Diffusion Adversarial Representation Learning for Self-supervised Vessel Segmentation
Boah Kim, Yujin Oh, Jong Chul Ye
TL;DR
This work tackles vessel segmentation without labeled data by introducing DARL, a non-iterative diffusion-adversarial framework that jointly learns background signals via a diffusion module and vessel representations via a generation module equipped with switchable SPADE layers. The model uses a diffusion loss to model backgrounds, adversarial losses to produce realistic vessel masks and angiograms, and a cycle-consistency loss to enforce semantic alignment with fractal vessel masks, enabling robust, one-step segmentation. DARL achieves state-of-the-art performance among unsupervised and self-supervised methods on coronary angiography and cross-domain retinal datasets, with strong noise robustness and generalization to unseen imaging modalities. The approach offers fast inference, improved vessel localization, and a reusable framework for general vascular segmentation without heavy labeling requirements.
Abstract
Vessel segmentation in medical images is an important task in the diagnosis of vascular diseases and in therapy planning. Although learning-based segmentation approaches have been extensively studied, supervised methods require large amounts of ground-truth labels, and confusing background structures make it hard for neural networks to segment vessels in an unsupervised manner. To address this, we introduce a novel diffusion adversarial representation learning (DARL) model that combines a denoising diffusion probabilistic model with adversarial learning, and apply it to vessel segmentation. In particular, for self-supervised vessel segmentation, DARL learns the background signal using a diffusion module, which lets a generation module effectively provide vessel representations. In addition, through adversarial learning based on the proposed switchable spatially-adaptive denormalization, our model estimates synthetic vessel images as well as vessel segmentation masks, which further helps the model capture vessel-relevant semantic information. Once trained, the model generates segmentation masks in a single step and can be applied to general vascular structure segmentation in coronary angiography and retinal images. Experimental results on various datasets show that our method significantly outperforms existing unsupervised and self-supervised vessel segmentation methods.
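To make the diffusion component more concrete, the following is a minimal sketch of the standard forward noising step of a denoising diffusion probabilistic model, together with one simple way that diffusion, adversarial, and cycle-consistency terms could be combined into a single objective. It is not the authors' implementation. The linear noise schedule, its endpoint values, the loss weights, and all function names (linear_beta_schedule, cumulative_alpha, forward_diffuse, mse, total_loss) are assumptions introduced here for illustration; the abstract does not specify them. Images are represented as flat lists of pixel values to keep the example dependency-free.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Hypothetical linear noise schedule; the endpoint values are assumptions."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alpha(betas):
    """Running product of (1 - beta_t), usually written alpha_bar_t."""
    alpha_bar, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        alpha_bar.append(prod)
    return alpha_bar


def forward_diffuse(x0, t, alpha_bar, rng):
    """Standard DDPM forward step:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, 1).
    """
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    signal = math.sqrt(alpha_bar[t])
    noise = math.sqrt(1.0 - alpha_bar[t])
    x_t = [signal * x + noise * e for x, e in zip(x0, eps)]
    return x_t, eps


def mse(pred, target):
    """Mean squared error, the usual noise-prediction (diffusion) loss."""
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)


def total_loss(l_diff, l_adv, l_cyc, w_diff=1.0, w_adv=1.0, w_cyc=1.0):
    """Weighted sum of the three loss terms; the weights are placeholders."""
    return w_diff * l_diff + w_adv * l_adv + w_cyc * l_cyc


# Toy usage: noise a random 8x8 "image" at the middle timestep.
rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bar = cumulative_alpha(betas)
x0 = [rng.random() for _ in range(64)]
x_t, eps = forward_diffuse(x0, 500, alpha_bar, rng)
```

In a real training loop, a network would predict eps from x_t and the timestep, and mse(eps_pred, eps) would supply the diffusion term, while the adversarial and cycle-consistency terms would come from discriminators and a reconstruction of the input vessel mask, respectively.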
