Fully Differentiable Bidirectional Dual-Task Synergistic Learning for Semi-Supervised 3D Medical Image Segmentation

Jun Li

TL;DR

DBiSL introduces a fully differentiable bidirectional synergistic learning framework that jointly optimizes segmentation and distance regression for semi-supervised 3D medical image segmentation. A differentiable bidirectional task transformer enables online gradient flow between the two tasks, so that cross-task supervision, bidirectional consistency, pseudo-labeling, and uncertainty estimation can be integrated within a single framework. The approach achieves state-of-the-art results on the LA, Pancreas-CT, and BraTS2019 benchmarks and demonstrates robustness across label ratios, efficient GPU-based computation, and backbone flexibility. This work provides a unified, generalizable blueprint for dual-task SSL and for broader multi-task vision applications in medical imaging.

Abstract

Semi-supervised learning reduces the need for large pixel-wise labeled datasets in image segmentation by leveraging unlabeled data. The scarcity of high-quality labeled data remains a major challenge in medical image analysis because of high annotation costs and the need for specialized clinical expertise. Semi-supervised learning has demonstrated significant potential in addressing this bottleneck, with pseudo-labeling and consistency regularization emerging as the two predominant paradigms. Dual-task collaborative learning, an emerging consistency-aware paradigm, seeks to derive supplementary supervision by enforcing prediction consistency between related tasks. However, current methods are limited to unidirectional interaction (typically regression-to-segmentation), because segmentation results can only be transformed into regression outputs offline; they therefore fail to fully exploit the potential benefits of online bidirectional cross-task collaboration. We therefore propose a fully Differentiable Bidirectional Synergistic Learning (DBiSL) framework, which seamlessly integrates and enhances four critical SSL components: supervised learning, consistency regularization, pseudo-supervised learning, and uncertainty estimation. Experiments on two benchmark datasets demonstrate our method's state-of-the-art performance. Beyond its technical contributions, this work provides new insights into unified SSL framework design and establishes a new architectural foundation for dual-task-driven SSL, while offering a generic multitask learning framework applicable to broader computer vision applications. The code will be released on GitHub upon acceptance.
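
To make the two task directions concrete, the following is a minimal 1D sketch and not the DBiSL task transformer, whose construction is not described in this summary. Everything here is an illustrative assumption: the function names, the sign convention (negative inside the object), the steepness `k`, and the mean-squared-error consistency term. `signed_distance_1d` is an exact, brute-force transform of the kind that is usually computed offline and carries no gradient, while `distance_to_soft_mask` uses a steep sigmoid, one common smooth surrogate for the regression-to-segmentation direction that would be differentiable under an autodiff framework.

```python
import math


def signed_distance_1d(mask):
    """Exact signed distance for a 1D binary mask (hypothetical helper).

    Assumed convention: negative inside the object, positive outside.
    Each voxel's distance is measured to the nearest voxel of the
    opposite class. The min/compare steps are not differentiable.
    """
    n = len(mask)
    out = []
    for i, m in enumerate(mask):
        opposite = [abs(i - j) for j in range(n) if mask[j] != m]
        d = min(opposite) if opposite else n  # no boundary: saturate at n
        out.append(-d if m == 1 else d)
    return out


def distance_to_soft_mask(dist, k=10.0):
    """Map signed distances to soft foreground probabilities, sigmoid(-k * d).

    k is an assumed steepness; larger values give a sharper boundary.
    """
    probs = []
    for d in dist:
        x = -k * d
        if x >= 0:  # numerically stable sigmoid
            probs.append(1.0 / (1.0 + math.exp(-x)))
        else:
            e = math.exp(x)
            probs.append(e / (1.0 + e))
    return probs


def consistency_loss(seg_prob, dist_pred, k=10.0):
    """Mean squared error between segmentation probabilities and the
    soft mask derived from the distance prediction (assumed loss form)."""
    soft = distance_to_soft_mask(dist_pred, k)
    return sum((p - s) ** 2 for p, s in zip(seg_prob, soft)) / len(seg_prob)


mask = [0, 0, 1, 1, 1, 0]
sdf = signed_distance_1d(mask)            # segmentation -> distance (exact, offline)
recovered = distance_to_soft_mask(sdf)    # distance -> segmentation (smooth)
loss = consistency_loss([float(m) for m in mask], sdf)
```

On this toy mask the two representations agree, so the consistency term is close to zero. The sketch also shows the asymmetry the abstract points to: the sigmoid direction is smooth, whereas the exact distance transform is a discrete search, which is why a bidirectional scheme needs a differentiable replacement for it.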

Paper Structure

This paper contains 38 sections, 15 equations, 7 figures, 13 tables, 1 algorithm.

Figures (7)

  • Figure 1: Different dual-task structures. (a) Unidirectional: information flows one way only, from one task to the other. (b) Bidirectional (proposed): information flows seamlessly between the two tasks in both directions while preserving gradient continuity.
  • Figure 2: Overview of the DBiSL framework. A shared encoder with segmentation and distance-regression heads is coupled by our fully differentiable bidirectional task transformer, enabling online dual-task interaction and supporting cross-task supervision and cross-task consistency within a unified SSL pipeline.
  • Figure 3: 3D visual comparison of segmentation results with varying label proportions.
  • Figure 4: Visual comparison of results from the ablation study. Green and red regions delineate the outputs and the ground truth, respectively.
  • Figure 5: Visualization of different distance transform methods. Green and red contours denote transformed and label contours, respectively.
  • ...and 2 more figures