PULSE: A Unified Multi-Task Architecture for Cardiac Segmentation, Diagnosis, and Few-Shot Cross-Modality Clinical Adaptation

Hania Ghouse, Maryam Alsharqi, Farhad R. Nezami, Muzammil Behzad

TL;DR

PULSE introduces a unified transformer-based framework that simultaneously performs ventricular segmentation, cardiomyopathy classification, and clinically grounded output generation across MRI and ultrasound modalities. It leverages a self-supervised DINOv2 backbone and a 4-scale pyramid decoder to learn robust, cross-domain cardiac priors, trained with a composite loss that couples segmentation and diagnosis. The approach achieves strong segmentation and classification on ACDC, generalizes to unseen MRI cohorts, and adapts to ultrasound with few-shot fine-tuning, demonstrating a foundation-style, cross-modality cardiac analysis pipeline. This work advances practical deployment by reducing annotation needs and enabling end-to-end clinical reasoning from pixels to narratives and indices.

Abstract

Cardiac image analysis remains fragmented across tasks: anatomical segmentation, disease classification, and grounded clinical report generation are typically handled by separate networks trained under different data regimes. No existing framework unifies these objectives within a single architecture while retaining generalization across imaging modalities and datasets. We introduce PULSE, a multi-task vision-language framework built on self-supervised representations and optimized through a composite supervision strategy that balances region-overlap learning, pixel-wise classification fidelity, and boundary-aware IoU refinement. A multi-scale token reconstruction decoder enables anatomical segmentation, while shared global representations support disease classification and clinically grounded text output, allowing the model to move from pixels to structures and finally to clinical reasoning within one architecture. Unlike prior task-specific pipelines, PULSE learns task-invariant cardiac priors, generalizes robustly across datasets, and can be adapted to new imaging modalities with minimal supervision. This moves the field closer to a scalable, foundation-style cardiac analysis framework.
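
To make the composite supervision strategy concrete, the sketch below combines a soft Dice term (region overlap), a pixel-wise cross-entropy term, and a Lovász term (a surrogate for IoU), matching the "Dice+CE+Lovász" naming used in the figure captions. It is an illustrative simplification, not the authors' implementation: it handles a single foreground class over a flat list of pixels, and the function names and the weights `w_dice`, `w_ce`, and `w_lovasz` are assumptions, since the abstract does not state how the terms are balanced.

```python
import math


def dice_loss(probs, target, eps=1e-6):
    """Soft Dice loss (region overlap) for one foreground class."""
    inter = sum(p * t for p, t in zip(probs, target))
    denom = sum(probs) + sum(target)
    return 1.0 - (2.0 * inter + eps) / (denom + eps)


def ce_loss(probs, target, eps=1e-7):
    """Mean pixel-wise binary cross-entropy."""
    total = 0.0
    for p, t in zip(probs, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(probs)


def lovasz_loss(probs, target):
    """Lovász extension of the Jaccard (IoU) loss, binary case."""
    errors = [abs(t - p) for p, t in zip(probs, target)]
    # Visit pixels from largest to smallest prediction error.
    order = sorted(range(len(errors)), key=lambda i: errors[i], reverse=True)
    n_fg = sum(target)
    cum_fg, cum_bg = 0, 0
    prev_jaccard, loss = 0.0, 0.0
    for i in order:
        cum_fg += target[i]
        cum_bg += 1 - target[i]
        intersection = n_fg - cum_fg
        union = n_fg + cum_bg
        jaccard = 1.0 - intersection / union
        # Weight each error by the increase in Jaccard loss it causes.
        loss += errors[i] * (jaccard - prev_jaccard)
        prev_jaccard = jaccard
    return loss


def composite_loss(probs, target, w_dice=1.0, w_ce=1.0, w_lovasz=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    return (w_dice * dice_loss(probs, target)
            + w_ce * ce_loss(probs, target)
            + w_lovasz * lovasz_loss(probs, target))


if __name__ == "__main__":
    mask = [1, 1, 0, 0]            # toy ground-truth mask
    good = [0.9, 0.8, 0.1, 0.2]    # confident, mostly correct prediction
    bad = [0.2, 0.1, 0.9, 0.8]     # confident, mostly wrong prediction
    print(composite_loss(good, mask) < composite_loss(bad, mask))
```

In a multi-class setting such as ventricular segmentation, each term would typically be computed per class and averaged, with softmax probabilities in place of the single foreground probability used here.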

Paper Structure

This paper contains 55 sections, 29 equations, 10 figures, 12 tables.

Figures (10)

  • Figure 1: Overview of the proposed PULSE framework for cardiac MRI segmentation, classification, and clinical report generation.
  • Figure 2: Dice across ACDC disease groups. Ventricular dilation (DCM) amplifies cavity clarity; scar-thinning (MINF) increases boundary ambiguity.
  • Figure 3: Four-way radar comparison of loss supervision strategies. Hybrid Dice+CE+Lovász delivers consistently superior segmentation and classification fidelity.
  • Figure 4: Progressive improvement in Mean Dice across cumulative ablation steps. Each enhancement contributes to boundary stability and overall segmentation quality, with Lovász producing the final performance peak.
  • Figure 5: CAMUS few-shot transfer.
  • ...and 5 more figures