UI-Styler: Ultrasound Image Style Transfer with Class-Aware Prompts for Cross-Device Diagnosis Using a Frozen Black-Box Inference Network
Nhat-Tuong Do-Tran, Ngoc-Hoang-Lam Le, Ching-Chun Huang
TL;DR
The paper tackles cross-device domain shift in ultrasound imaging by reusing a frozen black-box downstream model and unlabeled data. It introduces UI-Styler, a dual-level style transfer framework that couples a pattern-matching domain-level stylization with class-aware prompting guided by pseudo labels to preserve diagnostic semantics. The architecture employs two ViT encoders, a cross-attention-based pattern-matching module, learnable class prompts, and a lightweight decoder, trained with content/style losses and prompt-guided direction and supervision losses. Across 12 cross-device tasks on four ultrasound datasets, UI-Styler achieves state-of-the-art distribution alignment and improves downstream classification and segmentation metrics, while ablations confirm the value of both pattern-matching and class-aware prompting. This approach enables reliable cross-device ultrasound diagnosis in privacy-sensitive, label-scarce settings.
Abstract
The appearance of ultrasound images varies across acquisition devices, causing domain shifts that degrade the performance of fixed black-box downstream inference models when they are reused. A practical way to mitigate this issue is to develop unpaired image translation (UIT) methods that align the statistical distributions of the source and target domains, particularly when the inference model must be reused as a black box. However, existing UIT approaches often overlook class-specific semantic alignment during domain adaptation, resulting in misaligned content-class mappings that can impair diagnostic accuracy. To address this limitation, we propose UI-Styler, a novel ultrasound-specific, class-aware image style transfer framework. UI-Styler leverages a pattern-matching mechanism to transfer texture patterns embedded in the target images onto source images while preserving the structural content of the source. In addition, we introduce a class-aware prompting strategy guided by pseudo labels of the target domain, which enforces accurate semantic alignment with diagnostic categories. Extensive experiments on ultrasound cross-device tasks demonstrate that UI-Styler consistently outperforms existing UIT methods, achieving state-of-the-art results both in distribution distance and on downstream tasks such as classification and segmentation.
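To make the pattern-matching idea more concrete, the following is a minimal sketch of cross-attention in which source tokens act as queries and target tokens act as keys and values, so that each source position gathers a weighted mix of target texture features. It is an illustration under stated assumptions, not the authors' implementation: the function name `cross_attention`, the toy two-dimensional tokens, and the omission of learned query/key/value projections are all assumptions made here for brevity.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(source_tokens, target_tokens):
    """Hypothetical pattern-matching step (no learned projections).

    Each source token is a query; target tokens serve as keys and values.
    Returns the matched features and the attention weights.
    """
    dim = len(source_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    matched, all_weights = [], []
    for q in source_tokens:
        # Scaled dot-product similarity between this query and every target token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in target_tokens]
        weights = softmax(scores)
        # Convex combination of target tokens, one value per feature dimension.
        mixed = [sum(w * v[d] for w, v in zip(weights, target_tokens)) for d in range(dim)]
        matched.append(mixed)
        all_weights.append(weights)
    return matched, all_weights


# Toy example: three source tokens, two target tokens (values are made up).
source = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [[2.0, 0.0], [0.0, 2.0]]
matched, weights = cross_attention(source, target)
```

In this sketch the output keeps one token per source position, which reflects the stated goal of preserving source structure, while the values are drawn entirely from target features. A real model would apply learned projections and operate on ViT encoder outputs rather than hand-written vectors.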
