Ultrasound Image Synthesis Using Generative AI for Lung Ultrasound Detection
Yu-Cheng Chou, Gary Y. Li, Li Chen, Mohsen Zahiri, Naveen Balaraju, Shubham Patil, Bryson Hicks, Nikolai Schnittke, David O. Kessler, Jeffrey Shupp, Maria Parker, Cristiana Baloescu, Christopher Moore, Cynthia Gregory, Kenton Gregory, Balasundar Raju, Jochen Kruecker, Alvin Chen
TL;DR
This work tackles data scarcity and class imbalance in lung ultrasound (LUS) AI by introducing DiffUltra, a diffusion-based framework that synthesizes whole LUS images with realistic lesion structure and anatomy-aware placement. A Lesion-anatomy Bank supplies positional guidance through a conditional PMF $P(\Delta X, \Delta Y \mid X, Y)$ and texture guidance through a bank of lesion foregrounds. A stable diffusion model is conditioned on a lesion skeleton $S$ and latent features $f$, and generates images in a downsampled latent space via $\hat{I} = \mathrm{Dec}(D(f, S))$. Augmenting real data with DiffUltra synthetic images improves lesion-level AP by $+5.6$ percentage points, and AP on rare large consolidations by up to $+25\%$, outperforming mask-based diffusion baselines. Ablation studies show that both the structural representation and the positional guidance are needed for realism and for the downstream gains, which highlights the value of geometry-aware data augmentation for ultrasound diagnostics.
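To make the positional guidance concrete, the following is a minimal sketch of how a conditional PMF of the form $P(\Delta X, \Delta Y \mid X, Y)$ could be estimated and sampled. It is not the authors' implementation. It assumes that $(X, Y)$ is an anatomical reference point in pixel coordinates and that $(\Delta X, \Delta Y)$ is the lesion offset from that point; the summary above does not define these symbols. The function names, bin counts, image size, and toy data are all hypothetical.

```python
import numpy as np

def build_offset_pmf(anchors, offsets, n_anchor_bins=4, n_offset_bins=8, size=256):
    """Estimate P(dX, dY | X, Y) as a normalized 4-D histogram.

    anchors: (N, 2) reference points (X, Y) in [0, size)   -- assumed meaning
    offsets: (N, 2) lesion offsets (dX, dY) in [-size, size) -- assumed meaning
    """
    anchors = np.asarray(anchors, dtype=float)
    offsets = np.asarray(offsets, dtype=float)
    # Map continuous coordinates to integer bin indices.
    a_idx = np.clip((anchors / size * n_anchor_bins).astype(int), 0, n_anchor_bins - 1)
    o_idx = np.clip(((offsets + size) / (2 * size) * n_offset_bins).astype(int),
                    0, n_offset_bins - 1)
    counts = np.zeros((n_anchor_bins, n_anchor_bins, n_offset_bins, n_offset_bins))
    np.add.at(counts, (a_idx[:, 0], a_idx[:, 1], o_idx[:, 0], o_idx[:, 1]), 1.0)
    # Normalize over the offset axes so each anchor bin holds its own PMF.
    totals = counts.sum(axis=(2, 3), keepdims=True)
    return np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

def sample_offset_bin(pmf, anchor, rng, size=256):
    """Draw one (dX, dY) bin index pair conditioned on the anchor's bin."""
    n_anchor_bins = pmf.shape[0]
    ax, ay = np.clip((np.asarray(anchor, dtype=float) / size * n_anchor_bins).astype(int),
                     0, n_anchor_bins - 1)
    cond = pmf[ax, ay]
    if cond.sum() == 0:
        raise ValueError("no lesions were observed for this anchor bin")
    flat = rng.choice(cond.size, p=cond.ravel())
    return np.unravel_index(flat, cond.shape)

# Toy data standing in for annotations from real scans (purely synthetic).
rng = np.random.default_rng(0)
anchors = rng.uniform(0, 256, size=(500, 2))
offsets = rng.normal(loc=(0, 40), scale=10, size=(500, 2))

pmf = build_offset_pmf(anchors, offsets)
dx_bin, dy_bin = sample_offset_bin(pmf, anchor=(128, 60), rng=rng)
```

A sampled bin pair would then be mapped back to a pixel offset to place a lesion skeleton before image generation. A histogram is used here only for simplicity; a smoothed or parametric estimate may be preferable when annotations are sparse.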
Abstract
Developing reliable healthcare AI models requires training on representative and diverse data. In imbalanced datasets, model performance tends to plateau on the more prevalent classes while remaining low on less common cases. To overcome this limitation, we propose DiffUltra, the first generative AI technique capable of synthesizing realistic lung ultrasound (LUS) images with extensive lesion variability. Specifically, we condition the generative model on a newly introduced Lesion-anatomy Bank, which captures the structural and positional properties of lesions from real patient data to guide image synthesis. We demonstrate that DiffUltra improves consolidation detection by 5.6% in AP compared with models trained solely on real patient data. More importantly, DiffUltra increases data diversity and the prevalence of rare cases, leading to a 25% AP improvement in detecting rare instances such as large lung consolidations, which make up only 10% of the dataset.
