Graph Conditioned Diffusion for Controllable Histopathology Image Generation
Sarah Cechnicka, Matthew Baugh, Weitong Zhang, Mischa Dombrowski, Zhe Li, Johannes C. Paetzold, Candice Roufosse, Bernhard Kainz
TL;DR
The paper tackles controllable diffusion-based generation in histopathology by introducing Graph Conditioned Diffusion (GCD), which uses graph proxy representations of tissue structures to steer diffusion via a graph-tokenized transformer and adjacency-aware attention. Graphs are constructed from ground-truth masks and embedded into a cascaded diffusion framework (64×64, upscaled to 256×256 and then 1024×1024); combined with targeted graph interventions, this yields diverse yet clinically faithful synthetic histopathology data. Key findings show improved diversity (IP/IR) and competitive downstream segmentation performance compared to conventional diffusion methods, with simple graph edits outperforming more complex transformations in many cases. By providing controllable, distribution-aligned synthetic samples, the method has potential for privacy-preserving data sharing and for improving training data quality in diagnostic tasks.
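To make "adjacency-aware attention" concrete, the following is a minimal NumPy sketch of one common way to restrict self-attention to graph neighbours. It is an illustration under assumptions, not the paper's implementation: the function name `adjacency_masked_attention`, the single-head layout, and the choice to always permit self-attention are all hypothetical.

```python
import numpy as np


def adjacency_masked_attention(tokens, adj, w_q, w_k, w_v):
    """Single-head self-attention over node tokens (illustrative sketch).

    tokens: (n, d) array, one token per graph node.
    adj:    (n, n) boolean adjacency matrix.
    Node i may attend only to itself and to nodes j with adj[i, j] True.
    """
    adj = np.asarray(adj, dtype=bool)
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])

    # Assumption: self-attention is always allowed, so every row keeps
    # at least one finite score and the softmax stays well defined.
    allowed = adj | np.eye(len(tokens), dtype=bool)
    scores = np.where(allowed, scores, -np.inf)

    # Row-wise softmax with max-subtraction for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights
```

In this sketch, masked pairs receive exactly zero attention weight, so editing the adjacency matrix directly changes which structures can influence each other's embeddings.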
Abstract
Recent advances in Diffusion Probabilistic Models (DPMs) have set new standards in high-quality image synthesis. Yet, controlled generation remains challenging, particularly in sensitive areas such as medical imaging. Medical images feature inherent structure such as consistent spatial arrangement, shape or texture, all of which are critical for diagnosis. However, existing DPMs operate in noisy latent spaces that lack semantic structure and strong priors, making it difficult to ensure meaningful control over generated content. To address this, we propose graph-based object-level representations for Graph-Conditioned-Diffusion. Our approach generates graph nodes corresponding to each major structure in the image, encapsulating their individual features and relationships. These graph representations are processed by a transformer module and integrated into a diffusion model via the text-conditioning mechanism, enabling fine-grained control over generation. We evaluate this approach using a real-world histopathology use case, demonstrating that our generated data can reliably substitute for annotated patient data in downstream segmentation tasks. The code is publicly available.
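As a rough illustration of the object-level graph representation described above, the sketch below builds one node per labelled structure in an instance mask and connects nodes by nearest centroids. The node features (normalised centroid and area fraction), the k-nearest-neighbour edge rule, and the name `mask_to_graph` are assumptions chosen for this example; the features and edge definition actually used may differ.

```python
import numpy as np


def mask_to_graph(instance_mask, k=2):
    """Build node features and adjacency from an integer instance mask.

    instance_mask: (h, w) integer array, 0 = background, each positive
                   label marks one structure.
    Returns (feats, adj): feats is (n, 3) with normalised centroid row,
    centroid column, and area fraction; adj is an (n, n) symmetric
    boolean matrix linking each node to its k nearest centroids.
    """
    h, w = instance_mask.shape
    labels = [lab for lab in np.unique(instance_mask) if lab != 0]
    n = len(labels)

    feats = np.zeros((n, 3), dtype=np.float64)
    for i, lab in enumerate(labels):
        rows, cols = np.nonzero(instance_mask == lab)
        feats[i] = (rows.mean() / h, cols.mean() / w, rows.size / (h * w))

    adj = np.zeros((n, n), dtype=bool)
    if n > 1:
        # Pairwise Euclidean distances between normalised centroids.
        diff = feats[:, None, :2] - feats[None, :, :2]
        dist = np.sqrt((diff ** 2).sum(axis=-1))
        np.fill_diagonal(dist, np.inf)  # exclude self from neighbours
        nearest = np.argsort(dist, axis=1)[:, : min(k, n - 1)]
        for i in range(n):
            adj[i, nearest[i]] = True
        adj |= adj.T  # make the graph undirected
    return feats, adj
```

Each row of `feats` could then be projected to a token and passed to a transformer whose output replaces the text embedding in the diffusion model's conditioning pathway; that final step is not shown here.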
