Style-Extracting Diffusion Models for Semi-Supervised Histopathology Segmentation

Mathias Öttl, Frauke Wilm, Jana Steenpass, Jingna Qiu, Matthias Rübner, Arndt Hartmann, Matthias Beckmann, Peter Fasching, Andreas Maier, Ramona Erber, Bernhard Kainz, Katharina Breininger

TL;DR

This work proposes Style-Extracting Diffusion Models with two conditioning mechanisms: a style conditioning that injects style information from previously unseen images during image generation, and a content conditioning that can be targeted to a downstream task, e.g., layout for segmentation.

Abstract

Deep learning-based image generation has seen significant advancements with diffusion models, notably improving the quality of generated images. Despite these developments, generating images with unseen characteristics beneficial for downstream tasks has received limited attention. To bridge this gap, we propose Style-Extracting Diffusion Models, featuring two conditioning mechanisms. Specifically, we utilize 1) a style conditioning mechanism that injects style information from previously unseen images during image generation and 2) a content conditioning mechanism that can be targeted to a downstream task, e.g., layout for segmentation. We introduce a trainable style encoder to extract style information from images, and an aggregation block that merges style information from multiple style inputs. Because the styles are taken from unseen images, this architecture can generate images with unseen styles in a zero-shot manner, resulting in more diverse generations. In this work, we use the image layout as the target condition and first show the capability of our method on a natural image dataset as a proof of concept. We further demonstrate its versatility in histopathology, where we combine prior knowledge about tissue composition and unannotated data to create diverse synthetic images with known layouts. This allows us to generate additional synthetic data to train a segmentation network in a semi-supervised fashion. We verify the added value of the generated images by showing improved segmentation results and lower performance variability between patients when synthetic images are included during segmentation training. Our code will be made publicly available at [LINK].

Paper Structure (18 sections, 6 equations, 8 figures, 3 tables)

This paper contains 18 sections, 6 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: Synthetic images with defined layouts and styles generated by our proposed Style-Extracting Diffusion Model (STEDM).
  • Figure 2: Overview of our proposed STEDM with the semantic layout as content conditioning for the example of histopathological images during training (left) and inference (right). An image $x$ and corresponding layout query $l$ are sampled, as well as $1$ to $n$ style queries $s_{q,n}$. A style encoder $\mathcal{E}_{s}$ extracts a style feature vector $v_{s,n}$ for each style image, which an aggregation block $agg$ combines into a final style feature vector $v_{s}$. An LDM is conditioned with the layout query $l$ and the extracted style feature vector $v_{s}$. Synthetic images with unseen styles are generated by taking known layout queries $l$ and combining them with unseen style queries $s_{q,n}$.
  • Figure 3: Image generation results with the flower dataset, for the style transfer baseline (left), a semantically conditioned diffusion model (center) and our proposed method trained with augmented images as style source (right). Our method is able to generate flowers with colors that were absent or underrepresented in the training data.
  • Figure 4: Image generation results with the HER2 dataset, for the style transfer baseline (left), our proposed method trained with nearby patches as style source (center) and our proposed method trained with multi-patches as style source (right). Note that white represents tumor tissue in the layout images, while black includes all background structures.
  • Figure 5: Image generation results with the CATCH dataset, for the style transfer baseline (left), our proposed method trained with nearby patches as style source (center) and our proposed method trained with multi-patches as style source (right). Note that white represents tumor tissue in the layout images, while black includes all background structures.
  • ...and 3 more figures
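
The conditioning flow described in the Figure 2 caption can be illustrated with a small sketch. This is not the authors' implementation: the source only states that the style encoder $\mathcal{E}_{s}$ is trainable and that an aggregation block merges several style vectors, so the sketch swaps in two simple stand-ins that are purely assumptions. The hypothetical `encode_style` uses fixed per-channel mean and standard deviation instead of a learned network, the hypothetical `aggregate` uses an element-wise mean, and `build_condition` is an illustrative helper name. The diffusion model itself is omitted.

```python
from statistics import fmean, pstdev


def encode_style(image):
    """Stand-in for the style encoder E_s (assumption: not a trained network).

    `image` is a list of channels, each a flat list of pixel intensities.
    Returns a style vector v_{s,n} holding (mean, std) for every channel.
    """
    vector = []
    for channel in image:
        vector.append(fmean(channel))
        vector.append(pstdev(channel))
    return vector


def aggregate(style_vectors):
    """Stand-in for the aggregation block (assumption: element-wise mean).

    Merges the style vectors v_{s,1..n} into one final style vector v_s.
    """
    if not style_vectors:
        raise ValueError("at least one style query is required")
    return [fmean(column) for column in zip(*style_vectors)]


def build_condition(layout, style_images):
    """Pair a layout query l with the aggregated style vector v_s.

    The returned dict is what a conditioned generator would consume.
    """
    v_s = aggregate([encode_style(s) for s in style_images])
    return {"layout": layout, "style": v_s}


# Known layout query l: 1 marks tumor tissue, 0 marks background.
layout = [[0, 1], [1, 1]]

# Two single-channel style queries s_{q,1} and s_{q,2} from unseen images.
style_a = [[0.0, 1.0]]
style_b = [[1.0, 1.0]]

cond = build_condition(layout, [style_a, style_b])
print(cond)
```

Because the layout and the style queries enter the condition separately, a known layout can be reused with any number of style images that were never part of the annotated training set, which is the property the caption relies on for generating unseen styles.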