Could We Generate Cytology Images from Histopathology Images? An Empirical Study

Soumyajyoti Dey, Sukanta Chakraborty, Utso Guha Roy, Nibaran Das

TL;DR

The study addresses data scarcity in breast cancer cytology by generating synthetic cytology images from histopathology images with unpaired image-to-image translation. It empirically compares CycleGAN, with mappings $G: A \rightarrow B$ and $F: B \rightarrow A$ and losses $Loss_G$ and $Loss_{cyc}$, against Neural Style Transfer, using the BreakHis histopathology and JUCYT cytology datasets. The results indicate that CycleGAN-generated cytology matches the real cytology distribution more closely (lower $FID$ and $KID$) than histopathology does, whereas Neural Style Transfer mainly captures staining style rather than nuclear morphology; some generated samples fail to preserve benign/malignant semantics. The work offers practical insights into data augmentation for medical imaging, highlights limitations such as the finite number of synthetic samples and the risk of mislabeling, and suggests transfer-learning-based generative approaches for improved cross-domain synthesis.
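To make the role of $Loss_{cyc}$ concrete, here is a minimal sketch of a cycle-consistency term for the two mappings $G: A \rightarrow B$ and $F: B \rightarrow A$. It assumes the usual L1 formulation from the original CycleGAN work; this summary does not state the exact form or weighting the authors used. The generators `toy_G` and `toy_F` and the random arrays are hypothetical stand-ins, not trained networks or real BreakHis/JUCYT images.

```python
import numpy as np

def cycle_consistency_loss(G, F, batch_a, batch_b):
    """L1 cycle loss: mean|F(G(a)) - a| + mean|G(F(b)) - b| (assumed form)."""
    forward = np.mean(np.abs(F(G(batch_a)) - batch_a))   # A -> B -> A
    backward = np.mean(np.abs(G(F(batch_b)) - batch_b))  # B -> A -> B
    return float(forward + backward)

# Hypothetical toy generators: toy_F is the exact inverse of toy_G.
def toy_G(x):
    return 0.5 * x + 0.1

def toy_F(y):
    return 2.0 * (y - 0.1)

rng = np.random.default_rng(0)
hist = rng.random((4, 8, 8, 3))  # stand-in for domain A (histopathology)
cyto = rng.random((4, 8, 8, 3))  # stand-in for domain B (cytology)

# When F inverts G, both round trips reproduce the input and the loss vanishes.
loss_inverse = cycle_consistency_loss(toy_G, toy_F, hist, cyto)
# Replacing F with the identity breaks the round trip, so the loss grows.
loss_identity = cycle_consistency_loss(toy_G, lambda y: y, hist, cyto)
```

The term only rewards round-trip reconstruction; it does not by itself guarantee that a benign source image stays benign after translation, which is consistent with the semantic failures noted above.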

Abstract

Automation in medical imaging is challenging because annotated datasets are scarce and domain experts are few. In recent years, deep learning techniques have solved complex medical imaging tasks such as disease classification, object localization, and segmentation. However, most of these tasks require a large amount of annotated data to succeed. To mitigate this shortage, generative models have been proposed for data augmentation: they produce synthetic medical images that enlarge the dataset and can boost classification performance. Unpaired image-to-image translation models map images from a source domain to a target domain. In breast malignancy identification, FNAC (fine-needle aspiration cytology) is a low-cost, minimally invasive modality commonly used by medical practitioners, but very few public datasets are available in this domain, while automated analysis of cytology images needs a large amount of annotated data. Therefore, synthetic cytology images are generated by translating publicly available breast histopathology samples. In this study, we explore traditional image-to-image transfer models, namely CycleGAN and Neural Style Transfer. FID and KID scores indicate that the generated cytology images are quite similar to real breast cytology samples.
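As a rough illustration of how FID and KID compare two image sets, the sketch below computes both from feature matrices. It assumes the standard definitions (a Fréchet distance between Gaussian fits for FID, and an unbiased MMD estimate with a cubic polynomial kernel for KID). In practice the features would come from a pretrained Inception network; here they are random stand-ins, and the function names are hypothetical rather than taken from the paper.

```python
import numpy as np

def fid_from_features(feat_real, feat_fake):
    """Frechet distance between Gaussian fits of two feature sets."""
    mu_r, mu_f = feat_real.mean(axis=0), feat_fake.mean(axis=0)
    cov_r = np.cov(feat_real, rowvar=False)
    cov_f = np.cov(feat_fake, rowvar=False)
    # Tr((cov_r cov_f)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_f; clip tiny negatives from round-off.
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_r - mu_f) ** 2)
    return float(mean_term + np.trace(cov_r) + np.trace(cov_f) - 2.0 * tr_sqrt)

def kid_from_features(feat_real, feat_fake):
    """Unbiased MMD^2 estimate with kernel k(x, y) = (x.y / d + 1)^3."""
    d = feat_real.shape[1]
    m, n = len(feat_real), len(feat_fake)

    def kernel(x, y):
        return (x @ y.T / d + 1.0) ** 3

    k_rr = kernel(feat_real, feat_real)
    k_ff = kernel(feat_fake, feat_fake)
    k_rf = kernel(feat_real, feat_fake)
    # Exclude the diagonal (self-similarity) for the unbiased estimate.
    term_rr = (k_rr.sum() - np.trace(k_rr)) / (m * (m - 1))
    term_ff = (k_ff.sum() - np.trace(k_ff)) / (n * (n - 1))
    return float(term_rr + term_ff - 2.0 * k_rf.mean())

rng = np.random.default_rng(0)
real = rng.normal(size=(200, 8))            # stand-in "real cytology" features
close = rng.normal(size=(200, 8))           # same distribution as `real`
far = rng.normal(loc=2.0, size=(200, 8))    # shifted distribution

fid_close, fid_far = fid_from_features(real, close), fid_from_features(real, far)
kid_close, kid_far = kid_from_features(real, close), kid_from_features(real, far)
```

For both metrics, lower values mean the two feature distributions are closer, so `fid_close` and `kid_close` come out smaller than `fid_far` and `kid_far` in this toy setup.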

Paper Structure

This paper contains 9 sections, 6 figures, and 4 tables.

Figures (6)

  • Figure 1: Examples of real breast histopathology samples (from the BreakHis dataset). First row: benign samples. Second row: malignant samples.
  • Figure 2: Examples of real breast cytology samples (from the JUCYT dataset). First row: benign samples. Second row: malignant samples.
  • Figure 3: Synthetic benign cytology images generated by the CycleGAN model. The second row shows the histopathology images (source domain) and the first row shows the corresponding synthetic cytology images (target domain).
  • Figure 4: Synthetic malignant cytology images generated by the CycleGAN model. The second row shows the histopathology images (source domain) and the first row shows the corresponding synthetic cytology images (target domain).
  • Figure 5: Synthetic benign cytology images generated by the Neural Style Transfer model. The second row shows the histopathology images (content images) and the first row shows the corresponding synthetic cytology images.
  • ...and 1 more figure