Modified CycleGAN for the synthesization of samples for wheat head segmentation
Jaden Myers, Keyhan Najafian, Farhad Maleki, Katie Ovens
TL;DR
The paper tackles the scarcity of annotated data for crop segmentation by generating a large synthetic dataset and bridging the domain gap to real images with a segmentation-aware CycleGAN. It introduces a modified CycleGAN that takes segmentation masks as input to preserve semantic information during translation, producing from the synthetic dataset $S$ a translated dataset $\hat{R}$ that closely resembles real imagery. A U-Net-based segmentation model trained on $\hat{R}$ achieves substantial performance gains on an internal dataset and on external Global Wheat Head Detection (GWHD) datasets, with further improvements from a pseudo-labeling fine-tuning step. The approach shows strong potential for scalable, domain-adaptive semantic segmentation in agriculture and could generalize to other crops and densely patterned imagery.
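One simple way to give a generator access to a segmentation mask is to append the mask to the image as an extra input channel. The sketch below illustrates only that idea; the summary above does not state how the modified CycleGAN consumes the mask, so the channel-stacking scheme and the helper name `stack_image_and_mask` are assumptions made for illustration, not the authors' architecture.

```python
import numpy as np


def stack_image_and_mask(image, mask):
    """Append a binary mask to an H x W x C image as one extra channel.

    Hypothetical helper: channel stacking is an assumed conditioning
    scheme, not a detail confirmed by the paper.
    """
    image = np.asarray(image, dtype=np.float32)
    mask = np.asarray(mask, dtype=np.float32)
    if image.ndim != 3 or image.shape[:2] != mask.shape:
        raise ValueError("mask must match the image height and width")
    # mask[..., None] has shape H x W x 1, so the result is H x W x (C + 1).
    return np.concatenate([image, mask[..., None]], axis=-1)


# Toy example: a 4 x 4 RGB image and a diagonal mask.
image = np.zeros((4, 4, 3))
mask = np.eye(4)
generator_input = stack_image_and_mask(image, mask)
print(generator_input.shape)
```

With this layout the mask travels alongside the pixels through every layer that sees the input, which is one plausible way to discourage the translation from moving or erasing labeled objects.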
Abstract
Deep learning models have been used for a variety of image processing tasks. However, most of these models are developed through supervised learning approaches, which rely heavily on the availability of large-scale annotated datasets. Developing such datasets is tedious and expensive. In the absence of an annotated dataset, synthetic data can be used for model development; however, because of the substantial differences between simulated and real data, a phenomenon referred to as the domain gap, the resulting models often underperform when applied to real data. In this research, we address this challenge by first computationally simulating a large-scale annotated dataset and then using a generative adversarial network (GAN) to bridge the gap between simulated and real images. This approach results in a synthetic dataset that can be used effectively to train a deep learning model. Using this approach, we developed a realistic annotated synthetic dataset for wheat head segmentation. This dataset was then used to develop a deep learning model for semantic segmentation. The resulting model achieved a Dice score of 83.4% on an internal dataset and Dice scores of 79.6% and 83.6% on two external Global Wheat Head Detection datasets. Although we propose this approach in the context of wheat head segmentation, it can be generalized to other crop types or, more broadly, to images with dense, repeated patterns such as those found in cellular imagery.
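The Dice scores reported above measure the overlap between a predicted mask and a reference mask, $\mathrm{Dice}(P, T) = 2|P \cap T| / (|P| + |T|)$. A minimal sketch of this standard definition for binary masks follows. The function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made here, and the paper's exact evaluation procedure (for example, per-image versus dataset-level averaging) may differ.

```python
import numpy as np


def dice_score(pred, target):
    """Dice coefficient between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")
    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


# Toy example: 3 predicted pixels, 3 reference pixels, 2 in common.
pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])
print(dice_score(pred, target))  # 2 * 2 / (3 + 3), about 0.667
```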
