Enhancing crop classification accuracy by synthetic SAR-Optical data generation using deep learning
Ali Mirzaei, Hossein Bagheri, Iman Khosravi
TL;DR
CTGAN-based synthetic data generation is applied to SAR-optical fusion features to address minority-class data scarcity in crop classification. The study demonstrates that CTGAN-produced samples yield higher minority-class sensitivity and balanced overall performance across RF, XGBoost, and KNN, outperforming SMOTE, ROS, and RUS. Using a Winnipeg study area with RapidEye and UAVSAR-derived features, the approach preserves data distributions while expanding minority-class coverage. This distribution-aware augmentation has practical potential to make agricultural mapping more reliable in regions with imbalanced crop distributions.
Abstract
Crop classification using remote sensing data has emerged as a prominent research area in recent decades. Studies have demonstrated that fusing SAR and optical images can significantly enhance classification accuracy. However, a major challenge in this field is the limited availability of training data, which adversely affects classifier performance. In agricultural regions, one or two crop types typically dominate, while other crops are scarce. Consequently, training samples collected for crop mapping are abundant for the dominant crops, which form the majority classes, and scarce for the remaining crops, which form the minority classes. Traditional data generation methods have been employed to address this imbalance in the training data, but they have several weaknesses and still struggle to handle the minority classes effectively. In this research, we explore the effectiveness of the conditional tabular generative adversarial network (CTGAN), a deep-learning-based synthetic data generation method, in addressing the challenge of limited training data for minority classes in crop classification using fused SAR-optical data. Our findings demonstrate that the proposed method generates higher-quality synthetic data that can significantly increase the number of samples for minority classes, leading to better performance of crop classifiers.
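To make the imbalance problem concrete, the following minimal sketch illustrates two of the traditional resampling baselines named in the TL;DR, random oversampling (ROS) and random undersampling (RUS), together with per-class sensitivity (recall). It is not the CTGAN method studied here and not the authors' code. The function names, the toy feature tuples, and the "majority"/"minority" labels are hypothetical, chosen only for illustration.

```python
import random
from collections import Counter, defaultdict


def group_by_class(samples):
    """Group (features, label) pairs by their label."""
    groups = defaultdict(list)
    for features, label in samples:
        groups[label].append((features, label))
    return groups


def random_oversample(samples, seed=0):
    """ROS: duplicate randomly chosen samples until every class
    matches the size of the largest class."""
    rng = random.Random(seed)
    groups = group_by_class(samples)
    target = max(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(group)
        # Draw the missing samples with replacement (exact duplicates).
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


def random_undersample(samples, seed=0):
    """RUS: randomly discard samples until every class
    matches the size of the smallest class."""
    rng = random.Random(seed)
    groups = group_by_class(samples)
    target = min(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(rng.sample(group, target))
    return balanced


def per_class_recall(y_true, y_pred):
    """Sensitivity per class: correctly predicted / total true samples."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical toy set: 8 majority-class and 2 minority-class samples,
# each with a made-up two-value feature vector.
toy = [((float(i), float(i) + 0.5), "majority") for i in range(8)]
toy += [((10.0, 11.0), "minority"), ((12.0, 13.0), "minority")]

ros_counts = Counter(label for _, label in random_oversample(toy))
rus_counts = Counter(label for _, label in random_undersample(toy))
```

ROS balances the classes only by repeating existing minority samples, and RUS only by discarding majority samples, so neither adds new minority-class information. That limitation is the gap a generative approach such as CTGAN is meant to address.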
