Hyperspectral data augmentation with transformer-based diffusion models
Mattia Ferrari, Lorenzo Bruzzone
TL;DR
This work tackles overfitting in hyperspectral land-cover classification when labeled data are scarce by introducing a guided diffusion-based data augmentation method built on a lightweight transformer. It advances diffusion modeling for hyperspectral data augmentation through a class-conditioned reverse process, an AdaLN-Zero transformer, and a cosine variance scheduler with a modified delta, paired with a loss that blends noise-prediction MSE with a variational lower bound. Evaluated on PRISMA forest data with 10 classes, the approach achieves the best average and weighted F1-scores compared with traditional augmentation methods and GANs, while training stably and avoiding mode collapse. The results suggest diffusion-based augmentation as a robust, data-efficient strategy for improving hyperspectral classification in practical remote sensing applications.
Abstract
The introduction of new-generation hyperspectral satellite sensors, combined with advances in deep learning, has significantly improved the ability to discriminate detailed land-cover classes at medium to large scales. However, deep learning methods risk overfitting when networks are trained on small labeled datasets. In this work, we propose a data augmentation technique that leverages a guided diffusion model. To train the model effectively with a limited number of labeled samples while still capturing complex patterns in the data, we implement a lightweight transformer network. Additionally, we introduce a modified weighted loss function and an optimized cosine variance scheduler, which facilitate fast and effective training on small datasets. We evaluate the proposed method on a forest classification task with 10 forest types using hyperspectral images acquired by the PRISMA satellite. The results demonstrate that the proposed method outperforms other data augmentation techniques in both average and weighted average accuracy. The effectiveness of the method is further highlighted by the stable training behavior of the model, which addresses a common limitation in the practical application of deep generative models for data augmentation.
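To make the role of the variance scheduler concrete, the following is a minimal sketch of the standard cosine schedule commonly used in diffusion models, together with the forward noising step it drives. It is not the optimized scheduler proposed in this work, whose modification is not specified here. The function names, the offset `s=0.008`, the clipping value `max_beta=0.999`, and the 200-band example spectrum are all assumptions chosen for illustration.

```python
import numpy as np

def cosine_betas(num_steps, s=0.008, max_beta=0.999):
    """Per-step noise variances from the standard cosine schedule (assumed defaults)."""
    t = np.arange(num_steps + 1) / num_steps
    # Cumulative signal level: close to 1 at t=0, decaying smoothly to about 0 at t=1.
    f = np.cos((t + s) / (1 + s) * np.pi / 2) ** 2
    alpha_bar = f / f[0]
    betas = 1.0 - alpha_bar[1:] / alpha_bar[:-1]
    # Clip to avoid a variance of exactly 1 at the final step.
    return np.clip(betas, 0.0, max_beta)

def q_sample(x0, t, alpha_bar, rng):
    """Forward diffusion: noise a clean spectrum x0 to step index t (0-based)."""
    noise = rng.standard_normal(x0.shape)
    a = alpha_bar[t]
    x_t = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise
    return x_t, noise

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    betas = cosine_betas(1000)
    alpha_bar = np.cumprod(1.0 - betas)
    spectrum = rng.random(200)  # hypothetical 200-band pixel spectrum
    x_t, noise = q_sample(spectrum, 500, alpha_bar, rng)
    print(x_t.shape, float(alpha_bar[500]))
```

In a noise-prediction setup, the network would be trained to recover `noise` from `x_t`, the step index, and the class label. The cosine shape keeps early steps lightly noised, which is one plausible reason such schedules suit small datasets.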
