Waves of Imagination: Unconditional Spectrogram Generation using Diffusion Architectures
Rahul Vanukuri, Shafi Ullah Khan, Talip Tolga Sarı, Gokhan Secinti, Diego Patiño, Debashri Roy
TL;DR
This work tackles the data scarcity barrier in RF radar detection within CBRS by introducing a diffusion-based approach to synthesize realistic spectrograms containing LTE, 5G, and radar signals. Using a U-Net–based diffusion model with an LSUN-pretrained backbone, fine-tuned on the Waldo dataset, the method outputs 256x256x3 spectrograms that are evaluated with PSNR and SSIM. Pre-training on the synthetic data accelerates downstream radar-detection training, yielding 51.5% faster convergence when the detector is adapted to real data. The approach shows practical potential for scalable, diverse RF data generation and improved detector training in shared-spectrum environments. Overall, it provides a principled path from diffusion-generated spectrograms to improved real-world performance, with future work extending to additional signal types and controllable generation.
Abstract
The growing demand for effective spectrum management and interference mitigation in shared bands, such as the Citizens Broadband Radio Service (CBRS), requires robust radar detection algorithms that protect military transmissions from interference caused by commercial wireless transmissions. These algorithms, in turn, depend on large, diverse, and carefully labeled spectrogram datasets. However, collecting and annotating real-world radio frequency (RF) spectrogram data remains a significant challenge because radar signals occur only rarely. This rarity makes balanced datasets difficult to create, which limits the performance and generalizability of AI models in this domain. To address this critical issue, we propose a diffusion-based generative model for synthesizing realistic and diverse spectrograms of five distinct categories that integrate LTE, 5G, and radar signals within the CBRS band. We conduct a structural and statistical fidelity analysis of the generated spectrograms using two widely accepted evaluation metrics, the Structural Similarity Index Measure (SSIM) and the Peak Signal-to-Noise Ratio (PSNR), to quantify their divergence from the training data. Furthermore, we demonstrate that pre-training on the generated spectrograms significantly improves training efficiency on a real-world radar detection task by enabling $51.5\%$ faster convergence.
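To make the two fidelity metrics concrete, the following is a minimal sketch of how PSNR and SSIM might be computed between a real and a generated 256x256x3 spectrogram image. It rests on assumptions not stated in this section: 8-bit pixel values (dynamic range 255), a simplified single-window (global) SSIM instead of the usual sliding-window average, and random arrays standing in for actual spectrograms. The function names `psnr` and `global_ssim` are illustrative and are not taken from the paper.

```python
import numpy as np


def psnr(reference, generated, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB; higher means closer to the reference."""
    ref = np.asarray(reference, dtype=np.float64)
    gen = np.asarray(generated, dtype=np.float64)
    mse = np.mean((ref - gen) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


def global_ssim(reference, generated, max_value=255.0):
    """Simplified SSIM computed over the whole image as a single window.

    Assumption: standard SSIM averages this statistic over local windows;
    the global form here is only meant to show the structure of the formula.
    """
    ref = np.asarray(reference, dtype=np.float64)
    gen = np.asarray(generated, dtype=np.float64)
    c1 = (0.01 * max_value) ** 2  # stabilizer for the luminance term
    c2 = (0.03 * max_value) ** 2  # stabilizer for the contrast/structure term
    mu_x, mu_y = ref.mean(), gen.mean()
    var_x, var_y = ref.var(), gen.var()
    cov_xy = np.mean((ref - mu_x) * (gen - mu_y))
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Placeholder data: random images stand in for real and generated spectrograms.
rng = np.random.default_rng(0)
real = rng.integers(0, 256, size=(256, 256, 3)).astype(np.float64)
noise = rng.normal(0.0, 10.0, size=real.shape)
fake = np.clip(real + noise, 0, 255)

print(f"PSNR: {psnr(real, fake):.2f} dB")
print(f"Global SSIM: {global_ssim(real, fake):.4f}")
```

In practice a windowed SSIM from an established image-processing library would likely be preferable, since the global form ignores local structure such as the narrow time-frequency footprint of a radar pulse.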
