Waves of Imagination: Unconditional Spectrogram Generation using Diffusion Architectures

Rahul Vanukuri, Shafi Ullah Khan, Talip Tolga Sarı, Gokhan Secinti, Diego Patiño, Debashri Roy

TL;DR

This work tackles the data scarcity barrier in RF radar detection within CBRS by introducing a diffusion-based approach to synthesize realistic spectrograms containing LTE, 5G, and radar signals. Using a U-Net-based diffusion model built on an LSUN-pretrained backbone and fine-tuned on the Waldo dataset, the method outputs 256×256×3 spectrograms that are evaluated with PSNR and SSIM. Pre-training on the synthetic data accelerates downstream radar-detection training, achieving 51.5% faster convergence when the detector is adapted to real data. The approach shows practical potential for scalable, diverse RF data generation and improved detector training in shared-spectrum environments. Overall, it provides a path from diffusion-generated synthetic spectrograms to improved real-world performance, with future work extending to additional signal types and controllable generation.

Abstract

The growing demand for effective spectrum management and interference mitigation in shared bands, such as the Citizens Broadband Radio Service (CBRS), requires robust radar detection algorithms that protect military transmissions from interference caused by commercial wireless transmissions. These algorithms, in turn, depend on large, diverse, and carefully labeled spectrogram datasets. However, collecting and annotating real-world radio frequency (RF) spectrogram data remains a significant challenge because radar signals occur infrequently. This scarcity makes balanced datasets difficult to create, limiting the performance and generalizability of AI models in this domain. To address this issue, we propose a diffusion-based generative model for synthesizing realistic and diverse spectrograms of five distinct categories that integrate LTE, 5G, and radar signals within the CBRS band. We conduct a structural and statistical fidelity analysis of the generated spectrograms using two widely accepted evaluation metrics, Structural Similarity Index Measure (SSIM) and Peak Signal-to-Noise Ratio (PSNR), to quantify their divergence from the training data. Furthermore, we demonstrate that pre-training on the generated spectrograms significantly improves training efficiency on a real-world radar detection task by enabling $51.5\%$ faster convergence.
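To make the two fidelity metrics concrete, the following is a minimal sketch of how PSNR and a simplified SSIM could be computed between a generated spectrogram and a reference one. It is an illustration under stated assumptions, not the authors' evaluation code: the function names `psnr` and `global_ssim` are hypothetical, images are assumed to be flattened 8-bit pixel sequences of equal length, and the SSIM here uses a single global window, whereas common implementations average SSIM over local sliding windows.

```python
import math


def psnr(ref, test, max_val=255.0):
    """Peak Signal-to-Noise Ratio in dB between two equally sized pixel sequences.

    Assumption: pixels are flattened into 1-D sequences with peak value max_val.
    """
    if len(ref) != len(test):
        raise ValueError("inputs must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)


def global_ssim(ref, test, max_val=255.0):
    """Simplified SSIM computed over the whole image as one window.

    Assumption: this omits the local sliding-window averaging used by
    standard SSIM implementations, so values will differ from theirs.
    """
    if len(ref) != len(test):
        raise ValueError("inputs must have the same number of pixels")
    n = len(ref)
    mu_x = sum(ref) / n
    mu_y = sum(test) / n
    var_x = sum((a - mu_x) ** 2 for a in ref) / n
    var_y = sum((b - mu_y) ** 2 for b in test) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(ref, test)) / n
    # Stabilizing constants from the usual SSIM definition (K1=0.01, K2=0.03).
    c1 = (0.01 * max_val) ** 2
    c2 = (0.03 * max_val) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    reference = [10, 20, 30, 40]   # toy "pixels", purely illustrative
    generated = [12, 18, 33, 41]
    print("PSNR (dB):", psnr(reference, generated))
    print("global SSIM:", global_ssim(reference, generated))
```

A low PSNR or SSIM against the training set, as this kind of comparison would report, indicates that generated samples are not near-copies of the training spectrograms.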

Paper Structure

This paper contains 11 sections, 6 equations, 7 figures.

Figures (7)

  • Figure 1: Synthetic spectrogram generation using U-Net-based [ronneberger2015u] diffusion architectures. The LSUN [yu2015lsun] and Waldo [Waldo] datasets are publicly available.
  • Figure 2: Diffusion model for generating spectrograms. A U-Net-based network [ronneberger2015u] reverses the noise process to synthesize realistic radar, 5G, and LTE spectrograms.
  • Figure 3: Generated spectrogram samples showing the four different categories, formed from combinations of LTE, 5G, and radar signals.
  • Figure 4: Comparison of 20 randomly generated spectrograms against all spectrograms of the Waldo dataset, presented with 95% confidence intervals. The PSNR of 10.36 dB (scale: 0 dB is least similar, $\geq$30 dB is more similar) and mean SSIM of 0.29 (scale: 0 is least similar, 1 is most similar) show that the generated spectrograms differ significantly from those of the Waldo dataset [Waldo].
  • Figure 5: Comparative visual analysis of generated spectrograms versus representative samples from the Waldo dataset. The examples include the three best and three worst matches based on PSNR and SSIM.
  • ...and 2 more figures