DenoMAE: A Multimodal Autoencoder for Denoising Modulation Signals

Atik Faysal, Taha Boushine, Mohammad Rostami, Reihaneh Gh. Roshan, Huaxia Wang, Nikhil Muralidhar, Avimanyu Sahoo, Yu-Dong Yao

TL;DR

This work tackles the challenge of denoising and classifying modulation signals under strong noise with limited labeled data. It introduces DenoMAE, a multimodal masked autoencoder that treats noise as an explicit modality and uses cross-modal reconstruction to learn robust denoising representations. Empirical results show state-of-the-art automatic modulation classification accuracy with substantially less unlabeled pretraining data and labeled fine-tuning data, along with robust performance across varying SNRs and extrapolation to unseen low-SNR regimes. The approach offers a data-efficient, flexible solution for denoising and modulation recognition in real-world, noise-heavy wireless environments.

Abstract

We propose the Denoising Masked Autoencoder (DenoMAE), a novel multimodal autoencoder framework for denoising modulation signals during pretraining. DenoMAE extends the concept of masked autoencoders by incorporating multiple input modalities, including noise as an explicit modality, to enhance cross-modal learning and improve denoising performance. The network is pretrained on unlabeled noisy modulation signals and constellation diagrams, learning to reconstruct their noiseless counterparts. DenoMAE achieves state-of-the-art accuracy in automatic modulation classification tasks with significantly fewer training samples, demonstrating a 10% reduction in unlabeled pretraining data and a 3% reduction in labeled fine-tuning data compared to existing approaches. Moreover, our model performs robustly across varying signal-to-noise ratios (SNRs) and extrapolates to unseen lower SNRs. The results indicate that DenoMAE is a flexible and data-efficient solution for denoising and classifying modulation signals in challenging noise-intensive environments.

Paper Structure (21 sections, 7 equations, 4 figures, 2 tables)

Figures (4)

  • Figure 1: DenoMAE Pretraining Strategy: We apply a random 75% masking (not to scale in the illustration) across all input modalities. The remaining 25% of visible patches are processed by a shared encoder, while each modality utilizes a dedicated decoder to reconstruct its masked patches. Only the encoder is reused for fine-tuning the downstream tasks.
  • Figure 2: Denoised outputs of DenoMAE on unlabeled constellation diagrams at different SNRs during pretraining.
  • Figure 3: Extrapolation ability of DenoMAE at much lower SNRs outside the range seen during training.
  • Figure 4: DenoMAE fine-tuned classification accuracy for constellation diagrams at different SNRs.
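
The masking step described in the Figure 1 caption can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`random_mask`, `mask_modalities`), the modality names, the patch count (196), and the embedding width (8) are hypothetical, and drawing an independent mask for each modality is an assumption, since the caption states only that 75% of patches are masked at random across all modalities.

```python
import numpy as np


def random_mask(num_patches, mask_ratio=0.75, rng=None):
    """Pick which patches stay visible under random masking.

    Returns (keep_idx, mask): keep_idx holds the sorted indices of visible
    patches; mask is a boolean array where True marks a masked patch.
    """
    rng = np.random.default_rng() if rng is None else rng
    num_keep = int(round(num_patches * (1.0 - mask_ratio)))
    perm = rng.permutation(num_patches)
    keep_idx = np.sort(perm[:num_keep])
    mask = np.ones(num_patches, dtype=bool)
    mask[keep_idx] = False
    return keep_idx, mask


def mask_modalities(modalities, mask_ratio=0.75, seed=0):
    """Apply random masking to every modality.

    `modalities` maps a name to an array of shape (num_patches, dim).
    Assumption: each modality draws its own mask independently.
    """
    rng = np.random.default_rng(seed)
    visible, masks = {}, {}
    for name, patches in modalities.items():
        keep_idx, mask = random_mask(patches.shape[0], mask_ratio, rng)
        visible[name] = patches[keep_idx]  # only these would reach the shared encoder
        masks[name] = mask                 # per-modality decoders would reconstruct these
    return visible, masks


if __name__ == "__main__":
    data_rng = np.random.default_rng(1)
    # Hypothetical modalities with placeholder patch embeddings.
    modalities = {
        "noisy_constellation": data_rng.normal(size=(196, 8)),
        "noisy_signal": data_rng.normal(size=(196, 8)),
        "noise": data_rng.normal(size=(196, 8)),
    }
    visible, masks = mask_modalities(modalities, mask_ratio=0.75)
    for name in modalities:
        print(name, visible[name].shape, int(masks[name].sum()))
```

With 196 patches and a 75% mask ratio, each modality keeps 49 visible patches and hides 147. In a full pipeline, the visible patches would be passed to the shared encoder, and each modality's decoder would be trained to reconstruct its masked patches.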