DenoMAE2.0: Improving Denoising Masked Autoencoders by Classifying Local Patches

Atik Faysal, Mohammad Rostami, Taha Boushine, Reihaneh Gh. Roshan, Huaxia Wang, Nikhil Muralidhar

TL;DR

DenoMAE2.0 addresses the problem of learning robust representations for wireless automatic modulation classification (AMC) under noise and limited labeled data by combining denoising reconstruction with a position-aware local patch classification objective. It uses a three-component architecture (encoder, denoising decoder, local-patch classifier) and trains with the joint loss $L = \lambda_{rec} L_{rec} + \lambda_{cls} L_{cls}$, where $\lambda_{rec}=1.0$ and $\lambda_{cls}=0.1$, on patch-embedded, masked constellation diagrams mapped to $224 \times 224$ RGB-like inputs. Empirically, it improves on DenoMAE in both denoising quality (SSIM/PSNR) and downstream modulation-classification accuracy, including transfer gains on RadioML (e.g., $11.83\%$ at 20 dB and $16.55\%$ at 10 dB over DenoMAE), and it remains robust across SNRs and data regimes. Ablation studies indicate that jointly optimizing the reconstruction loss and the auxiliary patch-classification loss, together with architectural choices (MLP size, decoders), is crucial for these gains, which suggests strong potential for practical, data-efficient AMC systems.
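The joint objective described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the summary gives the two weights but not the form of each term, so mean squared error is assumed for $L_{rec}$ and cross-entropy over integer patch labels (for instance, a patch's position index) is assumed for $L_{cls}$. The function names `mse_loss`, `cross_entropy`, and `joint_loss` are hypothetical.

```python
import math

LAMBDA_REC = 1.0  # reconstruction weight stated in the TL;DR
LAMBDA_CLS = 0.1  # patch-classification weight stated in the TL;DR


def mse_loss(pred, target):
    """Mean squared error over two equal-length flat sequences (assumed form of L_rec)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def cross_entropy(logits, label):
    """Cross-entropy of one logit vector against an integer class label (assumed form of L_cls)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def joint_loss(recon, clean, patch_logits, patch_labels,
               lambda_rec=LAMBDA_REC, lambda_cls=LAMBDA_CLS):
    """Return (L, L_rec, L_cls) with L = lambda_rec * L_rec + lambda_cls * L_cls."""
    l_rec = mse_loss(recon, clean)
    # Average the classification loss over the visible (unmasked) patches.
    l_cls = sum(cross_entropy(z, y)
                for z, y in zip(patch_logits, patch_labels)) / len(patch_labels)
    return lambda_rec * l_rec + lambda_cls * l_cls, l_rec, l_cls


# Toy inputs: two reconstructed pixel values and two visible patches with two classes each.
recon = [0.5, 0.0]
clean = [0.0, 0.0]
patch_logits = [[0.0, 0.0], [0.0, 0.0]]  # uniform predictions
patch_labels = [0, 1]
total, l_rec, l_cls = joint_loss(recon, clean, patch_logits, patch_labels)
```

With $\lambda_{cls}=0.1$ the classification term acts as a small auxiliary signal: in the toy inputs the reconstruction term contributes its full value while the cross-entropy term is scaled down tenfold before being added.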

Abstract

We introduce DenoMAE2.0, an enhanced denoising masked autoencoder that integrates a local patch classification objective alongside the traditional reconstruction loss to improve representation learning and robustness. Unlike conventional Masked Autoencoders (MAE), which focus solely on reconstructing missing inputs, DenoMAE2.0 introduces position-aware classification of unmasked patches, enabling the model to capture fine-grained local features while maintaining global coherence. This dual-objective approach is particularly beneficial in semi-supervised learning for wireless communication, where high noise levels and data scarcity pose significant challenges. We conduct extensive experiments on modulation signal classification across a wide range of signal-to-noise ratios (SNRs), from extremely low to moderately high, and in a low-data regime. Our results demonstrate that DenoMAE2.0 surpasses its predecessor, DenoMAE, and other baselines in both denoising quality and downstream classification accuracy. DenoMAE2.0 achieves a 1.1% accuracy improvement over DenoMAE on our dataset and accuracy gains of 11.83% and 16.55% over DenoMAE on the RadioML benchmark for constellation diagram classification of modulation signals.

Paper Structure

This paper contains 31 sections, 4 equations, 9 figures, 12 tables.

Figures (9)

  • Figure 1: An example figure.
  • Figure 2: Reconstruction performance comparison between DenoMAE and DenoMAE2.0
  • Figure 3: Latent representation visualization using t-SNE. From left to right: (1) DenoMAE without masking, (2) DenoMAE2.0 without masking, (3) DenoMAE with a masking ratio of 0.75, and (4) DenoMAE2.0 with a masking ratio of 0.75.
  • Figure 4: Confusion matrix for DenoMAE2.0 downstream classification
  • Figure 5: Epoch finetuning.
  • ...and 4 more figures