Shift-Equivariant Complex-Valued Convolutional Neural Networks

Quentin Gabot, Teck-Yian Lim, Jérémy Fix, Joana Frontera-Pons, Chengfang Ren, Jean-Philippe Ovarlez

TL;DR

This work extends provably shift-equivariant convolutional networks to complex-valued data, introducing a learnable complex-to-real projection before the Gumbel Softmax to enable end-to-end training of polyphase downsampling/upsampling in the complex domain. It establishes theoretical extensions and three key propositions for complex shift-equivariant design, and proposes both implicit and explicit projection strategies (e.g., PolyDec, MLP) to map $\boldsymbol{z} \in \mathbb{C}^N$ to real logits. Empirically, CVNNs with Learnable Polyphase Sampling (LPS) outperform real-valued and non-equivariant baselines on classification, segmentation, and reconstruction tasks on PolSAR datasets, with PolyDec often delivering a favorable balance between performance and compute. The results demonstrate that a properly engineered complex shift-equivariant framework can harness both amplitude and phase information in PolSAR data, yielding robust, invariant/equivariant representations for remote sensing applications.

Abstract

Convolutional neural networks have shown remarkable performance in recent years on various computer vision problems. However, the traditional convolutional neural network architecture lacks a critical property: shift equivariance and invariance, broken by downsampling and upsampling operations. Although data augmentation techniques can help the model learn the latter property empirically, a consistent and systematic way to achieve this goal is by designing downsampling and upsampling layers that theoretically guarantee these properties by construction. Adaptive Polyphase Sampling (APS) introduced the cornerstone for shift invariance, later extended to shift equivariance with Learnable Polyphase up/downsampling (LPS) applied to real-valued neural networks. In this paper, we extend the work on LPS to complex-valued neural networks both from a theoretical perspective and with a novel building block of a projection layer from $\mathbb{C}$ to $\mathbb{R}$ before the Gumbel Softmax. We finally evaluate this extension on several computer vision problems, specifically for either the invariance property in classification tasks or the equivariance property in both reconstruction and semantic segmentation problems, using polarimetric Synthetic Aperture Radar images.
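The selection mechanism described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the learnable projection from $\mathbb{C}$ to $\mathbb{R}$ is replaced here by a fixed $\ell_2$ norm of each polyphase component (the APS-style criterion), and the Gumbel Softmax used during training is replaced by a hard argmax. The function names `polyphase`, `complex_to_real_logits`, and `adaptive_downsample` are hypothetical, and the signal is 1D with circular shifts for simplicity.

```python
import numpy as np


def polyphase(x, stride=2):
    """Split a 1D signal into `stride` polyphase components x[k::stride]."""
    return np.stack([x[k::stride] for k in range(stride)])


def complex_to_real_logits(components):
    """Assumed stand-in for the learnable C -> R projection:
    the l2 norm of each complex component gives one real logit per component."""
    return np.linalg.norm(components, axis=1)


def adaptive_downsample(x, stride=2):
    """Keep the polyphase component with the largest real logit.
    Hard argmax here; training would relax this choice with a Gumbel Softmax."""
    comps = polyphase(x, stride)
    logits = complex_to_real_logits(comps)
    k = int(np.argmax(logits))
    return comps[k], k


rng = np.random.default_rng(0)
x = rng.standard_normal(16) + 1j * rng.standard_normal(16)  # complex-valued input

y, k = adaptive_downsample(x)
# Downsample a circularly shifted copy of the same input.
y_shift, k_shift = adaptive_downsample(np.roll(x, 1))

# The two outputs should agree up to a circular shift of the downsampled signal.
consistent = any(np.allclose(y_shift, np.roll(y, s)) for s in range(len(y)))
print("selected component:", k, "| after shift:", k_shift, "| consistent:", consistent)
```

Shifting the input by one sample only permutes the polyphase components (and circularly shifts one of them), so any selection rule whose logits do not depend on that permutation or shift picks the same content. Plain stride-2 subsampling, which always keeps `x[0::2]`, does not have this property.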

Paper Structure

This paper contains 38 sections, 41 equations, 10 figures, 6 tables.

Figures (10)

  • Figure 1: Proposed complex-valued extension of the shift-equivariant model first introduced in rojas2022learnable. During training, we retain the weight sharing, pooling operations, and stochasticity of rojas2022learnable. To handle complex-valued weights and inputs, we must project tensors to $\mathbb{R}$ before computing the Gumbel Softmax (or apply the Gumbel Softmax independently to the real and imaginary parts).
  • Figure 2: $H-\alpha$ plane separated into areas 1 to 9, each corresponding to a specific scattering mechanism, with the entropy on the x-axis and the scattering angle on the y-axis. The black line represents the boundary of physically possible $H-\alpha$ pairs.
  • Figure 3: Confusion matrix of a CVNN LPS $\mathrm{MLP}$ for classification on S1SLC'_CVDL
  • Figure 4: Confusion matrix of a CVNN LPF for classification on S1SLC'_CVDL
  • Figure 5: Results obtained with a CVNN LPS $\mathrm{PolyDec}$ for semantic segmentation on PolSF. Left: ground truth segmentation mask. Middle: the complete prediction (without a mask for the unlabeled class). Right: prediction (with a mask for the unlabeled class). Bottom: confusion matrix between ground truth and prediction
  • ...and 5 more figures
