Steinmetz Neural Networks for Complex-Valued Data

Shyam Venkatasubramanian, Ali Pezeshki, Vahid Tarokh

TL;DR

The paper introduces Steinmetz Neural Networks, a real-valued, multi-view architecture that processes complex-valued data through separate real and imaginary subnetworks followed by joint fusion, with the goal of more interpretable latent representations and better generalization. It formalizes a consistency constraint linking the two latent views and proposes the Analytic Neural Network, which enforces a Hilbert-transform relationship between them so that the real and imaginary latent components are orthogonal. An information-theoretic analysis suggests that such consistency can tighten generalization bounds, and the Hilbert consistency penalty is presented as a practical implementation. Empirical results on CV-MNIST, CV-CIFAR-10, CV-CIFAR-100, CV-FSDD, and RASPNet show that Steinmetz and Analytic networks outperform traditional RVNNs and CVNNs, with the Analytic network achieving the strongest performance and the greatest robustness to additive noise.
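The separate-then-joint processing described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the layer sizes, the ReLU activations, the random initialisation, and the names `dense`, `init_layer`, and `steinmetz_forward` are all hypothetical and are not taken from the paper.

```python
import random

def dense(x, weights, biases, relu=True):
    """One fully connected layer, optionally followed by ReLU."""
    out = [sum(w * v for w, v in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if relu else out

def init_layer(n_in, n_out, rng):
    """Hypothetical random initialisation, for illustration only."""
    weights = [[rng.gauss(0.0, 1.0 / n_in ** 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def steinmetz_forward(z, params):
    """Separate-then-joint processing of a complex-valued input vector z."""
    x_real = [c.real for c in z]               # real view of the input
    x_imag = [c.imag for c in z]               # imaginary view of the input
    h_real = dense(x_real, *params["real"])    # real subnetwork
    h_imag = dense(x_imag, *params["imag"])    # imaginary subnetwork
    # Joint fusion: a linear head on the concatenated latent views.
    return dense(h_real + h_imag, *params["joint"], relu=False)

rng = random.Random(0)
params = {
    "real": init_layer(4, 3, rng),    # 4 inputs -> 3 latent features
    "imag": init_layer(4, 3, rng),    # 4 inputs -> 3 latent features
    "joint": init_layer(6, 2, rng),   # 3 + 3 latent features -> 2 outputs
}
z = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(4)]
output = steinmetz_forward(z, params)
```

The point of the sketch is the data flow: each subnetwork sees only one view of the input, and the two latent vectors meet only in the joint head.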

Abstract

We introduce a new approach to processing complex-valued data using DNNs consisting of parallel real-valued subnetworks with coupled outputs. Our proposed class of architectures, referred to as Steinmetz Neural Networks, incorporates multi-view learning to construct more interpretable representations in the latent space. Moreover, we present the Analytic Neural Network, which incorporates a consistency penalty that encourages analytic signal representations in the latent space of the Steinmetz neural network. This penalty enforces a deterministic and orthogonal relationship between the real and imaginary components. Using an information-theoretic construction, we demonstrate that the upper bound on the generalization gap of the analytic neural network is lower than that of the general class of Steinmetz neural networks. Our numerical experiments demonstrate the improved performance and robustness to additive noise afforded by our proposed networks on benchmark datasets and synthetic examples.
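To make the idea of an analytic-signal consistency penalty concrete, the sketch below computes a discrete Hilbert transform with a naive DFT and measures how far an imaginary latent vector is from the Hilbert transform of the real one. The exact penalty used in the paper is not given in this summary, so the mean-squared form, the treatment of the latent vector as a 1-D sequence over the feature index, and the names `hilbert_transform` and `hilbert_consistency_penalty` are assumptions made for illustration.

```python
import cmath
import math

def dft(x):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def hilbert_transform(x):
    """Discrete Hilbert transform of a real sequence via the DFT."""
    n = len(x)
    shifted = []
    for k, coeff in enumerate(dft(x)):
        if k == 0 or 2 * k == n:
            shifted.append(0.0)           # DC and Nyquist bins are zeroed
        elif 2 * k < n:
            shifted.append(-1j * coeff)   # positive frequencies: multiply by -j
        else:
            shifted.append(1j * coeff)    # negative frequencies: multiply by +j
    # Inverse DFT; the result is real up to rounding, so keep the real part.
    return [sum(shifted[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

def hilbert_consistency_penalty(latent_real, latent_imag):
    """Assumed form: mean squared gap between latent_imag and H{latent_real}."""
    target = hilbert_transform(latent_real)
    return sum((a - b) ** 2 for a, b in zip(latent_imag, target)) / len(target)

n = 16
latent_real = [math.cos(2 * math.pi * 3 * t / n) for t in range(n)]
latent_imag = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]

# A cosine/sine pair forms an analytic signal, so the penalty is near zero.
penalty_analytic = hilbert_consistency_penalty(latent_real, latent_imag)
# Reusing the real view as the imaginary view violates the relationship.
penalty_mismatch = hilbert_consistency_penalty(latent_real, latent_real)
# A real sequence is orthogonal to its own Hilbert transform.
inner_product = sum(a * b for a, b in
                    zip(latent_real, hilbert_transform(latent_real)))
```

In this toy case the penalty vanishes (up to rounding) only when the imaginary view is the quadrature counterpart of the real view, which is the deterministic and orthogonal relationship the abstract refers to. In a training loop such a term would typically be added to the task loss with a weighting coefficient, which is again an assumption here.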

Paper Structure (28 sections, 11 theorems, 40 equations, 6 figures, 3 tables)

This paper contains 28 sections, 11 theorems, 40 equations, 6 figures, 3 tables.

Key Result

Corollary 3.1

Let $\mathbf{\Sigma_J}$ be the matrix of covariances of $X_R^m$, $X_I^m$ from joint-only processing, and let $\mathbf{\Sigma_S}$ be the matrix of covariances of $X_R^m$, $X_I^m$ from separate-then-joint processing. It follows that:

Figures (6)

  • Figure 1: Classical RVNN Markov chain (left) and practical implementation (right)
  • Figure 2: Steinmetz neural network Markov chain (left) and practical implementation (right)
  • Figure 3: Test performance comparison on CV-MNIST $(M = 500)$, CV-CIFAR-10 $(M = 50{,}000)$, CV-CIFAR-100 $(M = 50{,}000)$, CV-FSDD $(M = 2{,}700)$, and RASPNet $(M = 20{,}000)$ through CVNN, RVNN, Steinmetz neural network, and analytic neural network. The x-axis indicates the training epochs, while the y-axis indicates the test performance (classification accuracy and mean squared error).
  • Figure 4: Noise robustness test performance on CV-MNIST ($M = 60{,}000$) and CV-CIFAR-10 ($M = 50{,}000$) through CVNN, RVNN, Steinmetz neural network, and analytic neural network. The x-axis is the scaling factor, $\eta$, for the additive complex normal noise, while the y-axis indicates the classification accuracy.
  • Figure 5: Real-valued architectures for complex-valued data processing.
  • ...and 1 more figure

Theorems & Definitions (12)

  • Corollary 3.1
  • Theorem 4.1
  • Corollary 4.2
  • Theorem 4.3
  • Theorem 4.4
  • Corollary 5.1
  • Definition 5.2
  • Lemma A.1
  • Lemma A.2
  • Lemma A.3
  • ...and 2 more