Orthogonal Constrained Neural Networks for Solving Structured Inverse Eigenvalue Problems

Shuai Zhang; Xuelian Jiang; Hao Qian; Yingxiang Xu

TL;DR

The paper tackles algebraic Structured Inverse Eigenvalue Problems (SIEPs) with a unified, unsupervised neural framework: a multilayer perceptron with an embedded Stiefel layer (SMLP) that enforces orthogonality. A single loss function combines the nonnegativity, prescribed-entry, and row-sum constraints, while the orthogonality of $Q \in \mathcal{O}(n)$ is imposed as a hard constraint on the Stiefel manifold. The approach performs well on symmetric NIEPs and NIEPs with prescribed entries, Euclidean Distance Matrix IEPs, stochastic and generalized stochastic IEPs, and large-scale problems, including graph-theoretic IEPs and microwave-filter network transformations, with notable speedups and high convergence rates. The work provides a versatile, scalable method for solving diverse SIEPs and suggests broad applicability to other orthogonality-constrained problems in numerical linear algebra and engineering.

Abstract

This paper introduces a novel neural network for efficiently solving Structured Inverse Eigenvalue Problems (SIEPs). The main contributions are twofold. First, a unified framework is proposed that can handle various SIEP instances; in particular, an innovative method for handling nonnegativity constraints is devised using the ReLU function. Second, a novel neural network based on multilayer perceptrons and a Stiefel layer is designed to solve SIEPs efficiently. The Stiefel layer, built on an orthogonal matrix decomposition, ensures the orthogonality of the similarity transformations and thus leads to accurate solutions of SIEPs. Hence, we name this new network the Stiefel Multilayer Perceptron (SMLP). Furthermore, SMLP is an unsupervised learning approach with a lightweight structure that is easy to train. Several numerical tests from the literature and from engineering applications demonstrate the efficiency of SMLP.
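
The abstract states that the Stiefel layer secures orthogonality through an orthogonal matrix decomposition. As a minimal sketch of that idea, and not the authors' implementation, the Python below orthogonalizes an unconstrained square matrix W with modified Gram-Schmidt (which yields the Q factor of a QR decomposition) and then forms A = Q Λ Q^T, a symmetric matrix whose eigenvalues are the prescribed ones. The names `gram_schmidt_q`, `similarity`, `W`, and `lam` are hypothetical, and plain lists are used only to keep the sketch dependency-free; a library QR or SVD routine would be the usual choice in practice.

```python
import math


def gram_schmidt_q(W):
    """Orthonormalize the columns of a square matrix W (modified Gram-Schmidt).

    The result is the Q factor of a QR decomposition of W, so Q^T Q = I.
    """
    n = len(W)
    columns = [[W[i][j] for i in range(n)] for j in range(n)]
    basis = []
    for v in columns:
        v = v[:]
        # Remove the components along the directions already in the basis.
        for q in basis:
            proj = sum(a * b for a, b in zip(q, v))
            v = [a - proj * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm < 1e-12:
            raise ValueError("W must have linearly independent columns")
        basis.append([a / norm for a in v])
    # basis stores columns; convert back to a row-major matrix.
    return [[basis[j][i] for j in range(n)] for i in range(n)]


def similarity(Q, lam):
    """Form A = Q diag(lam) Q^T, whose eigenvalues are the entries of lam."""
    n = len(Q)
    return [
        [sum(Q[i][k] * lam[k] * Q[j][k] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical example: W stands in for the raw output of a linear layer.
W = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
lam = [3.0, 1.0, -1.0]  # prescribed spectrum (assumed values)
Q = gram_schmidt_q(W)
A = similarity(Q, lam)
```

Because Q is orthogonal by construction, the spectrum of A is fixed no matter what W is, so a training loss only needs to penalize violations of the remaining structural constraints.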

Paper Structure (41 sections, 53 equations, 15 figures, 8 tables, 1 algorithm)

This paper contains 41 sections, 53 equations, 15 figures, 8 tables, 1 algorithm.

Introduction
Related works
The solvability of the IEP.
The computability of the IEP.
Establishment of the loss for the IEP.
Neural networks for the IEP in PDEs.
Preliminaries
Classification of matrices
Nonnegative matrix.
Stochastic matrix.
Generalized stochastic matrix.
Euclidean distance matrix (EDM).
Matrix decomposition
QR decomposition.
Singular value decomposition (SVD).
...and 26 more sections

Figures (15)

Figure 1: Classification of SIEPs under consideration.
Figure 2: Structure of the MLP, a fully connected network composed of an input layer, hidden layers, and an output layer.
Figure 3: Schematic diagram of the SMLP architecture, a typical MLP with embedded orthogonality constraints on the Stiefel manifold. In the Stiefel layer, the output features from the previous layer are first linearly combined to generate new feature representations, then processed by the orthogonal operator $\boldsymbol{\Psi}$ so that they strictly satisfy the orthogonality constraints on the Stiefel manifold, and finally output as $\hat{Q}$. The nonnegativity constraint is expressed by $loss_{nonneg}:=\frac{1}{2}\| S \circ (Q \Lambda Q^T) - \text{ReLU}(S \circ (Q \Lambda Q^T)) \|_F^2$, the prescribed-entries constraint by $loss_{spec}:=\frac{1}{2}\|\Omega - \{1-\kappa_1, \kappa_1 \cdot \text{ReLU}\}(\Gamma \circ (Q \Lambda Q^T))\|_F^2$, and the row-sum constraint by $loss_{row}:=\frac{1}{2}\|\alpha - \beta\|_F^2$. The total $\textbf{Loss} = \kappa_1 \cdot loss_{nonneg} + loss_{spec} + \kappa_2 \cdot loss_{row}$ is determined by the choice of $\kappa_1$ and $\kappa_2$. The network parameters are updated by minimizing the Loss.
Figure 4: Comparison of (a) Loss with one hidden layer; (b) Loss with two hidden layers for various combinations of different orthogonal decomposition methods and different activation functions.
Figure 5: Comparison of (a) $\text{Loss}_{\text{nonneg}}$ with one hidden layer; (b) $\text{Loss}_{\text{nonneg}}$ with two hidden layers for various combinations of different orthogonal decomposition methods and different activation functions.
...and 10 more figures
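
The nonnegativity term in the Figure 3 caption, $loss_{nonneg}=\frac{1}{2}\| S \circ A - \text{ReLU}(S \circ A) \|_F^2$ with $A = Q \Lambda Q^T$, can be illustrated with a small sketch. It assumes that $S$ is a 0/1 mask selecting the entries required to be nonnegative (the caption does not define $S$), and the function name `loss_nonneg` and the example matrices are hypothetical. Since $x - \text{ReLU}(x)$ is zero for $x \ge 0$ and equals $x$ otherwise, the term is half the sum of squares of the negative masked entries.

```python
def relu(x):
    """ReLU on a scalar: max(x, 0)."""
    return x if x > 0.0 else 0.0


def loss_nonneg(A, S):
    """0.5 * || S o A - ReLU(S o A) ||_F^2, where 'o' is the entrywise product.

    A and S are same-shaped matrices given as lists of rows.
    S is assumed to be a 0/1 mask (an assumption of this sketch).
    """
    total = 0.0
    for row_a, row_s in zip(A, S):
        for a, s in zip(row_a, row_s):
            masked = s * a
            total += (masked - relu(masked)) ** 2
    return 0.5 * total


# Hypothetical candidate matrix with two negative off-diagonal entries.
A_example = [[1.0, -2.0], [-2.0, 3.0]]
S_all = [[1.0, 1.0], [1.0, 1.0]]    # constrain every entry
S_diag = [[1.0, 0.0], [0.0, 1.0]]   # constrain only the diagonal

print(loss_nonneg(A_example, S_all))   # 0.5 * (4 + 4) = 4.0
print(loss_nonneg(A_example, S_diag))  # 0.0, the diagonal is nonnegative
```

The term vanishes exactly when every masked entry is nonnegative, which is what lets it act as a penalty without any labeled data.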
