Orthogonal Transforms in Neural Networks Amount to Effective Regularization

Krzysztof Zając; Wojciech Sopot; Paweł Wachel

TL;DR

The paper tackles nonlinear system identification by embedding frequency-domain inductive bias into neural networks via a dual-branch architecture that pairs a time-domain path with an orthogonal-transform path, exemplified by a Fourier transform. It proves that both branches are universal approximators and shows that the orthogonal path induces per-parameter gradient scaling, effectively regularizing training. Empirically, the Fourier-based FSNN outperforms non-orthogonal baselines on static frequency-input tasks and provides competitive results on Wiener-Hammerstein and Silverbox benchmarks, illustrating the practical benefits and limitations of the approach. Overall, the work demonstrates that orthogonal transforms can expand neural network capabilities for physical system identification while offering a principled regularization mechanism and a route to task-specialized models.

Abstract

We consider applications of neural networks in nonlinear system identification and hypothesize that adjusting the general network structure by incorporating frequency information, or another known orthogonal transform, should result in an efficient neural network that retains its universal properties. We show that such a structure is a universal approximator and that using any orthogonal transform in the proposed way implies regularization during training by adjusting the learning rate of each parameter individually. In particular, we show empirically that such a structure, using the Fourier transform, outperforms equivalent models without orthogonality support.
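The two properties the abstract relies on can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's FSNN: the single linear branch `y = W Q x`, the squared-error loss, and the names `Q`, `W`, `target`, and `residual` are all hypothetical. It verifies that the unitary DFT preserves the Euclidean norm of a signal, and that the weight gradient in a transformed branch equals the untransformed gradient multiplied by the transpose of the orthogonal matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8

# Unitary DFT matrix: with norm="ortho" the transform is scaled by 1/sqrt(n),
# so F^H F = I.
F = np.fft.fft(np.eye(n), norm="ortho")
x = rng.standard_normal(n)

# A unitary transform preserves the Euclidean norm (Parseval's identity).
norm_time = np.linalg.norm(x)
norm_freq = np.linalg.norm(F @ x)

# Real orthogonal stand-in Q (assumption: taken from a QR factorization so that
# the gradients below stay real-valued).
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
W = rng.standard_normal((3, n))
target = rng.standard_normal(3)

# Hypothetical linear branch y = W Q x with loss 0.5 * ||y - target||^2.
residual = W @ Q @ x - target
grad_orth = np.outer(residual, Q @ x)  # dL/dW when the input is transformed
grad_time = np.outer(residual, x)      # same residual, untransformed input

print(np.isclose(norm_time, norm_freq))
print(np.allclose(grad_orth, grad_time @ Q.T))
```

In this toy setting the transform leaves the residual term unchanged and only rotates the input factor of the gradient, which is one simple way to see how an orthogonal transform can reweight the updates of individual parameters.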

Paper Structure (15 sections, 18 equations, 4 figures, 3 tables)

Introduction
Dual-Orthogonal Neural Network
Dual-Orthogonal Block
Frequency-Supported Block
$N$-Step Ahead Prediction
Theoretical Properties
Orthogonal Branch as Universal Approximator
Gradients in Orthogonal Branch
Numerical Experiments
Hyperparameter Search
Evaluation
Static System with Frequency Input
Wiener-Hammerstein Benchmark
Silverbox Benchmark
Discussion

Figures (4)

Figure 1: Schematic representation of a dual-orthogonal block structure
Figure 2: Simulation error computed for best FSNN model on the test dataset for a static affine system with frequency input
Figure 3: Simulation error computed for best FSNN model on the test dataset for Wiener-Hammerstein benchmark
Figure 4: Simulation error computed for best FSNN model on the test dataset for Silverbox benchmark
