Orthogonal Transforms in Neural Networks Amount to Effective Regularization

Krzysztof Zając, Wojciech Sopot, Paweł Wachel

TL;DR

The paper tackles nonlinear system identification by embedding frequency-domain inductive bias into neural networks via a dual-branch architecture that pairs a time-domain path with an orthogonal-transform path, exemplified by a Fourier transform. It proves that both branches are universal approximators and shows that the orthogonal path induces per-parameter gradient scaling, effectively regularizing training. Empirically, the Fourier-based FSNN outperforms non-orthogonal baselines on static frequency-input tasks and provides competitive results on Wiener-Hammerstein and Silverbox benchmarks, illustrating the practical benefits and limitations of the approach. Overall, the work demonstrates that orthogonal transforms can expand neural network capabilities for physical system identification while offering a principled regularization mechanism and a route to task-specialized models.

Abstract

We consider applications of neural networks in nonlinear system identification and hypothesize that adjusting the general network structure by incorporating frequency information, or another known orthogonal transform, should result in an efficient neural network that retains its universal properties. We show that such a structure is a universal approximator and that using any orthogonal transform in the proposed way implies regularization during training by adjusting the learning rate of each parameter individually. In particular, we show empirically that such a structure, using the Fourier transform, outperforms equivalent models without orthogonality support.

Paper Structure

This paper contains 15 sections, 18 equations, 4 figures, and 3 tables.

Figures (4)

  • Figure 1: Schematic representation of a dual-orthogonal block structure
  • Figure 2: Simulation error computed for the best FSNN model on the test dataset for a static affine system with frequency input
  • Figure 3: Simulation error computed for the best FSNN model on the test dataset for the Wiener-Hammerstein benchmark
  • Figure 4: Simulation error computed for the best FSNN model on the test dataset for the Silverbox benchmark