Orthogonal Transforms in Neural Networks Amount to Effective Regularization
Krzysztof Zając, Wojciech Sopot, Paweł Wachel
TL;DR
The paper tackles nonlinear system identification by embedding frequency-domain inductive bias into neural networks via a dual-branch architecture that pairs a time-domain path with an orthogonal-transform path, exemplified by a Fourier transform. It proves that both branches are universal approximators and shows that the orthogonal path induces per-parameter gradient scaling, effectively regularizing training. Empirically, the Fourier-based FSNN outperforms non-orthogonal baselines on static frequency-input tasks and provides competitive results on Wiener-Hammerstein and Silverbox benchmarks, illustrating the practical benefits and limitations of the approach. Overall, the work demonstrates that orthogonal transforms can expand neural network capabilities for physical system identification while offering a principled regularization mechanism and a route to task-specialized models.
Abstract
We consider applications of neural networks in nonlinear system identification and hypothesize that adjusting the general network structure to incorporate frequency information, or another known orthogonal transform, should yield an efficient neural network that retains its universal approximation properties. We show that such a structure is a universal approximator and that using any orthogonal transform in the proposed way implies regularization during training by adjusting the learning rate of each parameter individually. In particular, we show empirically that such a structure using the Fourier transform outperforms equivalent models without orthogonality support.
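
To make the dual-branch idea concrete, the following is a minimal NumPy sketch of a forward pass that sums a time-domain path and a Fourier-transform path. It is an illustration under stated assumptions, not the authors' implementation: the layer sizes, the tanh activation, the shared linear readout, and the names `orthonormal_dft_features`, `init_params`, and `dual_branch_forward` are all hypothetical. The only property it relies on is that the DFT with 1/sqrt(n) scaling is unitary, so the transformed input has the same Euclidean norm as the original.

```python
import numpy as np


def orthonormal_dft_features(x):
    """Apply the unitary DFT and stack the real and imaginary parts."""
    # norm="ortho" applies the 1/sqrt(n) scaling that makes the DFT unitary.
    spectrum = np.fft.fft(x, norm="ortho")
    return np.concatenate([spectrum.real, spectrum.imag])


def init_params(n_in, n_hidden, rng):
    """Hypothetical layout: one hidden layer per branch and a shared readout."""
    return {
        "W_time": rng.normal(scale=0.1, size=(n_hidden, n_in)),
        "W_freq": rng.normal(scale=0.1, size=(n_hidden, 2 * n_in)),
        "b_time": np.zeros(n_hidden),
        "b_freq": np.zeros(n_hidden),
        "w_out": rng.normal(scale=0.1, size=2 * n_hidden),
    }


def dual_branch_forward(x, params):
    """Combine a time-domain branch with an orthogonal-transform branch."""
    # Time-domain branch: acts on the raw input window.
    h_time = np.tanh(params["W_time"] @ x + params["b_time"])
    # Transform branch: acts on the orthonormal Fourier features of the same window.
    z = orthonormal_dft_features(x)
    h_freq = np.tanh(params["W_freq"] @ z + params["b_freq"])
    # Assumed readout: a single linear layer over both hidden representations.
    return params["w_out"] @ np.concatenate([h_time, h_freq])


rng = np.random.default_rng(0)
x = rng.normal(size=16)  # one input window of 16 samples
params = init_params(n_in=16, n_hidden=8, rng=rng)

features = orthonormal_dft_features(x)
y = dual_branch_forward(x, params)

# Norm preservation (Parseval) holds because the transform is unitary.
print(np.isclose(np.linalg.norm(features), np.linalg.norm(x)))
```

Stacking the real and imaginary parts keeps all weights real-valued while preserving the norm of the input. Any other orthogonal transform could be substituted for `orthonormal_dft_features` in this sketch without changing the rest of the code.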
