Prior-Informed Neural Network Initialization: A Spectral Approach for Function Parameterizing Architectures

David Orlando Salazar Torres; Diyar Altinses; Andreas Schwung

Abstract

Neural network architectures designed for function parameterization, such as the Bag-of-Functions (BoF) framework, bridge the gap between the expressivity of deep learning and the interpretability of classical signal processing. However, these models are inherently sensitive to parameter initialization, as traditional data-agnostic schemes fail to capture the structural properties of the target signals, often leading to suboptimal convergence. In this work, we propose a prior-informed design strategy that leverages the intrinsic spectral and temporal structure of the data to guide both network initialization and architectural configuration. A principled methodology is introduced that uses the Fast Fourier Transform to extract dominant seasonal priors, informing model depth and initial states, and a residual-based regression approach to parameterize trend components. Crucially, this structural alignment enables a substantial reduction in encoder dimensionality without compromising reconstruction fidelity. A supporting theoretical analysis provides guidance on trend estimation under finite-sample regimes. Extensive experiments on synthetic and real-world benchmarks demonstrate that embedding data-driven priors significantly accelerates convergence, reduces performance variability across trials, and improves computational efficiency. Overall, the proposed framework enables more compact and interpretable architectures while outperforming standard initialization baselines, without altering the core training procedure.
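As a rough illustration of the spectral step described in the abstract, the following minimal sketch averages the periodogram over a batch of signals, picks the strongest frequency bins, and reports the share of energy they carry. The function name `dominant_modes`, the exclusion of the DC bin, and the definition of the energy ratio as "selected bin power over total non-DC power" are assumptions made for this example; the paper's own frequency selection rule and its $\rho_{\text{spec}}$ definition may differ.

```python
import numpy as np

def dominant_modes(signals, fs, k):
    """Hypothetical helper: k strongest frequencies of the average periodogram.

    signals: array of shape (num_samples, length); fs: sampling rate.
    Returns (frequencies, energy_ratio), where energy_ratio is the share of
    non-DC power held by the k selected bins (an assumed definition).
    """
    signals = np.asarray(signals, dtype=float)
    # Remove each signal's mean so the DC bin does not dominate the ranking.
    centered = signals - signals.mean(axis=1, keepdims=True)
    spectrum = np.fft.rfft(centered, axis=1)
    # Average periodogram across all signals in the batch.
    power = (np.abs(spectrum) ** 2).mean(axis=0)
    freqs = np.fft.rfftfreq(signals.shape[1], d=1.0 / fs)
    power[0] = 0.0  # ignore any remaining DC energy
    # Indices of the k largest bins, strongest first.
    top = np.argsort(power)[-k:][::-1]
    energy_ratio = power[top].sum() / power.sum()
    return freqs[top], energy_ratio
```

In a setup like this, the number of selected modes `k` could inform the number of seasonal stages, and the returned frequencies could seed the corresponding initial parameters; a low energy ratio would suggest that the chosen modes do not explain much of the signal.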

Paper Structure (25 sections, 1 theorem, 39 equations, 10 figures, 5 tables, 1 algorithm)

Introduction
Related Work
Function Parameterization
Informed Data-driven Initialization
Background and Problem Formulation
Informed Initialization Framework
Estimating Seasonal Parameters via Fourier Spectrum
Spectral Modeling and Frequency Selection
Spectral Energy Ratio as a Diagnostic Metric
Sample Size Requirements for Linear Regression in Noisy Time Series
Model Setup
Estimation Error of the Slope
Concentration Bounds and Finite-Sample Guarantees
Bias Estimation
Implications for Architecture Design
...and 10 more sections

Key Result

Theorem 6.1

Under the stated assumptions, for any tolerance level $\delta > 0$,

Figures (10)

Figure 1: Overview of the proposed data-informed framework. Extracted seasonal and trend statistics serve as priors that jointly determine the architectural topology, including stage depth and input dimensions, and initialize the Bag-of-Functions encoder weights to align the model with the data structure before training.
Figure 2: Spectral analysis of the synthetic dataset used for prior extraction. The average periodogram identifies dominant modes at $3.48$ Hz, $7.56$ Hz, and $12.29$ Hz, complemented by the spectral heatmap across all $N = 2{,}000$ samples. The cumulative energy ratio reaches $\rho_{\text{spec}} = 0.873$ by combining these three identified modes.
Figure 3: Visual validation of the trend estimation on two dataset samples. The plots display the input signal (gray), the deseasonalized input (blue), and the ground truth trend (green dotted). The red markers indicate the specific observations selected for the OLS regression, yielding the estimated trend (red dashed).
Figure 4: Spectral analysis of the PJM Hourly dataset used for prior extraction. The average periodogram identifies dominant modes at $6.97$ and $13.99$ cycles/week, while the accompanying spectral heatmap displays the energy distribution across all 156 samples. The cumulative energy ratio reaches $\rho_{\text{spec}} = 0.960$ by combining the two identified dominant modes.
Figure 5: Spectral analysis of the Thermal Power Plant dataset used for prior extraction. The average periodogram identifies distributed dominant modes at $1.42$, $6.95$, $14.00$, and $21.20$ cycles/week, complemented by the spectral heatmap across all 1000 samples. The cumulative energy ratio reaches $\rho_{\text{spec}} = 0.991$ by combining the four identified modes.
...and 5 more figures
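
The trend estimation summarized in the Figure 3 caption (deseasonalize, then fit a line by OLS) can be sketched as follows. This is a simplified two-step version under stated assumptions: the seasonal frequencies are taken as known, the seasonal part is removed with a least-squares harmonic fit, and the line is fitted to all remaining points rather than to a selected subset of observations as in the figure. The name `estimate_trend` and this particular deseasonalization are illustrative and not taken from the paper.

```python
import numpy as np

def estimate_trend(t, y, freqs):
    """Hypothetical helper: OLS line fit on a deseasonalized signal.

    t: time stamps; y: observations; freqs: assumed-known seasonal frequencies.
    Returns (slope, intercept, deseasonalized_signal).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    # Harmonic design matrix: a constant column plus a sin/cos pair per frequency.
    cols = [np.ones_like(t)]
    for f in freqs:
        cols.append(np.sin(2 * np.pi * f * t))
        cols.append(np.cos(2 * np.pi * f * t))
    design = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    # Subtract only the oscillatory part so level and trend stay in the residual.
    seasonal = design[:, 1:] @ coef[1:]
    deseasonalized = y - seasonal
    # Ordinary least squares for a straight line on the residual.
    slope, intercept = np.polyfit(t, deseasonalized, deg=1)
    return slope, intercept, deseasonalized
```

Because a linear trend is not exactly orthogonal to the sinusoids over a finite window, the first step can absorb a small part of the trend; this leakage shrinks as the window covers more periods.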

Theorems & Definitions (2)

Theorem 6.1
proof
