Tuning the Frequencies: Robust Training for Sinusoidal Neural Networks

Tiago Novello, Diana Aldana, Andre Araujo, Luiz Velho

TL;DR

The paper presents TUNER, a theory-grounded training approach for sinusoidal INRs that combines a novel amplitude-phase expansion with robust spectral control. By showing that layer compositions generate many frequencies as integer combinations of input frequencies, it enables a spectral-sampling initialization and a principled bandlimit bound during training. The method yields faster, more stable convergence and improved gradient reconstructions compared to baselines like SIREN and BACON, while reducing ringing artifacts through a soft spectral filter mechanism. These contributions advance the practical reliability and expressiveness of sinusoidal MLPs for implicit representations of signals. The work paves the way for extending frequency-controlled INR training to deeper nets and broader signal domains.

Abstract

Sinusoidal neural networks have been shown to be effective as implicit neural representations (INRs) of low-dimensional signals, due to their smoothness and high representation capacity. However, initializing and training them remain empirical tasks that lack a deeper understanding to guide the learning process. To fill this gap, our work introduces a theoretical framework that explains the capacity property of sinusoidal networks and offers robust control mechanisms for initialization and training. Our analysis is based on a novel amplitude-phase expansion of the sinusoidal multilayer perceptron, showing how its layer compositions produce a large number of new frequencies expressed as integer combinations of the input frequencies. This relationship can be directly used to initialize the input neurons, as a form of spectral sampling, and to bound the network's spectrum while training. Our method, referred to as TUNER (TUNing sinusoidal nEtwoRks), greatly improves the stability and convergence of sinusoidal INR training, leading to detailed reconstructions, while preventing overfitting.

Paper Structure

This paper contains 13 sections, 2 theorems, 8 equations, 13 figures, 3 tables.

Key Result

Theorem 1

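Each hidden neuron $h_i$ of a 3-layer sinusoidal MLP has an amplitude-phase expansion of the form

$$h_i(x) = \sum_{\textbf{k}\in\mathbb{Z}^m} \alpha_{\textbf{k}} \sin\big(\left\langle\beta_{\textbf{k}}, x\right\rangle + \lambda_{\textbf{k}}\big),$$

where $\beta_{\textbf{k}} = \left\langle\textbf{k},\omega\right\rangle$, $\lambda_{\textbf{k}} = \left\langle\textbf{k},\varphi\right\rangle + b_i$, and $\alpha_{\textbf{k}} = \prod_j J_{k_j}(W_{ij})$ is a product of Bessel functions of the first kind. The sum runs over integer vectors $\textbf{k}$ with one entry $k_j$ per input neuron ($m$ entries in total).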
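The expansion can be checked numerically. The sketch below assumes a one-dimensional input and a hidden neuron of the form $h_i(x) = \sin\big(\sum_j W_{ij}\sin(\omega_j x + \varphi_j) + b_i\big)$, which is the usual sinusoidal MLP layout but is not spelled out in this summary. It compares that neuron against the truncated sum $\sum_{\textbf{k}} \alpha_{\textbf{k}}\sin(\beta_{\textbf{k}} x + \lambda_{\textbf{k}})$. The function names, the truncation `kmax`, and the sample parameter values are illustrative choices, not taken from the paper.

```python
import math
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def bessel_j(k, a, steps=2000):
    """J_k(a) for integer k via the integral representation
    J_k(a) = (1/pi) * int_0^pi cos(k*t - a*sin(t)) dt, using the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.cos(k * t - a * math.sin(t))
    return total * h / math.pi


def hidden_neuron(x, omega, phi, w, b):
    """Assumed hidden neuron: sin(sum_j w_j * sin(omega_j * x + phi_j) + b)."""
    inner = sum(wj * math.sin(oj * x + pj) for wj, oj, pj in zip(w, omega, phi))
    return math.sin(inner + b)


def expansion(x, omega, phi, w, b, kmax=12):
    """Truncated amplitude-phase expansion over all k with |k_j| <= kmax."""
    total = 0.0
    for k in product(range(-kmax, kmax + 1), repeat=len(w)):
        alpha = math.prod(bessel_j(kj, wj) for kj, wj in zip(k, w))  # amplitude
        beta = sum(kj * oj for kj, oj in zip(k, omega))              # frequency <k, omega>
        lam = sum(kj * pj for kj, pj in zip(k, phi)) + b             # phase <k, phi> + b
        total += alpha * math.sin(beta * x + lam)
    return total


if __name__ == "__main__":
    # Illustrative parameters (hypothetical, two input neurons).
    omega, phi, w, b = (1.0, 3.0), (0.3, -0.7), (0.8, -1.5), 0.2
    for x in (-2.0, 0.0, 0.5, 1.7):
        print(x, hidden_neuron(x, omega, phi, w, b), expansion(x, omega, phi, w, b))
```

Because $|J_k(a)|$ decays quickly as $|k|$ grows for moderate $|a|$, a small `kmax` is enough here. This decay is also why the amplitudes of the generated frequencies can be controlled through the hidden weights.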

Figures (13)

  • Figure 1: We present TUNER, a robust and theoretically grounded training technique for sinusoidal MLPs, overcoming challenges in initialization and enabling bandlimiting control. Our experiments showcase TUNER's strong initialization results against ReLU, FFM (Tancik et al., 2020), and SIREN (Sitzmann et al., 2020) (top), where all models use the same size and training conditions. TUNER achieves both fast and stable convergence (bottom-left) while reconstructing gradients without noise. We also compare with BACON (Lindell et al., 2022) across two bandlimits (bottom-right), enhancing quality and avoiding ringing artifacts.
  • Figure 2: Overview of TUNER. To train a sinusoidal MLP (gray model, top-left), we employ two techniques derived from Theorems 1 and 2. First, we initialize the input frequencies $\omega$ (green, bottom-left) with a dense distribution of low frequencies (red square) and a sparse distribution of higher frequencies (green grid). This initialization gives flexibility to learn the remaining signal frequencies, which are simply integer combinations of $\omega$ (the yellow nodes on the right), a consequence of the amplitude-phase expansion given by Theorem 1. Note that this initialization resembles a frequency sampling, since training generates those new frequencies around $\omega$. Second, we bound the coefficients of the hidden layer weights (blue nodes) to ensure that the MLP remains within a specified bandlimit. This approach is effective because the amplitude-phase expansion (shown on the right) of each hidden neuron (purple nodes) indicates that the amplitudes of the generated frequencies have an upper bound depending only on the hidden weights (blue, bottom-right).
  • Figure 3: Choosing $\omega$ as the Cartesian product of the odd frequencies without (left) or with (right) the frequencies $(1,0)$ and $(0,1)$. Note that adding them prevents sub-periods (see Supp. Mat.). We trained for $3000$ epochs on a $256^2$ image with network parameters $m=72$, $n=512$, and $\mathscr{b}=30$.
  • Figure 4: Uniform initialization of $\omega$ (top) versus ours (bottom). The grayscale bands show the INR's gradient. Our initialization yields better signal and gradient reconstruction without gradient supervision. The MLPs ($m=128$, $n=256$) with bandlimit $\mathscr{b}=87$ were trained for $3000$ epochs.
  • Figure 5: Image reconstructions with bandlimit $B=128$, varying $\mathscr{b}=4,43,128$, with a network ($m=80$, $n=1000$) trained over $3000$ epochs. The gradient magnitude is shown on the left of each image. Note that a smaller $\mathscr{b}$ (middle-left) yields better reconstructions, while higher values of $\mathscr{b}$ introduce noisy gradients (right).
  • ...and 8 more figures
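
The Figure 3 caption notes that adding $(1,0)$ and $(0,1)$ to an odd-only choice of $\omega$ prevents sub-periods. One plausible reading, sketched below as an assumption rather than the paper's own argument, follows from the network's frequencies being integer combinations of $\omega$. If every input frequency has two odd coordinates, every generated frequency $(f_x, f_y)$ has an even coordinate sum, so each term $\sin(f_x x + f_y y + \lambda)$ is unchanged by the shift $(x, y) \mapsto (x+\pi, y+\pi)$. The signal then repeats inside the $2\pi$-periodic domain. The helper name and the tiny frequency set $\{1, 3\}$ are illustrative only.

```python
from itertools import product


def generated_frequencies(omega, kmax):
    """All integer combinations sum_j k_j * omega_j with |k_j| <= kmax,
    for a list of 2D integer input frequencies omega."""
    freqs = set()
    for k in product(range(-kmax, kmax + 1), repeat=len(omega)):
        fx = sum(kj * w[0] for kj, w in zip(k, omega))
        fy = sum(kj * w[1] for kj, w in zip(k, omega))
        freqs.add((fx, fy))
    return freqs


# Hypothetical small example: Cartesian product of the odd frequencies {1, 3}.
odd = [1, 3]
omega_odd = list(product(odd, odd))          # (1,1), (1,3), (3,1), (3,3)
omega_full = omega_odd + [(1, 0), (0, 1)]    # same set plus the two axis frequencies

gen_odd = generated_frequencies(omega_odd, kmax=2)
gen_full = generated_frequencies(omega_full, kmax=1)

# Odd-only inputs: every generated frequency has an even coordinate sum,
# so axis frequencies such as (1, 0) are unreachable.
only_even_sums = all((fx + fy) % 2 == 0 for fx, fy in gen_odd)

if __name__ == "__main__":
    print("odd-only, all coordinate sums even:", only_even_sums)
    print("(1, 0) reachable from odd-only set:", (1, 0) in gen_odd)
    print("(1, 0) reachable after adding axis frequencies:", (1, 0) in gen_full)
```

With the two axis frequencies included, the integer combinations can reach every point of the integer grid, which removes the shift symmetry described above.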

Theorems & Definitions (2)

  • Theorem 1
  • Theorem 2