Fast Training of Sinusoidal Neural Fields via Scaling Initialization

Taesun Yeom, Sangyoon Lee, Jaeho Lee

TL;DR

This work tackles the slow training of sinusoidal neural fields (SNFs) by introducing weight scaling (WS), a simple initialization that multiplies non-final layer weights by a factor $\alpha \ge 1$. Through Fourier/Bessel analyses and empirical neural tangent kernel (eNTK) studies, WS is shown to increase higher-frequency content and yield a better-conditioned optimization path, enabling up to $10\times$ faster convergence across diverse data domains while preserving generalization at moderate $\alpha$. The authors compare WS against standard SNF initialization and multiple baselines, demonstrating robust speedups and practical guidance for selecting $\alpha$ based on workload structure. The results advocate rethinking neural-field initialization as a critical lever for efficiency, with implications for broader NF architectures and activation families. Limitations include the focus on periodic activations and the absence of formal convergence guarantees, pointing to future work on theoretical analysis and extension to other NF paradigms.
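
The weight scaling idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' code. It assumes the commonly used SIREN-style initialization with the frequency factor folded into the weights (first layer uniform in $\pm\omega_0/n$, later layers uniform in $\pm\sqrt{6/n}$, with $\omega_0 = 30$) and zero biases. The names `init_snf` and `snf_forward` are hypothetical.

```python
import numpy as np

def init_snf(layer_sizes, alpha=1.0, omega0=30.0, seed=0):
    """Return a list of (W, b) pairs for a sinusoidal neural field.

    Assumption: SIREN-style uniform bounds with omega0 folded into the
    first-layer weights, and zero biases (a simplification).
    """
    rng = np.random.default_rng(seed)
    n_layers = len(layer_sizes) - 1
    params = []
    for i in range(n_layers):
        fan_in, fan_out = layer_sizes[i], layer_sizes[i + 1]
        # First layer: U(-omega0/n, omega0/n); other layers: U(-sqrt(6/n), sqrt(6/n)).
        bound = omega0 / fan_in if i == 0 else np.sqrt(6.0 / fan_in)
        W = rng.uniform(-bound, bound, size=(fan_in, fan_out))
        # Weight scaling: multiply every layer except the last by alpha.
        if i < n_layers - 1:
            W = alpha * W
        params.append((W, np.zeros(fan_out)))
    return params

def snf_forward(params, x):
    """Apply sine activations on all layers except the final linear layer."""
    h = x
    for W, b in params[:-1]:
        h = np.sin(h @ W + b)
    W, b = params[-1]
    return h @ W + b

# Example: a 1D field with two hidden layers and alpha = 2.
coords = np.linspace(-1.0, 1.0, 8).reshape(-1, 1)
outputs = snf_forward(init_snf([1, 16, 16, 1], alpha=2.0), coords)
```

Setting `alpha=1.0` recovers the assumed standard initialization, so the same function can serve as both the baseline and the scaled variant.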

Abstract

Neural fields are an emerging paradigm that represents data as continuous functions parameterized by neural networks. Despite many advantages, neural fields often have a high training cost, which prevents broader adoption. In this paper, we focus on a popular family of neural fields, called sinusoidal neural fields (SNFs), and study how they should be initialized to maximize the training speed. We find that the standard initialization scheme for SNFs -- designed based on the signal propagation principle -- is suboptimal. In particular, we show that by simply multiplying each weight (except for the last layer) by a constant, we can accelerate SNF training by $10\times$. This method, coined $\textit{weight scaling}$, consistently provides a significant speedup over various data domains, allowing the SNFs to train faster than more recently proposed architectures. To understand why weight scaling works well, we conduct extensive theoretical and empirical analyses which reveal that weight scaling not only resolves the spectral bias quite effectively but also enjoys a well-conditioned optimization trajectory.
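
The conditioning claim can be probed numerically on a toy problem. The sketch below is illustrative only and is not the authors' experimental setup. It builds the empirical neural tangent kernel $K = JJ^\top$ from a finite-difference Jacobian of a tiny SNF and reports its condition number at initialization. The architecture `SIZES`, the SIREN-style bounds with $\omega_0 = 30$, the zero biases, and the names `init_flat`, `forward`, and `entk` are all assumptions.

```python
import numpy as np

SIZES = [1, 8, 8, 1]  # hypothetical tiny architecture: 1D input, scalar output

def init_flat(alpha, omega0=30.0, seed=0):
    """Flat parameter vector; non-final weights are multiplied by alpha."""
    rng = np.random.default_rng(seed)
    n_layers = len(SIZES) - 1
    chunks = []
    for i in range(n_layers):
        fan_in, fan_out = SIZES[i], SIZES[i + 1]
        bound = omega0 / fan_in if i == 0 else np.sqrt(6.0 / fan_in)
        W = rng.uniform(-bound, bound, size=(fan_in, fan_out))
        if i < n_layers - 1:
            W *= alpha
        chunks += [W.ravel(), np.zeros(fan_out)]  # zero biases (assumption)
    return np.concatenate(chunks)

def forward(theta, x):
    """Evaluate the SNF for a flat parameter vector; returns shape (n,)."""
    h, k = x, 0
    n_layers = len(SIZES) - 1
    for i in range(n_layers):
        fan_in, fan_out = SIZES[i], SIZES[i + 1]
        W = theta[k:k + fan_in * fan_out].reshape(fan_in, fan_out)
        k += fan_in * fan_out
        b = theta[k:k + fan_out]
        k += fan_out
        h = h @ W + b
        if i < n_layers - 1:
            h = np.sin(h)
    return h[:, 0]

def entk(theta, x, eps=1e-6):
    """Empirical NTK K = J J^T, with J from central finite differences."""
    J = np.empty((x.shape[0], theta.size))
    for p in range(theta.size):
        step = np.zeros_like(theta)
        step[p] = eps
        J[:, p] = (forward(theta + step, x) - forward(theta - step, x)) / (2 * eps)
    return J @ J.T

x = np.linspace(-1.0, 1.0, 16)[:, None]
for alpha in (1.0, 2.0):
    K = entk(init_flat(alpha), x)
    print(f"alpha={alpha}: eNTK condition number = {np.linalg.cond(K):.3e}")
```

The finite-difference Jacobian keeps the sketch dependency-free; an automatic-differentiation library would be the more practical choice for networks of realistic size. The printed numbers for such a small network at initialization should not be read as a reproduction of the paper's results.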

Paper Structure (42 sections, 7 theorems, 41 equations, 23 figures, 4 tables)

Key Result

Proposition 1

Consider an $l$-layer SNF initialized with the weight scaling (eq:ws_init). For any $\alpha \ge 1$, we have, in an approximate sense,

Figures (23)

  • Figure 1: A simple weight scaling accelerates training. The proposed weight scaling scales up the initial weights of an SNF by a factor of $\alpha$, except for the last layer (left panel). The weight-scaled SNF significantly speeds up training across a variety of methods (right panel: train PSNR curve for a single Kodak image).
  • Figure 2: Scaling factors and the speed-generalization tradeoff. (a) As we increase the scaling factor $\alpha$, the training speed ($\circ$) tends to become faster while the interpolation performance ($\circ$) gets lower. Notably, however, there exists some range of $\alpha$ where we enjoy acceleration with negligible degradation in test PSNR. (b) Compared with frequency tuning ($\square$), weight scaling ($\circ$) achieves a better tradeoff. Further experimental details about the tradeoff curve are provided in \ref{app:pareto}.
  • Figure 3: Decoupling the effects of weight scaling on SNF. We decouple the effect of weight scaling on how it amplifies the early layer gradients, from the effects of having a higher-frequency initial functional. We observe that the initial functional itself plays the key role, resulting in a much greater acceleration.
  • Figure 4: Spectrum of initialized SNFs. 1D-FFT of an initialized 5-layer SNF, with various levels of weight scaling factors.
  • Figure 5: Eigenanalyses with eNTK. (a) Weight scaling improves the conditioning of the SNF optimization at all SGD steps, greatly reducing the condition number. (b) Weight-scaled SNF enjoys a better kernel-task alignment throughout the training; darker lines indicate later iterations.
  • ...and 18 more figures
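
The kind of measurement described in the Figure 4 caption (a 1D-FFT of an initialized 5-layer SNF at several scaling factors) can be sketched as follows. This is an illustration under assumptions, not the authors' setup: the hidden width of 64, the SIREN-style bounds with $\omega_0 = 30$, the zero biases, the cutoff bin, and the helper names `init_snf`, `forward`, and `high_freq_fraction` are all hypothetical.

```python
import numpy as np

def init_snf(sizes, alpha, omega0=30.0, seed=0):
    """SIREN-style init (assumed); non-final weights are multiplied by alpha."""
    rng = np.random.default_rng(seed)
    n_layers = len(sizes) - 1
    params = []
    for i in range(n_layers):
        fan_in, fan_out = sizes[i], sizes[i + 1]
        bound = omega0 / fan_in if i == 0 else np.sqrt(6.0 / fan_in)
        W = rng.uniform(-bound, bound, size=(fan_in, fan_out))
        if i < n_layers - 1:
            W = alpha * W
        params.append((W, np.zeros(fan_out)))
    return params

def forward(params, x):
    """Sine activations on all layers except the final linear one."""
    h = x
    for W, b in params[:-1]:
        h = np.sin(h @ W + b)
    W, b = params[-1]
    return (h @ W + b)[:, 0]

def high_freq_fraction(y, cutoff_bin):
    """Share of (mean-removed) spectral power at or above cutoff_bin."""
    power = np.abs(np.fft.rfft(y - y.mean())) ** 2
    return power[cutoff_bin:].sum() / power.sum()

# 5 weight layers: 1 -> 64 -> 64 -> 64 -> 64 -> 1 (width is an assumption).
sizes = [1, 64, 64, 64, 64, 1]
x = np.linspace(-1.0, 1.0, 1024, endpoint=False)[:, None]
for alpha in (1.0, 2.0, 4.0):
    y = forward(init_snf(sizes, alpha), x)
    print(f"alpha={alpha}: high-frequency power fraction = "
          f"{high_freq_fraction(y, cutoff_bin=64):.3f}")
```

Summarizing the spectrum by the power fraction above a cutoff is a convenience chosen here; plotting the full magnitude spectrum per $\alpha$ would be closer to what the caption describes.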

Theorems & Definitions (7)

  • Proposition 1: Informal; extension of sitzmann2020implicit
  • Lemma 2: Corollary of yuce2022structured
  • Lemma 3: Scaling of harmonics
  • Lemma 4: Arcsin to Gaussian distributions
  • Lemma 5: Gaussian to arcsin distributions, with numerical proof
  • Lemma 6: From yuce2022structured
  • Theorem 7: Theorem 1 in mehta2020extreme.