Target noise: A pre-training based neural network initialization for efficient high resolution learning

Shaowen Wang, Tariq Alkhalifah

TL;DR

The paper addresses how initialization shapes optimization in deep networks and proposes a simple noise-based pretraining scheme that uses random noise as the target. By pretraining to fit random noise, the network weights become structured rather than purely random, yielding faster convergence and more stable early training. NTK analysis shows that this pretraining reshapes the Neural Tangent Kernel, flattening its spectrum and increasing locality to enable earlier high-frequency learning. Empirical results on coordinate-based representations and DIP-based tasks (denoising, super-resolution, inpainting) demonstrate faster convergence without extra data or architectural changes, suggesting a lightweight, general initialization strategy with broad potential impact.

Abstract

Weight initialization plays a crucial role in the optimization behavior and convergence efficiency of neural networks. Most existing initialization methods, such as Xavier and Kaiming initializations, rely on random sampling and do not exploit information from the optimization process itself. We propose a simple, yet effective, initialization strategy based on self-supervised pre-training using random noise as the target. Instead of directly training the network from random weights, we first pre-train it to fit random noise, which leads to a structured and non-random parameter configuration. We show that this noise-driven pre-training significantly improves convergence speed in subsequent tasks, without requiring additional data or changes to the network architecture. The proposed method is particularly effective for implicit neural representations (INRs) and Deep Image Prior (DIP)-style networks, which are known to exhibit a strong low-frequency bias during optimization. After noise-based pre-training, the network is able to capture high-frequency components much earlier in training, leading to faster and more stable convergence. Although random noise contains no semantic information, it serves as an effective self-supervised signal (considering its white spectrum nature) for shaping the initialization of neural networks. Overall, this work demonstrates that noise-based pre-training offers a lightweight and general alternative to traditional random initialization, enabling more efficient optimization of deep neural networks.
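
The two-stage procedure described in the abstract (first fit random noise, then train on the actual target starting from the resulting weights) can be sketched on a toy problem. This is a minimal illustration, not the authors' implementation: the 1-D sine-activated network, its width, the frequency scale `w0`, the learning rate, the step counts, and the target signal are all assumptions chosen for brevity, and plain full-batch gradient descent stands in for whatever optimizer the paper uses. The sketch is not expected to reproduce the reported speed-ups; it only shows where the noise pre-training step sits in the workflow.

```python
import math
import random


def init_params(width, rng, w0=10.0):
    """Standard random initialization of a 1 -> width -> 1 sine network (assumed toy model)."""
    return {
        "w1": [rng.uniform(-w0, w0) for _ in range(width)],
        "b1": [rng.uniform(-math.pi, math.pi) for _ in range(width)],
        "w2": [rng.uniform(-1.0, 1.0) / math.sqrt(width) for _ in range(width)],
        "b2": 0.0,
    }


def forward(p, x):
    """Return the network output at scalar x and the hidden pre-activations."""
    z = [w * x + b for w, b in zip(p["w1"], p["b1"])]
    y = sum(v * math.sin(zj) for v, zj in zip(p["w2"], z)) + p["b2"]
    return y, z


def mse(p, xs, ts):
    """Mean squared error of the network on the samples (xs, ts)."""
    return sum((forward(p, x)[0] - t) ** 2 for x, t in zip(xs, ts)) / len(xs)


def fit(p, xs, ts, steps, lr=0.01):
    """Full-batch gradient descent on the MSE; returns (new params, loss history)."""
    p = {k: (list(v) if isinstance(v, list) else v) for k, v in p.items()}  # do not mutate input
    width, n = len(p["w1"]), len(xs)
    history = [mse(p, xs, ts)]
    for _ in range(steps):
        g = {"w1": [0.0] * width, "b1": [0.0] * width, "w2": [0.0] * width, "b2": 0.0}
        for x, t in zip(xs, ts):
            y, z = forward(p, x)
            r = 2.0 * (y - t) / n  # d(loss)/d(output) for this sample
            g["b2"] += r
            for j in range(width):
                dz = r * p["w2"][j] * math.cos(z[j])  # gradient w.r.t. pre-activation j
                g["w2"][j] += r * math.sin(z[j])
                g["b1"][j] += dz
                g["w1"][j] += dz * x
        for k in ("w1", "b1", "w2"):
            p[k] = [v - lr * gv for v, gv in zip(p[k], g[k])]
        p["b2"] -= lr * g["b2"]
        history.append(mse(p, xs, ts))
    return p, history


rng = random.Random(0)
xs = [-1.0 + 2.0 * i / 31 for i in range(32)]

# Stage 1 target: white Gaussian noise, carrying no information about the later task.
noise_target = [rng.gauss(0.0, 1.0) for _ in xs]
# Stage 2 target: a hypothetical signal with a low- and a high-frequency component.
signal_target = [math.sin(3 * math.pi * x) + 0.5 * math.sin(9 * math.pi * x) for x in xs]

random_init = init_params(16, rng)
noise_init, pretrain_hist = fit(random_init, xs, noise_target, steps=200)

# Train on the real target from both starting points with identical settings.
_, hist_random = fit(random_init, xs, signal_target, steps=200)
_, hist_noise = fit(noise_init, xs, signal_target, steps=200)

print("pre-training loss on noise:", pretrain_hist[0], "->", pretrain_hist[-1])
print("final task loss, random init:", hist_random[-1])
print("final task loss, noise-pretrained init:", hist_noise[-1])
```

Both task runs share the same network, data, and hyperparameters, so the starting weights are the only difference between them, mirroring the comparison setup the summary describes.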

Paper Structure (11 sections, 9 equations, 12 figures)

Figures (12)

  • Figure 1: Comparison of the top NTK eigenvectors, and the corresponding spectrum, for a network initialized with standard random weights and a network initialized by self-supervised pretraining with noise as the target.
  • Figure 2: Comparison of the NTK eigenvalues for Siren with standard random weight initialization and with noise-driven self-supervised pretraining.
  • Figure 3: Comparison of the NTK matrix for Siren with standard random weights and with noise-driven self-supervised pretraining.
  • Figure 4: Ground truth image and the image representation results after 50 training iterations with standard random initialization and with noise-driven self-supervised pretraining, using the same network and training hyperparameters in both cases.
  • Figure 5: The loss and PSNR curves of the image representation task with standard random initialization and with noise-driven self-supervised pretraining.
  • ...and 7 more figures
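
As background for the NTK comparisons in Figures 1 to 3, the standard linearized view of training relates the NTK spectrum to convergence speed. The notation below ($f$, $\theta$, $K$, $\lambda_i$, $v_i$, $r$, $\eta$) is assumed for illustration and may differ from the paper's. The derivation also assumes a squared-error loss $\tfrac{1}{2}\|r\|^2$, gradient flow with rate $\eta$, and a kernel that stays approximately constant during training.

```latex
\begin{align}
K_{ij} &= \nabla_\theta f(x_i;\theta)^\top \nabla_\theta f(x_j;\theta),
  \qquad K = \sum_i \lambda_i \, v_i v_i^\top, \\
r(t) &= f(X;\theta_t) - y,
  \qquad \frac{dr}{dt} = -\eta \, K \, r(t), \\
v_i^\top r(t) &= e^{-\eta \lambda_i t} \, v_i^\top r(0).
\end{align}
```

Under these assumptions, error components along eigenvectors with small eigenvalues decay slowly. A flatter spectrum narrows the gap between the decay rates, which is consistent with the summary's claim that high-frequency content is learned earlier after noise-based pretraining.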