In Search of a Data Transformation That Accelerates Neural Field Training

Junwon Seo, Sangyoon Lee, Kwang In Kim, Jaeho Lee

TL;DR

This paper studies how data transformations affect the speed of neural field training, focusing on how permuting pixel locations affects the convergence speed of SGD. It finds that randomly permuting the pixel locations can considerably accelerate training.

Abstract

Neural fields are an emerging paradigm in data representation in which a neural network is trained to approximate a given signal. A key obstacle to their widespread adoption is the encoding speed: generating a neural field requires overfitting a neural network, which can take a significant number of SGD steps to reach the desired fidelity level. In this paper, we study how data transformations affect the speed of neural field training, focusing on how permuting pixel locations affects the convergence speed of SGD. Counterintuitively, we find that randomly permuting the pixel locations can considerably accelerate training. To explain this phenomenon, we examine neural field training through the lens of PSNR curves, loss landscapes, and error patterns. Our analyses suggest that random pixel permutations remove the easy-to-fit patterns, which facilitate optimization in the early stage but hinder capturing the fine details of the signal.
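To make the transformation concrete, here is a minimal NumPy sketch of a random pixel permutation and its inverse. It is an illustration under stated assumptions, not the authors' implementation: the function names (`make_permutation`, `apply_rpp`, `invert_rpp`), the seed-based generation of the permutation, and the toy image are all hypothetical. The only property it demonstrates is that the permutation is lossless, so the original image can be recovered exactly once the permutation (or the seed that generated it) is known.

```python
import numpy as np


def make_permutation(num_pixels, seed=0):
    """Return a random permutation of pixel indices and its inverse."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_pixels)
    inv = np.argsort(perm)  # perm[inv] == arange(num_pixels)
    return perm, inv


def apply_rpp(image, perm):
    """Shuffle pixel locations; channel values of each pixel stay together."""
    h, w = image.shape[:2]
    flat = image.reshape(h * w, -1)
    return flat[perm].reshape(image.shape)


def invert_rpp(permuted, inv):
    """Undo apply_rpp using the inverse permutation."""
    h, w = permuted.shape[:2]
    flat = permuted.reshape(h * w, -1)
    return flat[inv].reshape(permuted.shape)


# Toy 4x6 RGB "image" standing in for a real target datum (assumption).
image = np.arange(4 * 6 * 3, dtype=np.float32).reshape(4, 6, 3)
perm, inv = make_permutation(4 * 6, seed=0)
permuted = apply_rpp(image, perm)   # what the neural field would be trained on
restored = invert_rpp(permuted, inv)  # applied to the neural field's output
```

In an actual pipeline, `invert_rpp` would be applied to the image generated by the trained neural field rather than to `permuted` itself, so the reconstruction quality would depend on how well the network fits the permuted data.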

Paper Structure

This paper contains 23 sections, 6 equations, 22 figures, and 6 tables.

Figures (22)

  • Figure 1: Overall pipeline. We consider a three-step procedure to train neural fields. (1) Apply a data transformation to the target datum. (2) Train a neural field to fit the transformed data. (3) Reconstruct the original data by generating the transformed datum from the neural field, and then applying the inverse of the data transformation. By selecting the data transformation carefully, we can significantly reduce the computational cost required to train a neural field that achieves the desired quality of approximation.
  • Figure 2: Data transformations considered. In each subfigure, we visualize a data transformation by showing what the transformed image (left) and its intensity histogram (right) look like on a Kodak image.
  • Figure 3: Frequency spectra of original vs. $\mathtt{RPP}$. We compare the average DCT coefficients of the original and $\mathtt{RPP}$ Kodak images. The upper-left region corresponds to low frequencies, and the lower-right region corresponds to high frequencies.
  • Figure 4: PSNR curves for a single Kodak image. The original image (orange) excels during the early stage of training but the $\mathtt{RPP}$ image quickly reaches PSNR 50 in the middle of the training.
  • Figure 5: SIREN loss landscape: from initial point to 30dB point. For the original and $\mathtt{RPP}$ versions of a Kodak image, we plot the loss landscape between the initial point and the parameter that achieves PSNR 30. For $\mathtt{RPP}$, the minimum is much narrower and lacks a clear pathway toward it, unlike for the original image.
  • ...and 17 more figures
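The frequency comparison described in the Figure 3 caption can be approximated with a small NumPy sketch. Everything here is an assumption made for illustration: the synthetic smooth gradient stands in for a Kodak image, the helper names (`dct_matrix`, `dct2`, `low_frequency_fraction`) are hypothetical, and the cutoff of 8 is arbitrary. The sketch builds an orthonormal DCT-II by hand, then measures how much of the non-constant signal energy falls in the low-frequency (upper-left) block before and after a random pixel permutation.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]  # frequency index
    i = np.arange(n)[None, :]  # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] /= np.sqrt(2.0)    # rescale the constant row for orthonormality
    return c


def dct2(image):
    """2D DCT-II of a grayscale image (rows, then columns)."""
    h, w = image.shape
    return dct_matrix(h) @ image @ dct_matrix(w).T


def low_frequency_fraction(image, cutoff):
    """Share of mean-removed energy in the upper-left cutoff x cutoff block."""
    centered = image - image.mean()
    energy = dct2(centered) ** 2
    return energy[:cutoff, :cutoff].sum() / energy.sum()


# Smooth synthetic gradient as a stand-in for a natural image (assumption).
n = 32
ramp = np.linspace(0.0, 1.0, n)
original = (ramp[:, None] + ramp[None, :]) / 2.0

# Randomly permute pixel locations.
rng = np.random.default_rng(0)
permuted = original.reshape(-1)[rng.permutation(n * n)].reshape(n, n)

frac_original = low_frequency_fraction(original, cutoff=8)
frac_permuted = low_frequency_fraction(permuted, cutoff=8)
print(f"low-frequency energy share: original={frac_original:.3f}, "
      f"permuted={frac_permuted:.3f}")
```

For this toy gradient, nearly all of the energy sits in the low-frequency block, whereas the permuted version spreads its energy roughly evenly over all frequencies. How closely this matches the spectra of real Kodak images is not something the sketch can establish.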