Grids Often Outperform Implicit Neural Representation at Compressing Dense Signals

Namhoon Kim, Sara Fridovich-Keil

TL;DR

This work systematically benchmarks grid-based interpolation, implicit neural representations (INRs), and hybrid approaches for compressing dense 2D and 3D signals across synthetic and real data. By stratifying results by bandwidth and model size, the study shows that simple interpolated grids train faster and reconstruct at higher quality for most tasks, challenging the notion that INRs are universally superior. INRs and hybrids occasionally outperform grids when the signal exhibits lower-dimensional structure such as sharp edges or constant regions, pointing to targeted use cases for current INR designs. Overall, the findings advocate a practical strategy in which grids serve as the default baseline for dense signals while INRs are reserved for specific scenarios with underlying geometric simplicity, with implications for methodology development and deployment in computational imaging and sensing.

Abstract

Implicit Neural Representations (INRs) have recently shown impressive results, but their fundamental capacity, implicit biases, and scaling behavior remain poorly understood. We investigate the performance of diverse INRs across a suite of 2D and 3D real and synthetic signals with varying effective bandwidth, as well as both overfitting and generalization tasks including tomography, super-resolution, and denoising. By stratifying performance according to model size as well as signal type and bandwidth, our results shed light on how different INR and grid representations allocate their capacity. We find that, for most tasks and signals, a simple regularized grid with interpolation trains faster and to higher quality than any INR with the same number of parameters. We also find limited settings--namely fitting binary signals such as shape contours--where INRs outperform grids, to guide future development and use of INRs towards the most advantageous applications.
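
To make "a simple regularized grid with interpolation" concrete, here is a minimal sketch in plain Python. It assumes bilinear interpolation on a 2D grid of learnable values and an anisotropic total variation (TV) penalty as the regularizer. The function names `bilinear_query` and `total_variation` are hypothetical, and the paper's actual grid parameterization and regularizer may differ.

```python
def bilinear_query(grid, x, y):
    """Interpolate a 2D grid (list of rows, at least 2x2) at normalized
    coordinates x, y in [0, 1]. The grid values are the model parameters."""
    h, w = len(grid), len(grid[0])
    # Map normalized coordinates to continuous grid indices (clamped).
    gx = min(max(x, 0.0), 1.0) * (w - 1)
    gy = min(max(y, 0.0), 1.0) * (h - 1)
    # Index of the top-left corner of the enclosing cell.
    x0 = min(int(gx), w - 2)
    y0 = min(int(gy), h - 2)
    tx, ty = gx - x0, gy - y0
    # Blend along x on the two rows, then along y.
    top = (1 - tx) * grid[y0][x0] + tx * grid[y0][x0 + 1]
    bottom = (1 - tx) * grid[y0 + 1][x0] + tx * grid[y0 + 1][x0 + 1]
    return (1 - ty) * top + ty * bottom


def total_variation(grid):
    """Anisotropic TV: sum of absolute differences between horizontally
    and vertically adjacent grid values (an assumed choice of regularizer)."""
    h, w = len(grid), len(grid[0])
    tv = 0.0
    for i in range(h):
        for j in range(w):
            if j + 1 < w:
                tv += abs(grid[i][j + 1] - grid[i][j])
            if i + 1 < h:
                tv += abs(grid[i + 1][j] - grid[i][j])
    return tv


grid = [[0.0, 1.0],
        [2.0, 3.0]]
center = bilinear_query(grid, 0.5, 0.5)  # average of the four corners
penalty = total_variation(grid)
```

In a fitting loop, the grid values would be optimized to minimize a reconstruction loss plus a weighted `total_variation` term; the parameter count is simply the number of grid entries, which is what makes comparisons at a fixed parameter budget straightforward.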

Paper Structure

This paper contains 44 sections, 10 equations, 26 figures, 25 tables.

Figures (26)

  • Figure 1: Synthetic Signals. Rows represent synthetic signal types, and columns represent effective bandlimits from 0.1 to 0.9. Signal detail and complexity increase with effective bandlimit.
  • Figure 2: Qualitative overfitting results. Visualizations of each model on each overfitting task with $1\times10^4$ parameters, roughly $1\%$ of the pixels/voxels in the original 2D and 3D signals. For 3D signals, a slice is visualized. For synthetic signals with a bandwidth parameter, bandwidth 0.5 is shown. GSplat is restricted to 2D signals. Full visualizations varying model size and signal bandwidth are provided in the signal overfitting appendix. Different parameterizations induce different characteristic qualitative compression artifacts.
  • Figure 3: Overfitting Synthetic Signals: INR and INR - Grid Heatmaps. Red indicates regimes where other models outperform the Grid baseline; blue indicates regimes dominated by the Grid baseline. Each first row shows absolute PSNR values, while each second row shows the PSNR gap relative to the Grid baseline (i.e., PSNR - Grid PSNR). See the overfitting results section for in-depth discussion.
  • Figure 4: Overfitting Capacity and Computational Efficiency. (a) Overfitting capacity of different models evaluated on the 2D DIV2K and 3D Stanford Dragon datasets. PSNR trends indicate how well each model fits the training data as a function of model size (top) as well as relative performance compared to the Grid baseline (bottom). (b) Computational efficiency analysis, showing inference and training times for each model. While WIRE achieves superior overfitting capacity in some cases, it requires significantly more computation time, approximately $10\times$ that of the next slowest model.
  • Figure 5: Qualitative Generalization and Inverse Problems Results. In CT reconstruction, Grid with TV regularization achieves the best results. Experiments on the DIV2K dataset are zoomed in to highlight image details. For image denoising and super-resolution, WIRE produces sharp images but introduces texture artifacts. GA-Planes outperforms other methods in volume and surface super-resolution. Artifacts in each model’s output throughout the inverse problem tasks reveal their inherent structural biases (e.g., sinusoidal artifacts in FFNs and SIREN, line artifacts in GA-Planes).
  • ...and 21 more figures
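
The PSNR gap described in the Figure 3 caption (PSNR - Grid PSNR) can be sketched as follows. This is a minimal illustration, assuming signals flattened to lists of floats with a peak value of 1.0. The helper names `psnr` and `psnr_gap` are hypothetical and not taken from the paper's code.

```python
import math


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length sequences."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(peak ** 2 / mse)


def psnr_gap(reference, model_estimate, grid_estimate, peak=1.0):
    """PSNR of a model minus PSNR of the Grid baseline.
    Positive values mean the model outperforms the grid."""
    return psnr(reference, model_estimate, peak) - psnr(reference, grid_estimate, peak)


# Toy data (made up for illustration only).
reference = [0.0, 0.0, 0.0, 0.0]
grid_estimate = [0.1, 0.1, 0.1, 0.1]       # MSE = 1e-2
model_estimate = [0.01, 0.01, 0.01, 0.01]  # MSE = 1e-4
gap = psnr_gap(reference, model_estimate, grid_estimate)
```

Under this convention, a positive gap corresponds to the red regimes in the heatmaps (another model beats the Grid baseline) and a negative gap to the blue regimes.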