Exploring Kernel Transformations for Implicit Neural Representations

Sheng Zheng, Chaoning Zhang, Dongshen Han, Fachrina Dewi Puspitasari, Xinhong Hao, Yang Yang, Heng Tao Shen

TL;DR

The paper investigates kernel transformations applied to the input and output of implicit neural representations (INRs) as an alternative to modifying internal model components. It finds that nonlinear kernels degrade performance, while linear transformations—specifically input scaling and adaptive output shifting—consistently improve accuracy, motivating the SS-INR framework. Across image fitting, CT reconstruction, and audio representation, SS-INR yields gains for multiple INR backbones, indicating the approach's robustness and practical value. The work also offers depth- and normalization-based interpretations to explain why these simple I/O transformations help, and it opens avenues for future exploration of dynamic kernel transformations in INRs.

Abstract

Implicit neural representations (INRs), which use neural networks to represent signals by mapping coordinates to their corresponding attributes, have garnered significant attention. They are widely used for image representation, with pixel coordinates as input and pixel values as output. In contrast to prior works that investigate the effect of the model's internal components (the activation function, for instance), this work pioneers the exploration of the effect of kernel transformations applied to the input/output while keeping the model itself unchanged. A byproduct of our findings is a simple yet effective method that combines scale and shift to significantly boost INR performance with negligible computational overhead. Moreover, we present two perspectives, depth and normalization, to interpret the performance benefits of the scale and shift transformations. Overall, our work provides a new avenue for future work to understand and improve INRs through the lens of kernel transformation.
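
To make the idea of transforming only the input and output concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the names `wrap_with_scale_shift`, `mean_shift`, and `toy_model` are hypothetical, the scale factor is an arbitrary illustrative value, and using the mean of the target values as the shift is an assumption about what an adaptive output shift might look like. The wrapped function computes g(x) = f(scale * x) + shift around an unchanged model f.

```python
from statistics import fmean
from typing import Callable, Sequence

Coord = Sequence[float]
Model = Callable[[Coord], float]


def wrap_with_scale_shift(model: Model, scale: float, shift: float) -> Model:
    """Return g(x) = model(scale * x) + shift without modifying `model`."""
    def wrapped(coord: Coord) -> float:
        scaled = [scale * c for c in coord]  # linear transformation of the input
        return model(scaled) + shift         # linear transformation of the output
    return wrapped


def mean_shift(targets: Sequence[float]) -> float:
    """Assumed shift choice: the mean of the target signal values."""
    return fmean(targets)


def toy_model(coord: Coord) -> float:
    """Hypothetical stand-in for an INR backbone: sums its input coordinates."""
    return sum(coord)


targets = [0.2, 0.4, 0.6]  # illustrative pixel values, not data from the paper
ss_model = wrap_with_scale_shift(toy_model, scale=2.0, shift=mean_shift(targets))
value = ss_model([0.5, 0.25])  # toy_model([1.0, 0.5]) plus the mean of targets
```

In a real setting the backbone would be trained through this wrapper, so that it only has to fit the target minus the shift on the rescaled coordinates; the wrapper itself adds one multiplication and one addition per query.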

Paper Structure

This paper contains 16 sections, 7 equations, 7 figures, 11 tables.

Figures (7)

  • Figure 1: Overview of kernel transformation in INR for image representation. Prior works primarily focus on the effect of the model's internal components, while our work shifts the attention to the model's Input/Output by applying kernel transformations.
  • Figure 2: Count of optimal scale factors for 24 studied images.
  • Figure 3: Effect of scale factors on the INR performance when a scale transformation is applied to the model's input.
  • Figure 4: The relationship between the distance and corresponding PSNR/SSIM for shift transformation on the model's output. This distance is defined as the difference between the applied shift value and the average value of all target images in the dataset.
  • Figure 5: Qualitative comparison of vanilla INR backbones with and without SS modules on image fitting. The clothing on the sculpture within the red rectangle provides a more apparent comparison of the clarity of the visual representation of each method. Compared to the vanilla INR backbones, those with SS modules are better at capturing fine details.
  • ...and 2 more figures