Towards Croppable Implicit Neural Representations

Maor Ashkenazi, Eran Treister

TL;DR

This paper presents Local-Global SIRENs -- a novel INR architecture that supports cropping by design and examines how the Local-Global approach can accelerate training, enhance encoding of various signals, improve downstream performance, and be applied to modern INRs such as INCODE, highlighting its potential and flexibility.

Abstract

Implicit Neural Representations (INRs) have piqued interest in recent years due to their ability to encode natural signals using neural networks. While INRs allow for useful applications such as interpolating new coordinates and signal compression, their black-box nature makes it difficult to modify them post-training. In this paper we explore the idea of editable INRs, and specifically focus on the widely used cropping operation. To this end, we present Local-Global SIRENs -- a novel INR architecture that supports cropping by design. Local-Global SIRENs are based on combining local and global feature extraction for signal encoding. What makes their design unique is the ability to effortlessly remove specific portions of an encoded signal, with a proportional decrease in the number of weights. This is achieved by eliminating the corresponding weights from the network, without the need for retraining. We further show how this architecture can be used to support the straightforward extension of previously encoded signals. Beyond signal editing, we examine how the Local-Global approach can accelerate training, enhance encoding of various signals, improve downstream performance, and be applied to modern INRs such as INCODE, highlighting its potential and flexibility. Code is available at https://github.com/maorash/Local-Global-INRs.
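To make the cropping idea in the abstract concrete, here is a minimal NumPy sketch. It is not the authors' implementation (see the linked repository for that). It assumes a toy network with one sine layer per partition (local), one shared sine layer (global), and a shared linear output; biases, the SIREN frequency scaling, and network depth are omitted. The names `partition_index`, `ToyLocalGlobalNet`, `crop`, and `num_params` are hypothetical and chosen for illustration only.

```python
import numpy as np


def partition_index(coords, factors):
    """Map coordinates in [0, 1)^d to a flat, row-major partition index."""
    coords = np.asarray(coords, dtype=float)
    factors = np.asarray(factors, dtype=int)
    # Cell index along each axis, clamped so a coordinate of 1.0 stays in range.
    cell = np.minimum((coords * factors).astype(int), factors - 1)
    flat = np.zeros(len(coords), dtype=int)
    for axis, size in enumerate(factors):
        flat = flat * size + cell[:, axis]
    return flat


class ToyLocalGlobalNet:
    """Toy model: per-partition local weights plus shared global weights."""

    def __init__(self, factors, in_dim=2, hidden=4, seed=0):
        rng = np.random.default_rng(seed)
        self.factors = tuple(factors)
        n_parts = int(np.prod(self.factors))
        self.global_w = rng.normal(size=(in_dim, hidden))
        # One independent weight matrix per partition (assumed layout).
        self.local_w = {p: rng.normal(size=(in_dim, hidden)) for p in range(n_parts)}
        self.out_w = rng.normal(size=(2 * hidden, 1))

    def num_params(self):
        local = sum(w.size for w in self.local_w.values())
        return local + self.global_w.size + self.out_w.size

    def crop(self, keep):
        """Drop the local weights of every partition not in `keep`; no retraining."""
        keep = set(keep)
        self.local_w = {p: w for p, w in self.local_w.items() if p in keep}

    def __call__(self, coords):
        coords = np.asarray(coords, dtype=float)
        parts = partition_index(coords, self.factors)
        out = np.full((len(coords), 1), np.nan)  # NaN marks cropped regions
        global_feat = np.sin(coords @ self.global_w)
        for p, w in self.local_w.items():
            mask = parts == p
            if mask.any():
                local_feat = np.sin(coords[mask] @ w)
                # Each coordinate's local features merge only with its own global features.
                merged = np.concatenate([local_feat, global_feat[mask]], axis=1)
                out[mask] = merged @ self.out_w
        return out


net = ToyLocalGlobalNet(factors=(3, 2))      # 6 partitions, as in Figure 2
points = np.array([[0.1, 0.1], [0.9, 0.9]])  # fall in partitions 0 and 5
before = net(points)
params_before = net.num_params()             # 64 in this toy configuration
net.crop(keep={0, 1, 2})                     # remove half of the partitions
after = net(points)
params_after = net.num_params()              # 40: local weights halved
```

In this toy setup, the output for a coordinate in a kept partition is identical before and after cropping, while only the local share of the parameters shrinks with the number of removed partitions; the shared global and output weights stay in place.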

Paper Structure

This paper contains 35 sections, 3 equations, 13 figures, 15 tables, and 1 algorithm.

Figures (13)

  • Figure 1: Examples of cropping a Local-Global SIREN with 199k parameters. Plot on the right shows the number of parameters as a function of cropped partitions in the encoded image.
  • Figure 2: Example of partitioning an image. This trivial example uses $C_0=3, C_1=2$. To achieve flexible cropping, one must choose larger partition factors.
  • Figure 3: Illustration of the Local-Global SIREN architecture and inference flow for two image coordinates $p_0, p_1$. The coordinates are passed through (a) the global sub-network and (b) the local sub-network corresponding to their partition. Note that the coordinates are distinct elements in a batch, meaning that a coordinate's local features are only merged with the same coordinate's global features.
  • Figure 4: Encoded images throughout training iterations. PSNR values are at the top left of each image. Method names are on the left. Notice the artifacts in the SIREN-per-Partition method and the reduced noise in our approach compared to SIREN. For extended qualitative results of various signals and cropping operations, refer to https://sites.google.com/view/local-global-inrs.
  • Figure 5: Encoded Bach audio clips. Mean PSNR values using 10 random seeds are on the top left of each figure.
  • ...and 8 more figures