Latent Modulated Function for Computational Optimal Continuous Image Representation

Zongyao He, Zhi Jin

TL;DR

This work tackles the high computational cost of decoding in High-Resolution High-Dimensional space inherent to INR-based Arbitrary-Scale Super-Resolution. It introduces Latent Modulated Function (LMF), which decouples decoding into a shared LR-HD latent path and a lightweight HR-LD render path, using a latent MLP to generate modulation that conditionally guides a small render MLP. A Controllable Multi-Scale Rendering (CMSR) strategy ties modulation strength to input complexity, enabling test-time trade-offs between precision and speed. Across multiple encoders and benchmarks, LMF achieves substantial reductions in MACs, runtime, and parameter count while maintaining competitive PSNR, demonstrating practical, scalable continuous image representation. The approach is compatible with existing INR-based ASSR methods and offers a path toward efficient, arbitrary-resolution rendering in real-world applications.

Abstract

The recent work Local Implicit Image Function (LIIF) and subsequent Implicit Neural Representation (INR) based works have achieved remarkable success in Arbitrary-Scale Super-Resolution (ASSR) by using an MLP to decode Low-Resolution (LR) features. However, these continuous image representations typically perform decoding in High-Resolution (HR) High-Dimensional (HD) space, leading to a quadratic increase in computational cost and seriously hindering practical applications of ASSR. To tackle this problem, we propose a novel Latent Modulated Function (LMF), which decouples the HR-HD decoding process into shared latent decoding in LR-HD space and independent rendering in HR Low-Dimensional (LD) space, thereby realizing the first computationally optimal paradigm of continuous image representation. Specifically, LMF utilizes an HD MLP in latent space to generate latent modulations for each LR feature vector. This enables a modulated LD MLP in render space to quickly adapt to any input feature vector and perform rendering at arbitrary resolution. Furthermore, we leverage the positive correlation between modulation intensity and input image complexity to design a Controllable Multi-Scale Rendering (CMSR) algorithm, offering the flexibility to adjust the decoding efficiency based on the rendering precision. Extensive experiments demonstrate that converting existing INR-based ASSR methods to LMF can reduce the computational cost by up to 99.9%, accelerate inference by up to 57 times, and save up to 76% of parameters, while maintaining competitive performance. The code is available at https://github.com/HeZongyao/LMF.
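
The decoupling described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation (see the linked repository for that). The class name `LatentModulatedSketch`, all layer widths, the random weights, and the FiLM-style form `(1 + scale) * h + shift` are assumptions made for illustration; the source explicitly mentions only shift modulation. The sketch also assumes the render MLP sees just a relative coordinate, which may be simpler than the actual input.

```python
import random


def linear(x, weight, bias):
    """Dense layer; `weight` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_linear(rng, n_in, n_out):
    """Random weights and zero biases (untrained, for illustration only)."""
    weight = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weight, [0.0] * n_out


class LatentModulatedSketch:
    """Hypothetical toy model: HD latent MLP modulating an LD render MLP."""

    def __init__(self, code_dim=16, latent_dim=32, render_dim=4, render_layers=2, seed=0):
        rng = random.Random(seed)
        self.render_dim = render_dim
        self.render_layers = render_layers
        # HD latent MLP: latent code -> (scale, shift) for every render layer.
        self.latent_hidden = init_linear(rng, code_dim, latent_dim)
        self.latent_out = init_linear(rng, latent_dim, 2 * render_dim * render_layers)
        # LD render MLP: relative (dx, dy) coordinate -> RGB.
        self.render_in = init_linear(rng, 2, render_dim)
        self.render_hidden = [init_linear(rng, render_dim, render_dim)
                              for _ in range(render_layers)]
        self.render_out = init_linear(rng, render_dim, 3)

    def modulations(self, code):
        """Latent decoding: runs once per LR feature vector (LR-HD space)."""
        flat = linear(relu(linear(code, *self.latent_hidden)), *self.latent_out)
        d = self.render_dim
        mods = []
        for i in range(self.render_layers):
            start = 2 * d * i
            mods.append((flat[start:start + d], flat[start + d:start + 2 * d]))
        return mods

    def render(self, mods, coord):
        """Rendering: runs once per HR query point (HR-LD space)."""
        h = relu(linear(coord, *self.render_in))
        for (weight, bias), (scale, shift) in zip(self.render_hidden, mods):
            h = linear(h, weight, bias)
            # Assumed FiLM-style modulation of the hidden features.
            h = relu([(1.0 + s) * v + t for v, s, t in zip(h, scale, shift)])
        return linear(h, *self.render_out)


model = LatentModulatedSketch()
code = [0.1] * 16                      # stand-in for one encoder feature vector
mods = model.modulations(code)         # a single latent pass for this LR pixel
scale = 4                              # render a 4x4 patch of HR pixels from it
pixels = [model.render(mods, [(i + 0.5) / scale - 0.5, (j + 0.5) / scale - 0.5])
          for i in range(scale) for j in range(scale)]
```

The point of the structure is that `modulations` is called once per LR feature vector, while only the much smaller `render` pass is repeated for each of the `scale * scale` HR queries.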

Paper Structure (14 sections, 5 equations, 5 figures, 7 tables)

This paper contains 14 sections, 5 equations, 5 figures, 7 tables.

Figures (5)

  • Figure 1: Efficiency comparisons ($320\times180$ input) for ASSR. LMF-based ASSR methods significantly reduce computational cost (MACs), runtime (circle sizes), and parameters (colors).
  • Figure 2: The framework of our LMF-based continuous image representation. Given a latent code generated by the encoder, an HD latent MLP generates the latent modulation to adjust the features of an LD render MLP, thereby achieving efficient arbitrary-resolution rendering.
  • Figure 3: Visualization ($\times4$ SR) of the normalized mean of shift modulation (left), and the normalized residual between the LM-LIIF predicted image and the bilinear upsampled image (right).
  • Figure 4: The positive correlation between the means of shift modulation and the minimum rendering scale factors in LM-LIIF.
  • Figure 5: Qualitative comparison for ASSR. All of the ASSR methods use SwinIR as the encoder.
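
To make the MACs comparison in Figure 1 more concrete, the following back-of-the-envelope sketch counts multiply-accumulate operations for HR-HD decoding versus a latent/render split at the $320\times180$ input size. The layer widths in `DECODER`, `LATENT`, and `RENDER` are hypothetical values chosen for illustration, not the configurations used in the paper, so the resulting ratio does not reproduce the reported savings.

```python
def mlp_macs(widths):
    """MACs for one forward pass through dense layers with the given widths."""
    return sum(a * b for a, b in zip(widths, widths[1:]))


def hr_hd_macs(lr_h, lr_w, scale, decoder_widths):
    """HR-HD decoding: every HR pixel queries the full high-dimensional MLP."""
    return (lr_h * scale) * (lr_w * scale) * mlp_macs(decoder_widths)


def lmf_macs(lr_h, lr_w, scale, latent_widths, render_widths):
    """Split decoding: latent MLP per LR pixel, render MLP per HR pixel."""
    latent = lr_h * lr_w * mlp_macs(latent_widths)
    render = (lr_h * scale) * (lr_w * scale) * mlp_macs(render_widths)
    return latent + render


LR_H, LR_W = 180, 320                  # input size quoted in Figure 1
DECODER = [256, 256, 256, 256, 3]      # hypothetical HD decoder widths
LATENT = [64, 256, 256, 64]            # hypothetical HD latent MLP widths
RENDER = [16, 16, 16, 3]               # hypothetical LD render MLP widths

for scale in (2, 4, 8):
    baseline = hr_hd_macs(LR_H, LR_W, scale, DECODER)
    split = lmf_macs(LR_H, LR_W, scale, LATENT, RENDER)
    print(f"x{scale}: HR-HD {baseline:.3e} MACs, split {split:.3e} MACs, "
          f"ratio {baseline / split:.1f}")
```

In this model the baseline cost grows with the square of the scale factor, whereas in the split version only the small render term does; the latent term stays fixed by the LR resolution.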