ASMR: Activation-sharing Multi-resolution Coordinate Networks For Efficient Inference
Jason Chun Lok Li, Steven Tin Sui Luo, Le Xu, Ngai Wong
TL;DR
The paper tackles the high inference cost of implicit neural representations by proposing Activation-Sharing Multi-Resolution (ASMR), which combines multi-resolution coordinate decomposition, hierarchical modulation, and activation-sharing inference to decouple the multiply-accumulate (MAC) count from network depth. By sharing activations across grids and injecting per-level biases, ASMR attains near O(1) MAC with respect to depth while preserving or improving reconstruction quality relative to vanilla SIREN. It demonstrates MAC reductions of up to 500× on high-resolution image fitting and extends to natural images, audio, video, and 3D data, while also enabling meta-learning and global latent structure encoding. ASMR does introduce a rasterized-data bias that can hinder smooth continuous signals such as signed distance functions (SDFs), but it remains a purely implicit framework well suited to deployment under strict hardware constraints.
Abstract
A coordinate network, or implicit neural representation (INR), is a fast-emerging method for encoding natural signals (such as images and videos) with the benefits of a compact neural representation. While numerous methods have been proposed to increase the encoding capabilities of an INR, an often overlooked aspect is inference efficiency, usually measured in multiply-accumulate (MAC) count. This is particularly critical in use cases where inference throughput is greatly limited by hardware constraints. To this end, we propose the Activation-Sharing Multi-Resolution (ASMR) coordinate network, which combines multi-resolution coordinate decomposition with hierarchical modulations. Specifically, an ASMR model enables the sharing of activations across grids of the data. This largely decouples its inference cost from its depth, which is directly correlated with its reconstruction capability, and yields a near O(1) inference complexity irrespective of the number of layers. Experiments show that ASMR can reduce the MAC of a vanilla SIREN model by up to 500× while achieving even higher reconstruction quality than its SIREN baseline.
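To make the activation-sharing idea concrete, here is a minimal 1D toy sketch in NumPy. It is not the authors' implementation. The layer ordering, the random weights, the learned-constant input `h0`, and every function name (`make_params`, `decompose`, `forward_shared`, `forward_pointwise`, `vanilla_macs`) are assumptions made for illustration. A sample index is split into mixed-radix digits, one per level. Each layer's linear map is evaluated once per coarse cell, and the result is reused by all children of that cell, with a per-level bias selected by the child's digit.

```python
import numpy as np

def make_params(bases, hidden, seed=0):
    """Random toy weights: one hidden layer and one bias table per level."""
    rng = np.random.default_rng(seed)
    return {
        "h0": rng.standard_normal(hidden),  # shared starting activation (assumption)
        "W": [rng.standard_normal((hidden, hidden)) / np.sqrt(hidden) for _ in bases],
        "bias": [rng.standard_normal((b, hidden)) for b in bases],
        "w_out": rng.standard_normal(hidden),
    }

def decompose(index, bases):
    """Mixed-radix digits of a sample index, coarsest level first."""
    digits = []
    for b in reversed(bases):
        digits.append(index % b)
        index //= b
    return digits[::-1]

def forward_shared(params, bases):
    """Evaluate all prod(bases) samples, sharing activations across grid cells.

    Returns (outputs, mac_count). Only matrix products are counted as MACs.
    """
    hidden = params["h0"].shape[0]
    h = params["h0"][None, :]            # one row per cell of the current grid
    macs = 0
    for W, bias in zip(params["W"], params["bias"]):
        z = h @ W                        # linear map at the coarse resolution only
        macs += h.shape[0] * hidden * hidden
        # Each coarse cell is reused by its b children; the per-level bias
        # (one row per digit value) is what distinguishes the children.
        z = z[:, None, :] + bias[None, :, :]
        h = np.sin(z).reshape(-1, hidden)  # sine activation, as in SIREN
    out = h @ params["w_out"]
    macs += h.shape[0] * hidden
    return out, macs

def forward_pointwise(params, bases, index):
    """Reference: evaluate a single sample with no sharing."""
    h = params["h0"]
    for W, bias, d in zip(params["W"], params["bias"], decompose(index, bases)):
        h = np.sin(h @ W + bias[d])
    return h @ params["w_out"]

def vanilla_macs(bases, hidden):
    """MACs if every hidden layer ran at full resolution for every sample."""
    n = int(np.prod(bases))
    return len(bases) * n * hidden * hidden + n * hidden

if __name__ == "__main__":
    bases, hidden = [4, 4, 4], 16
    params = make_params(bases, hidden)
    out, macs = forward_shared(params, bases)
    print(out.shape, macs, vanilla_macs(bases, hidden))
```

In this toy, the shared pass returns the same values as the per-sample reference, but each hidden layer is evaluated on a grid that grows level by level, so the total cost is dominated by the finest levels rather than by the number of layers times the full sample count. The savings reported in the paper depend on the actual architecture and data resolution, which this sketch does not reproduce.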
