Adaptive Resolution Residual Networks -- Generalizing Across Resolutions Easily and Efficiently

Léa Demeule, Mahtab Sandhu, Glen Berseth

TL;DR

This work addresses the challenge of processing real-world signals captured at differing resolutions by introducing Adaptive Resolution Residual Networks (ARRNs), which combine the ease of use of fixed-resolution models with the robustness and efficiency of adaptive-resolution methods. ARRNs use Laplacian residuals to embed standard fixed-resolution blocks within a hierarchical, resolution-aware framework, enabling inference-time adaptation by omitting higher-resolution computations while preserving accuracy on lower-resolution signals. A complementary Laplacian dropout regularizes across resolutions and mitigates errors from approximate smoothing kernels. The approach is grounded in a theoretical analysis based on neural operators, which includes a proven exact adaptation guarantee under perfect smoothing. Empirically, ARRNs demonstrate improved robustness to low resolutions, significant computational savings, and successful generalization across diverse layer types, suggesting a practical path to efficient, multi-resolution deep learning across image, audio, and volumetric data domains.

Abstract

The majority of signal data captured in the real world uses numerous sensors with different resolutions. In practice, however, most deep learning architectures are fixed-resolution; they consider a single resolution at training time and inference time. This is convenient to implement but fails to fully take advantage of the diverse signal data that exists. In contrast, other deep learning architectures are adaptive-resolution; they directly allow various resolutions to be processed at training time and inference time. This benefits robustness and computational efficiency but introduces difficult design constraints that hinder mainstream use. In this work, we address the shortcomings of both fixed-resolution and adaptive-resolution methods by introducing Adaptive Resolution Residual Networks (ARRNs), which inherit the advantages of adaptive-resolution methods and the ease of use of fixed-resolution methods. We construct ARRNs from Laplacian residuals, which serve as generic adaptive-resolution adapters for fixed-resolution layers, and which allow casting high-resolution ARRNs into low-resolution ARRNs at inference time by simply omitting high-resolution Laplacian residuals, thus reducing computational cost on low-resolution signals without compromising performance. We complement this novel component with Laplacian dropout, which regularizes for robustness to a distribution of lower resolutions, and which also regularizes for errors that may be induced by approximate smoothing kernels in Laplacian residuals. We provide a solid grounding for the advantageous properties of ARRNs through a theoretical analysis based on neural operators, and empirically show that ARRNs embrace the challenge posed by diverse resolutions with greater flexibility, robustness, and computational efficiency.

Paper Structure

This paper contains 39 sections, 15 equations, 12 figures.

Figures (12)

  • Figure 1: Visualization of a Laplacian pyramid showing blocks that correspond to recursion iterations. The starting signal $p^\text{low}_{0}$ is shown at the top left. The blocks are chained one after the other, from top-to-bottom, and produce the difference signals $p^\text{diff}_{1}$, $p^\text{diff}_{2}$, $p^\text{diff}_{3}$. The blocks each start with the signal $p^\text{low}_{n - 1}$, produce a lower bandwidth signal $p^\text{low}_{n}$ moving right, and finally produce a difference signal $p^\text{diff}_{n}$ moving right again. Together, $p^\text{diff}_{1}$, $p^\text{diff}_{2}$, $p^\text{diff}_{3}$ and $p^\text{low}_{3}$ sum to the original signal $p^\text{low}_{0}$; the Laplacian pyramid produces a form of linear decomposition.
  • Figure 2: Visualization of an ARRN showing blocks that correspond to Laplacian residuals, displaying a principal component analysis of feature maps. The starting signal $r_{0}$ is shown at the top left. The blocks are chained one after the other, from top-to-bottom. The blocks each start with the previous residual signal $r_{n - 1}$, and, moving left-to-right, produce a lower bandwidth signal $r^\text{low}_n$ and difference signal $r^\text{diff}_n$ as in Laplacian pyramids, gate the difference signal against the chained Laplacian dropout variable $d^\text{chain}_n$, apply the neural architectural block $b_{n}$, apply the zero-blocking filter $\psi$, sum the lower bandwidth signal $r^\text{low}_n$, apply the smoothing kernel $\phi_{n + 1}$ and resample, then finally project through $\mathbf{A}_n$ to produce the residual signal $r_{n}$.
  • Figure 3: Accuracy of all methods at various resolutions, where each model is trained at the full dataset resolution and tested at a range of lower resolutions. Our method (red full line) displays the best accuracy at the highest resolution and robustly maintains its accuracy at lower resolutions.
  • Figure 4: Inference time of all methods at various resolutions, where the inference time for the entire dataset is considered. Our method (red full line) can adapt to lower resolutions by skipping Laplacian residuals, which results in significant computational savings (highlighted by the shaded area) compared to using all Laplacian residuals (red dashed line). Our method also displays a reasonable inference time relative to typical convolutional neural networks despite not having a highly optimized implementation.
  • Figure 5: Accuracy of two groups of methods at various resolutions, where 7 adaptive-resolution architectures (in red) are constructed by taking 7 fixed-resolution architectures (in green) and wrapping their layers in Laplacian residuals. Our method yields architectures that have much stronger low-resolution performance, and similar or better high-resolution performance, which demonstrates ease of use and wide compatibility with mainstream layers.
  • ...and 7 more figures
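
The Laplacian pyramid decomposition described in the caption of Figure 1 can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a 1D periodic signal, uses an ideal Fourier-domain low-pass filter as a stand-in for a perfect smoothing kernel, and keeps every level on the same sampling grid (the resampling step is omitted). The function names `ideal_lowpass` and `laplacian_pyramid` and the cutoff values are hypothetical choices made for illustration.

```python
import numpy as np


def ideal_lowpass(signal, cutoff):
    """Assumed 'perfect smoothing': zero all Fourier coefficients above `cutoff`."""
    spectrum = np.fft.rfft(signal)
    spectrum[cutoff + 1:] = 0.0
    return np.fft.irfft(spectrum, n=signal.size)


def laplacian_pyramid(p_low_0, cutoffs):
    """Return ([p_diff_1, ..., p_diff_N], p_low_N) for a list of decreasing cutoffs."""
    diffs = []
    p_low = p_low_0
    for cutoff in cutoffs:
        p_next = ideal_lowpass(p_low, cutoff)  # lower bandwidth signal p_low_n
        diffs.append(p_low - p_next)           # difference signal p_diff_n
        p_low = p_next
    return diffs, p_low


rng = np.random.default_rng(0)
p_low_0 = rng.standard_normal(64)  # synthetic full-bandwidth signal
cutoffs = [16, 8, 4]               # hypothetical bandwidths for three levels

# The difference signals and the final low-bandwidth signal sum to the input.
diffs, p_low_3 = laplacian_pyramid(p_low_0, cutoffs)
reconstruction = sum(diffs) + p_low_3
print("sums to original:", np.allclose(reconstruction, p_low_0))

# A signal already band-limited to cutoff 8 has (numerically) zero difference
# signals at the two highest-bandwidth levels.
band_limited = ideal_lowpass(p_low_0, 8)
diffs_bl, _ = laplacian_pyramid(band_limited, cutoffs)
print("level 1 diff is zero:", np.allclose(diffs_bl[0], 0.0))
print("level 2 diff is zero:", np.allclose(diffs_bl[1], 0.0))
```

The second check gives some intuition for why high-resolution Laplacian residuals can be omitted on low-resolution inputs: under perfect smoothing, the difference signals at those levels carry no information. With approximate smoothing kernels the difference signals are only approximately zero, which is the kind of error Laplacian dropout is described as regularizing against.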