Efficient Spatially-Variant Convolution via Differentiable Sparse Kernel Complex

Zhizhen Wu, Zhe Cao, Yuchi Huo

TL;DR

This work tackles the high computational cost of applying large, complex convolution kernels on resource-limited devices by introducing a differentiable kernel decomposition that represents a dense kernel as a sequence of optimized sparse layers. The approach enables end-to-end gradient-based learning of sparse kernel samples and introduces a filter-space interpolation scheme to decouple kernel synthesis from image resolution for spatially varying filtering. It achieves higher fidelity than heuristic methods and significantly lower runtime compared with low-rank decompositions, enabling real-time mobile imaging and integration into learning pipelines. By combining robust initialization, differentiable optimization, and a compact basis for per-pixel filters, the method provides a practical, scalable solution for advanced image filtering in graphics and vision tasks.

Abstract

Image convolution with complex kernels is a fundamental operation in photography, scientific imaging, and animation effects, yet direct dense convolution is computationally prohibitive on resource-limited devices. Existing approximations, such as simulated annealing or low-rank decompositions, either lack efficiency or fail to capture non-convex kernels. We introduce a differentiable kernel decomposition framework that represents a target spatially-variant, dense, complex kernel using a set of sparse kernel samples. Our approach features (i) a decomposition that enables differentiable optimization of sparse kernels, (ii) a dedicated initialization strategy for non-convex shapes to avoid poor local minima, and (iii) a kernel-space interpolation scheme that extends single-kernel filtering to spatially varying filtering without retraining or additional runtime overhead. Experiments on Gaussian and non-convex kernels show that our method achieves higher fidelity than simulated annealing and significantly lower cost than low-rank decompositions. Our approach provides a practical solution for mobile imaging and real-time rendering, while remaining fully differentiable for integration into broader learning pipelines.
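To make the idea of synthesizing a dense kernel from a sequence of sparse layers concrete, the following is a minimal forward-pass sketch, not the authors' implementation. It assumes each layer is a list of hypothetical `(dy, dx, weight)` samples, pushes an impulse through the layers to obtain a synthesized kernel, and scores it against a target with a mean squared error. The function names, the hand-picked offsets, and the MSE loss are all assumptions made for illustration; in the paper the sample parameters are learned by gradient-based optimization, which this sketch does not perform.

```python
def apply_sparse_layer(image, layer):
    """Scatter every non-zero pixel to the layer's offsets, scaled by weight.

    `layer` is a list of (dy, dx, weight) samples (an assumed representation).
    Contributions that fall outside the grid are dropped.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            v = image[y][x]
            if v == 0.0:
                continue
            for dy, dx, weight in layer:
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w:
                    out[yy][xx] += weight * v
    return out


def synthesize_kernel(layers, size):
    """Apply the layer sequence to an impulse at (0, 0) on a size x size grid."""
    kernel = [[0.0] * size for _ in range(size)]
    kernel[0][0] = 1.0
    for layer in layers:
        kernel = apply_sparse_layer(kernel, layer)
    return kernel


def mse(a, b):
    """Mean squared error between two equally sized 2D grids."""
    n = len(a) * len(a[0])
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n


# Hand-picked (not learned) parameters: 4 layers x 2 samples each.
layers = [
    [(0, 0, 0.5), (0, 1, 0.5)],  # horizontal, step 1
    [(0, 0, 0.5), (0, 2, 0.5)],  # horizontal, step 2
    [(0, 0, 0.5), (1, 0, 0.5)],  # vertical, step 1
    [(0, 0, 0.5), (2, 0, 0.5)],  # vertical, step 2
]

k_syn = synthesize_kernel(layers, size=4)
k_tgt = [[1.0 / 16.0] * 4 for _ in range(4)]  # dense 4x4 box filter target
loss = mse(k_syn, k_tgt)

total_samples = sum(len(layer) for layer in layers)
print(f"{total_samples} sparse samples vs 16 dense taps, MSE = {loss}")
```

In this toy case 8 sparse samples reproduce a 16-tap box filter exactly, which illustrates why stacking sparse layers can be cheaper than a dense convolution. A box filter is separable and easy; the non-convex targets discussed in the abstract would require optimizing the offsets and weights rather than choosing them by hand.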

Paper Structure

This paper contains 27 sections, 11 equations, 7 figures.

Figures (7)

  • Figure 1: An overview of our method. We represent a dense filter as a Sparse Kernel Complex, a sequence of sparse layers whose parameters $\Theta$ are learned via Differentiable Optimization. We apply our filter $F_{\Theta}$ to an impulse $\delta$ to yield a synthesized kernel $K_{syn}$, and minimize a loss $\mathcal{L}$ against the target $K_{tgt}$ to learn arbitrary shapes. These optimized kernels serve as a basis for high-performance Spatially Varying Filtering, achieving near-ground-truth quality at up to a 20$\times$ speedup.
  • Figure 2: Comparison of Gaussian kernel approximation with varying $\sigma$. We compare our method against PST using two sparse configurations (8 layers × 6 samples and 12 layers × 4 samples). LPIPS scores appear in the top-right corner (lower is better).
  • Figure 3: Speed, accuracy, and samples comparison. The figure plots quality against latency (lower is better for both). The size of each bubble represents the total sample count.
  • Figure 4: Comparison of single-kernel approximation. Compared to the baselines, SVD-based decomposition (LowR.) and Parallel Simulated Tempering (PST), our approach (blue) better preserves sharp features on non-convex targets, resulting in lower LPIPS scores (lower is better).
  • Figure 5: Visual comparison of diverse spatially varying (SV) effects. We evaluate three SV configurations: 1D tilt-shift blur (top), 2D rotational blur (middle), and 2D radial motion blur (bottom). We compare our method against Parallel Simulated Tempering (PST) and Low-Rank Decomposition (LowRank).
  • ...and 2 more figures