CoordGate: Efficiently Computing Spatially-Varying Convolutions in Convolutional Neural Networks
Sunny Howard, Peter Norreys, Andreas Döpp
TL;DR
CoordGate tackles the inefficiency of learning spatially varying convolutions in CNNs by gating a standard CNN feature map with a coordinate-encoding network to produce per-pixel filter amplitudes. The gating uses a Hadamard product between the CNN output and a coordinate-derived gating map, enabling spatially varying filtering with minimal parameter overhead. It is validated on a 1D synthetic spatially varying convolution task and a 2D image deblurring task with static PSFs applied to microscopy data, outperforming CoordConv-UNet and MultiWienerNet while using far fewer parameters than deep baselines. The approach promises improved efficiency and accuracy for spatially-aware vision tasks in optical imaging and related domains.
Abstract
Optical imaging systems are inherently limited in resolution by the point spread function (PSF), which applies a static, yet spatially-varying, convolution to the image. This degradation can be addressed via Convolutional Neural Networks (CNNs), particularly through deblurring techniques. However, current solutions struggle to compute spatially-varying convolutions efficiently. In this paper, we propose CoordGate, a novel lightweight module that uses a multiplicative gate and a coordinate encoding network to enable efficient computation of spatially-varying convolutions in CNNs. CoordGate allows for selective amplification or attenuation of filters based on their spatial position, effectively acting like a locally connected neural network. The effectiveness of CoordGate is demonstrated within the context of U-Nets and applied to the challenging problem of image deblurring. The experimental results show that CoordGate outperforms conventional approaches, offering a more robust and spatially aware solution for CNNs in various computer vision applications.
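The gating idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`coordinate_grid`, `encode_coordinates`, `coord_gate`), the two-layer encoder, the softplus output activation, and the channels-last layout are all assumptions chosen for illustration. The only element taken from the text is the element-wise (Hadamard) product between a CNN feature map and a gating map derived from pixel coordinates.

```python
import numpy as np


def coordinate_grid(height, width):
    """Return an (H, W, 2) array of (y, x) pixel coordinates scaled to [-1, 1]."""
    ys = np.linspace(-1.0, 1.0, height)
    xs = np.linspace(-1.0, 1.0, width)
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([yy, xx], axis=-1)


def encode_coordinates(coords, w1, b1, w2, b2):
    """Hypothetical coordinate encoder: a tiny MLP applied to every pixel.

    Maps each (y, x) pair to one positive gate value per feature channel.
    The softplus output is an assumption; it lets a gate either attenuate
    (values below 1) or amplify (values above 1) a filter response.
    """
    hidden = np.maximum(coords @ w1 + b1, 0.0)   # ReLU hidden layer
    return np.logaddexp(0.0, hidden @ w2 + b2)   # softplus, shape (H, W, C)


def coord_gate(features, w1, b1, w2, b2):
    """Gate an (H, W, C) feature map with a coordinate-derived map.

    `features` stands in for the output of an ordinary convolution layer.
    """
    height, width, _ = features.shape
    gate = encode_coordinates(coordinate_grid(height, width), w1, b1, w2, b2)
    return features * gate                       # Hadamard product


# Example with random stand-in features and randomly initialised encoder weights.
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 8, 4))            # pretend conv output, 4 channels
w1, b1 = rng.normal(size=(2, 16)), np.zeros(16)
w2, b2 = rng.normal(size=(16, 4)), np.zeros(4)
gated = coord_gate(features, w1, b1, w2, b2)
print(gated.shape)  # (8, 8, 4)
```

Because the gate depends only on position, the same convolution filters are reused everywhere while their per-pixel amplitudes vary. In this sketch the extra parameters are just the small encoder weights, which is in line with the lightweight character claimed for the module.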
