Perspective-Equivariance for Unsupervised Imaging with Camera Geometry

Andrew Wang, Mike Davies

TL;DR

We show that a much richer non-linear class of group transforms, derived from camera geometry, generalises previous EI work and is an excellent prior for satellite and urban image data.

Abstract

Ill-posed image reconstruction problems appear in many scenarios such as remote sensing, where obtaining high-quality images is crucial for environmental monitoring, disaster management and urban planning. Deep learning has seen great success in overcoming the limitations of traditional methods. However, these inverse problems rarely come with ground truth data, highlighting the importance of unsupervised learning from partial and noisy measurements alone. We propose perspective-equivariant imaging (EI), a framework that leverages classical projective camera geometry in optical imaging systems, such as satellites or handheld cameras, to recover information lost in ill-posed camera imaging problems. We show that our much richer non-linear class of group transforms, derived from camera geometry, generalises previous EI work and is an excellent prior for satellite and urban image data. Perspective-EI achieves state-of-the-art results in multispectral pansharpening, outperforming other unsupervised methods in the literature. Code at https://github.com/Andrewwango/perspective-equivariant-imaging.

Paper Structure (28 sections, 4 theorems, 11 equations, 10 figures, 4 tables)

Key Result

Proposition 1

Following hartley_multiple_2004. Two 2D images $\mathbf{x},\mathbf{x}^\prime$ taken of a 3D world from two cameras at arbitrary 3D orientations but the same coincident camera centre can be related via a projective transformation, written as a linear transformation in homogeneous coordinates. This ca
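
The relation in this proposition can be illustrated numerically. The sketch below assumes the standard textbook form of the result for a camera rotating about a fixed centre, $\mathbf{H}=\mathbf{K}\mathbf{R}\mathbf{K}^{-1}$, where $\mathbf{K}$ is the calibration matrix built from the intrinsics and $\mathbf{R}$ is the relative rotation. The function names, the choice of focal lengths expressed directly in pixels, and the Euler-angle convention are illustrative assumptions, not taken from the paper or its code.

```python
import numpy as np

def intrinsics(fx, fy, u0, v0, s=0.0):
    """Calibration matrix K (assumption: fx, fy are focal lengths in pixels)."""
    return np.array([[fx, s, u0],
                     [0.0, fy, v0],
                     [0.0, 0.0, 1.0]])

def rotation_xyz(theta_x, theta_y, theta_z):
    """3D rotation R = Rz @ Ry @ Rx from angles in radians (illustrative convention)."""
    cx, sx = np.cos(theta_x), np.sin(theta_x)
    cy, sy = np.cos(theta_y), np.sin(theta_y)
    cz, sz = np.cos(theta_z), np.sin(theta_z)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def rotation_homography(K, R):
    """Projective transformation relating two views that share a camera centre."""
    return K @ R @ np.linalg.inv(K)

def apply_homography(H, uv):
    """Map pixel coordinates (N, 2) through H using homogeneous coordinates."""
    uv = np.asarray(uv, dtype=float)
    homog = np.hstack([uv, np.ones((len(uv), 1))])  # lift to (u, v, 1)
    mapped = homog @ H.T                            # linear in homogeneous coordinates
    return mapped[:, :2] / mapped[:, 2:3]           # divide by the last coordinate

# Usage: tilt a hypothetical 640x480 camera slightly and map the image corners.
K = intrinsics(fx=500.0, fy=500.0, u0=320.0, v0=240.0)
H = rotation_homography(K, rotation_xyz(0.05, -0.1, 0.2))
corners = [[0, 0], [639, 0], [639, 479], [0, 479]]
print(apply_homography(H, corners))
```

The map is linear only in homogeneous coordinates; the final division by the third coordinate is what makes it non-linear in pixel coordinates.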

Figures (10)

  • Figure 1: Unsupervised pansharpening with perspective-equivariant imaging. Panchromatic (PAN) and low-res multispectral (MS) images are inputs. Reconstructions and average no-reference QNR (higher is better) alparone_multispectral_2008 results are shown. Upper 2 rows: commercially provided images from the bespoke classical Hyperspherical Color Space (HCS) padwick_worldview-2_2010 method designed for WorldView-2 maxar_worldview-2_nodate, which we treat as both a baseline and as oracle; linear upsampling baseline; oracle supervised training; self-supervised learning using Wald's protocol wald_fusion_1997. Lower 2 rows: unsupervised loss functions from competitor methods uezato_guided_2020, ciotola_pansharpening_2022, luo_pansharpening_2020, ma_pan-gan_2020; shift-EI chen_equivariant_2021; and our method using our proposed unsupervised loss. All deep learning methods use the same neural network backbone for fair and balanced comparison of the loss functions; we are not comparing the different NN architectures from the literature. First 3 channels (RGB) are shown for visualisation. See \ref{tab:result_pan_noiseless} for more results.
  • Figure 2: The equivariant imaging framework chen_equivariant_2021, for which we propose a new, richer $T_g$ and which we apply to a new inverse problem $A(\cdot)$. $\mathcal{L}_\text{MC}$ is any measurement consistency loss, e.g. MSE or, for noisy measurements, SURE (\ref{eq:sure_loss}), and $\mathcal{L}_\text{EI}$ is an MSE that enforces equivariance of the system. Bottom left: example of $\mathbf{\hat{x}}^\prime$ for $\mathbf{x}$ in \ref{fig:spacenet_pansharpen}.
  • Figure 3: Examples of datasets and associated inverse problems.
  • Figure 4: Pinhole camera model. The world frame $(X,Y,Z)$ is arbitrary since we are unaware of absolute locations in the real world. $(x,y,z)$ is the camera frame on which the image coordinates $(u,v)$ are defined. The camera is defined by its extrinsic parameters (position and orientation of $(x,y,z)$ with respect to $(X,Y,Z)$) and its intrinsic parameters (focal length $f$, principal point $(u_0,v_0)$, pixel lengths $m_x,m_y$ and pixel skew $s$).
  • Figure 5: Examples of transformations $\mathbf{\bar{x}}^\prime=\mathbf{T}_g\mathbf{\bar{x}}$ applied to an image from CelebA liu_deep_2015.
  • ...and 5 more figures
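
The two losses named in the Figure 2 caption can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the forward operator `A` is average-pool downsampling, the reconstruction `f` is nearest-neighbour upsampling standing in for a neural network, and `T_g` is a circular shift standing in for the perspective transforms proposed in the paper. All function names are hypothetical.

```python
import numpy as np

def A(x, factor=2):
    """Toy forward operator (assumption): average-pool downsampling."""
    h, w = x.shape
    return x.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def f(y, factor=2):
    """Toy reconstruction (stand-in for a neural network): nearest-neighbour upsampling."""
    return np.kron(y, np.ones((factor, factor)))

def T_g(x, shift):
    """Toy group action (assumption): circular shift by (rows, cols)."""
    return np.roll(x, shift, axis=(0, 1))

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def ei_losses(y, shift):
    """Measurement consistency and equivariance losses for one measurement y."""
    x_hat = f(y)                      # reconstruct from the measurement
    loss_mc = mse(A(x_hat), y)        # L_MC: re-measured estimate should match y
    x_t = T_g(x_hat, shift)           # transform the estimate to get a virtual image
    x_t_hat = f(A(x_t))               # measure and reconstruct the virtual image
    loss_ei = mse(x_t_hat, x_t)       # L_EI: reconstruction should recover it
    return loss_mc, loss_ei

# Usage: a random low-resolution measurement and a one-pixel shift.
rng = np.random.default_rng(0)
y = rng.random((4, 4))
print(ei_losses(y, shift=(1, 0)))
```

In this toy setting the measurement consistency loss is essentially zero for any `y`, so it cannot distinguish between reconstructions; the equivariance loss is non-zero for the one-pixel shift, which is the extra training signal the framework relies on.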

Theorems & Definitions (5)

  • Proposition 1
  • Corollary 1
  • Theorem 1
  • Theorem 2
  • Definition 1