Deep Blind Super-Resolution for Satellite Video

Yi Xiao, Qiangqiang Yuan, Qiang Zhang, Liangpei Zhang

TL;DR

This article proposes a practical blind SVSR algorithm (BSVSR) that explores sharper cues by considering pixel-wise blur levels in a coarse-to-fine manner, and devises a pyramid spatial transformation module to adjust the solution space of the sharp mid-feature, resulting in flexible feature adaptation in multi-level domains.

Abstract

Recent efforts have witnessed remarkable progress in Satellite Video Super-Resolution (SVSR). However, most SVSR methods assume the degradation is fixed and known, e.g., bicubic downsampling, which makes them vulnerable in real-world scenes with multiple and unknown degradations. To alleviate this issue, blind SR has become a research hotspot. Nevertheless, existing approaches mainly focus on blur kernel estimation while losing sight of another critical aspect of VSR tasks: temporal compensation, especially compensating for blurry and smooth pixels with vital sharpness from severely degraded satellite videos. Therefore, this paper proposes a practical Blind SVSR algorithm (BSVSR) that explores sharper cues by considering pixel-wise blur levels in a coarse-to-fine manner. Specifically, we employ multi-scale deformable convolution to coarsely aggregate the temporal redundancy into adjacent frames by sliding-window progressive fusion. The adjacent features are then finely merged into the mid-feature using deformable attention, which measures the blur levels of pixels and assigns more weight to the informative pixels, thus encouraging the representation of sharpness. Moreover, we devise a pyramid spatial transformation module to adjust the solution space of the sharp mid-feature, resulting in flexible feature adaptation in multi-level domains. Quantitative and qualitative evaluations on both simulated and real-world satellite videos demonstrate that our BSVSR performs favorably against state-of-the-art non-blind and blind SR models. Code will be available at https://github.com/XY-boy/Blind-Satellite-VSR
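
The fine-fusion idea in the abstract, giving informative pixels more weight when adjacent features are merged into the mid-feature, can be illustrated with a small sketch. This is not the paper's deformable attention, which learns both its weights and its sampling offsets. It is a simplified stand-in that assumes the frames are already aligned and uses a hand-crafted Laplacian response as a hypothetical sharpness score. The names `sharpness`, `fuse_frames`, and `temperature` are invented for illustration.

```python
import math

def sharpness(frame, y, x):
    """Hypothetical sharpness proxy: absolute 4-neighbour Laplacian at (y, x)."""
    h, w = len(frame), len(frame[0])
    center = frame[y][x]
    total = 0.0
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ny = min(max(y + dy, 0), h - 1)  # replicate border pixels
        nx = min(max(x + dx, 0), w - 1)
        total += frame[ny][nx] - center
    return abs(total)

def fuse_frames(frames, temperature=1.0):
    """Fuse aligned frames pixel by pixel with softmax weights over the frame axis."""
    h, w = len(frames[0]), len(frames[0][0])
    fused = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            scores = [sharpness(f, y, x) / temperature for f in frames]
            top = max(scores)  # subtract the max for numerical stability
            exps = [math.exp(s - top) for s in scores]
            norm = sum(exps)
            fused[y][x] = sum(e / norm * f[y][x] for e, f in zip(exps, frames))
    return fused

# Usage: a frame with a sharp point dominates a flat (blurry) frame at that point.
sharp = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
flat = [[0.2] * 3 for _ in range(3)]
fused = fuse_frames([sharp, flat])
```

The softmax over the frame axis means that each output pixel is a convex combination of the input pixels at that location, so a frame contributes more only where its local score is higher than the others.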

Paper Structure (28 sections, 20 equations, 11 figures, 9 tables)

Figures (11)

  • Figure 1: The overview of our BSVSR, which takes $2N+1=5$ consecutive blurry low-resolution frames as input and predicts the sharp high-resolution mid-frame $I^{SR}_t$.
  • Figure 2: The flowchart of (a) the optimization of the blur kernel estimation network; (b) the Multi-Scale Deformable (MSD) convolution alignment used for coarse compensation; (c) the n-th blur-aware transformation network, which receives the blur-aware feature $\mathcal{F}_t^s$ and the result of the previous transformation $\mathcal{H}_{n-1}$ and outputs $\mathcal{H}_{n}$; (d) the Deformable Attention (DA) used for fine compensation.
  • Figure 3: The illustration of the proposed progressive temporal compensation strategy, which uses Multi-Scale Deformable convolution (MSD) compensation and Deformable Attention (DA) fusion to explore cleaner and sharper cues in a coarse-to-fine manner. The color of the sampling points represents the attention weights used for aggregation. By assigning higher attention weights to clean and sharp points, we can encourage the representation of vital sharpness and eliminate blurry pixels.
  • Figure 4: Qualitative results on Scene-1, Scene-2 and Scene-4 from the Jilin-1 test set with various blur kernels. The size of the Region Of Interest (ROI) is $70 \times 70$. Our method recovers sharper and cleaner details than state-of-the-art non-blind and blind SR methods.
  • Figure 5: Qualitative results on UrtheCast test set with blur kernel width $\sigma=1.6$. The size of the Region Of Interest (ROI) is $90 \times 90$. Our method has fewer distortions and restores more textures than other state-of-the-art non-blind and blind SR methods.
  • ...and 6 more figures
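
The captions above refer to test data degraded with blur kernels of a given width, e.g. $\sigma=1.6$. A minimal sketch of how such a degradation can be simulated follows. It assumes an isotropic Gaussian kernel and direct subsampling after blurring, a common setup in blind SR that is not confirmed here as the paper's exact pipeline. The kernel size of 13, the scale factor of 4, and the function names are illustrative assumptions.

```python
import math

def gaussian_kernel(size, sigma):
    """Return a normalized size x size isotropic Gaussian kernel (size should be odd)."""
    c = (size - 1) / 2.0
    k = [[math.exp(-((y - c) ** 2 + (x - c) ** 2) / (2.0 * sigma ** 2))
          for x in range(size)] for y in range(size)]
    total = sum(map(sum, k))
    return [[v / total for v in row] for row in k]

def degrade(image, kernel, scale):
    """Blur with the kernel (replicate padding), then keep every scale-th pixel."""
    h, w = len(image), len(image[0])
    r = len(kernel) // 2
    blurred = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky, row in enumerate(kernel):
                for kx, kv in enumerate(row):
                    yy = min(max(y + ky - r, 0), h - 1)  # clamp to the image border
                    xx = min(max(x + kx - r, 0), w - 1)
                    acc += kv * image[yy][xx]
            blurred[y][x] = acc
    return [row[::scale] for row in blurred[::scale]]

# Usage: degrade a toy 8 x 8 "high-resolution" image with sigma = 1.6.
hr = [[float((x + y) % 2) for x in range(8)] for y in range(8)]
lr = degrade(hr, gaussian_kernel(13, 1.6), scale=4)
```

A blind method does not know `sigma` at test time, which is why the kernel must be estimated or otherwise accounted for rather than assumed fixed.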