
Sewer Image Super-Resolution with Depth Priors and Its Lightweight Network

Gang Pan, Chen Wang, Zhijie Sui, Shuai Guo, Yaozhi Lv, Honglie Li, Di Sun, Zixia Xia

TL;DR

This work introduces DSRNet, a depth-guided, reference-based super-resolution framework for sewer imagery that leverages depth priors ($D_{LR}$, $D_{Ref\downarrow}$) and adjacent frames as references to recover high-frequency texture on low-resolution Quick-view images. It consists of a depth extraction network, an encoder, a depth matching module (DMM), a decoder, and a discriminator, with a DRIMM-based mechanism aligning LR and reference features guided by depth information. A knowledge-distillation pathway with an Attention-based Distillation Module (ADM) compresses the model to a lighter student while preserving accuracy, yielding DSRNet-S and distillation variants like DSRNet-D that balance speed and performance. Experiments on the Deep Sewer SR Dataset and downstream tasks on Pipe and Sewer-ML datasets show significant PSNR/SSIM gains over baselines, improved sewer defect segmentation, localization, and classification with SR, and a favorable speed-accuracy trade-off through distillation. The approach demonstrates practical impact for embedded QV devices by improving visual quality and enabling more reliable downstream defect analysis in real-world sewer networks.

Abstract

The Quick-view (QV) technique serves as a primary method for detecting defects within sewerage systems. However, the effectiveness of QV is impeded by the limited visual range of its hardware, resulting in suboptimal image quality for distant portions of the sewer network. Image super-resolution is an effective way to improve image quality and has been applied in a variety of scenes. However, super-resolution for sewer images remains largely unexplored. In response, this study leverages the inherent depth relationships present within QV images and introduces a novel Depth-guided, Reference-based Super-Resolution framework denoted as DSRNet. It comprises two core components: a depth extraction module and a depth information matching module (DMM). DSRNet uses frames adjacent to the low-resolution image as reference images and recovers texture information from them based on their correlation with the low-resolution image. Combining these modules integrates depth priors, which significantly improves both visual quality and benchmark performance. In addition, for computational efficiency and compactness, a super-resolution knowledge distillation model based on an attention mechanism is introduced. This mechanism learns the feature similarity between a more complex teacher model and a streamlined student model, with the latter being a lightweight version of DSRNet. Experimental results demonstrate that DSRNet significantly improves PSNR and SSIM compared with other methods. This study also conducts experiments on sewer defect semantic segmentation, object detection, and classification on the Pipe dataset and Sewer-ML dataset. Experiments show that the method improves performance on these tasks for low-resolution sewer images.
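
The abstract reports gains in PSNR, one of the two image-quality metrics it names. As a reminder of what that number measures, here is a minimal sketch of the standard PSNR definition, $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$. The function name `psnr` and the 8-bit default `max_value=255.0` are illustrative assumptions; the summary does not say which color space, crop, or implementation the paper's evaluation uses.

```python
import numpy as np


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    `max_value` is the largest possible pixel value (assumed 255 for 8-bit
    images here; this is an illustrative default, not taken from the paper).
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    if reference.shape != estimate.shape:
        raise ValueError("images must have the same shape")

    # Mean squared error over every pixel and channel.
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    hr = np.full((4, 4), 100.0)   # stand-in for a high-resolution ground truth
    sr = hr + 1.0                 # stand-in for a super-resolved output
    print(f"PSNR: {psnr(hr, sr):.2f} dB")
```

A higher PSNR means the super-resolved image is closer to the high-resolution ground truth in the mean-squared-error sense; it does not by itself capture the structural similarity that SSIM measures.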

Paper Structure (26 sections, 26 equations, 11 figures, 11 tables)

This paper contains 26 sections, 26 equations, 11 figures, 11 tables.

Figures (11)

  • Figure 1: Images captured in sewer at different viewing distances. (a): Images at ideal viewing distances. (b): Images at far viewing distances. Image resolution is lower at far viewing distances
  • Figure 2: The overall architecture of the proposed DSRNet contains five modules: depth extraction network, encoder, depth matching module (DMM), decoder, and discriminator. LR, Ref$\downarrow$, Ref, HR, and SR represent low-resolution images, downsampled reference images, reference images, high-resolution images and super-resolution images
  • Figure 3: The architecture of DMM which contains three main modules: depth encoder, depth information fusion module and depth-based reference image matching module
  • Figure 4: The overall architecture of knowledge distillation based on attention feature matching. The model uses DSRNet, called DSRNet-T, as the teacher model, and a lightweight DSRNet, called DSRNet-S, as the student model. The Attention-based Distillation Module (ADM) can effectively select, match, and distill the essential features from the teacher model and the student model
  • Figure 5: Attention-based distillation module (ADM)
  • ...and 6 more figures
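
The captions for Figures 4 and 5 describe ADM as selecting, matching, and distilling features between the teacher and the student, but this summary gives no equations for it. The sketch below is therefore only a generic illustration of attention-weighted feature matching, not the paper's ADM: each student feature attends over the teacher features with scaled dot-product attention, and the loss is the mean squared distance to the attention-weighted teacher target. The names `attention_distillation_loss` and `softmax`, and the assumption that student and teacher features share the same channel width, are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def attention_distillation_loss(student_feats, teacher_feats):
    """Generic attention-based feature-matching loss (illustrative only).

    student_feats: array of shape (N, C), N student feature vectors.
    teacher_feats: array of shape (M, C), M teacher feature vectors.
    Returns (loss, weights), where weights has shape (N, M).
    """
    s = np.asarray(student_feats, dtype=np.float64)
    t = np.asarray(teacher_feats, dtype=np.float64)
    if s.ndim != 2 or t.ndim != 2 or s.shape[1] != t.shape[1]:
        raise ValueError("expected 2-D inputs with the same channel width")

    # Similarity of every student feature to every teacher feature.
    scores = s @ t.T / np.sqrt(s.shape[1])
    # Each row is a distribution over teacher features ("select" and "match").
    weights = softmax(scores, axis=1)
    # Attention-weighted teacher target for each student feature.
    target = weights @ t
    # Mean squared distance pulls the student toward its target ("distill").
    loss = float(np.mean((s - target) ** 2))
    return loss, weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    student = rng.normal(size=(6, 8))
    teacher = rng.normal(size=(10, 8))
    loss, weights = attention_distillation_loss(student, teacher)
    print(loss, weights.shape)
```

In a real training setup the teacher features would be held fixed and only the student would receive gradients; that detail, like the rest of this sketch, is an assumption rather than something stated in the source.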