Transmission Line Defect Detection Based on UAV Patrol Images and Vision-language Pretraining

Ke Zhang, Zhaoye Zheng, Yurong Guo, Jiacun Wang, Jiyuan Yang, Yangjie Xiao

TL;DR

This paper tackles the challenge of detecting transmission-line defects in UAV patrol images, where defect cues are often weak because of imaging distance and shooting angle. It introduces VLP-TL, a domain-specific vision-language pretraining method with two pretraining tasks, SRJ and DNC, that pretrains an image encoder on transmission-line image-text pairs, and a Progressive Transfer Strategy (PTS) that gradually adapts the pretrained backbone to defect detection. The approach yields a ViTDet-like detector that, when trained on the TLDD dataset, significantly improves defect detection accuracy by leveraging multimodal information. The results indicate practical gains for UAV-based transmission-line inspection, enabling more reliable detection of small or occluded defects without increasing inference cost. The work also provides a framework for extending domain-specific VLP and transfer strategies to other power-system inspection tasks.

Abstract

Unmanned aerial vehicle (UAV) patrol inspection has emerged as a predominant approach in transmission line monitoring owing to its cost-effectiveness. Detecting defects in transmission lines is a critical task during UAV patrol inspection. However, due to imaging distance and shooting angles, UAV patrol images often suffer from insufficient defect-related visual information, which has an adverse effect on detection accuracy. In this article, we propose a novel method for detecting defects in UAV patrol images, which is based on vision-language pretraining for transmission lines (VLP-TL) and a progressive transfer strategy (PTS). Specifically, VLP-TL contains two novel pretraining tasks tailored for the transmission line scenario, aiming at pretraining an image encoder with abundant knowledge acquired from both visual and linguistic information. Transferring the pretrained image encoder to the defect detector as its backbone can effectively alleviate the insufficient visual information problem. In addition, the PTS further improves transfer performance by progressively bridging the gap between pretraining and downstream defect detection. Experimental results demonstrate that the proposed method significantly improves defect detection accuracy by jointly utilizing multimodal information, overcoming the limitations of insufficient defect-related visual information provided by UAV patrol images.

Paper Structure

This paper contains 15 sections, 16 equations, 8 figures, 6 tables.

Figures (8)

  • Figure 1: An overview of the proposed method.
  • Figure 2: Relations between different transmission line samples. STSS, STDS, and DT are abbreviations for same type and same status, same type but different status, and different type, respectively.
  • Figure 3: The implementation of SRJ.
  • Figure 4: The implementation of DNC, taking the case of grading rings as an example.
  • Figure 5: The method to obtain instance-level images with context.
  • ...and 3 more figures