Method of UAV Inspection of Photovoltaic Modules Using Thermal and RGB Data Fusion

Andrii Lysyi, Anatoliy Sachenko, Pavlo Radiuk, Mykola Lysyi, Oleksandr Melnychenko, Diana Zahorodnia

TL;DR

This paper tackles scalable UAV-based PV inspection by addressing palette bias, multi-modal fusion, geospatial de-duplication, and bandwidth constraints. It introduces a palette-invariant thermal embedding fused with RGB via a gated mechanism, plus Rodrigues-based adaptive re-acquisition and Haversine–DBSCAN geo-clustering, all operating with relevance-only onboard telemetry. The approach reaches an mAP@0.5 of 0.903 on PVF-10, a 12–15% improvement over single-modality baselines, and achieves 96% recall in field validation. It also reduces duplicate-induced false positives (Dup-FP) by 15–20% and airborne data transmission by 60–70%. The work offers a practical, edge-deployable solution for automated, geo-referenced PV maintenance with potential for predictive analytics and deeper plant-wide integration.

Abstract

The subject of this research is the development of an intelligent, integrated framework for the automated inspection of photovoltaic (PV) infrastructure that addresses the critical shortcomings of conventional methods, including thermal palette bias, data redundancy, and high communication bandwidth requirements. The goal of this study is to design, develop, and validate a comprehensive, multi-modal system that fully automates the monitoring workflow, from data acquisition to the generation of actionable, geo-located maintenance alerts, thereby enhancing plant safety and operational efficiency. The methods employed involve a synergistic architecture that begins with a palette-invariant thermal embedding, learned by enforcing representational consistency, which is fused with a contrast-normalized RGB stream via a gated mechanism. This is supplemented by a closed-loop, adaptive re-acquisition controller that uses Rodrigues-based updates for targeted confirmation of ambiguous anomalies and a geospatial deduplication module that clusters redundant alerts using DBSCAN over the haversine distance. In conclusion, this study establishes a powerful new paradigm for proactive PV inspection, with the proposed system achieving a mean Average Precision (mAP@0.5) of 0.903 on the public PVF-10 benchmark, a significant 12-15% improvement over single-modality baselines. Field validation confirmed the system's readiness, achieving 96% recall, while the de-duplication process reduced duplicate-induced false positives by 15-20%, and relevance-only telemetry cut airborne data transmission by 60-70%.
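The abstract does not spell out how the Rodrigues-based updates of the re-acquisition controller are formed. As a minimal sketch of the underlying operation only, the following applies Rodrigues' rotation formula to a 3-vector. The function name `rodrigues_rotate` and the idea of using it to re-aim a camera viewing direction are assumptions made for illustration, not the authors' controller.

```python
import math


def rodrigues_rotate(v, axis, angle_rad):
    """Rotate 3-vector v about `axis` by `angle_rad` (right-hand rule).

    Rodrigues' rotation formula, with k the unit axis and t the angle:
        v' = v*cos(t) + (k x v)*sin(t) + k*(k . v)*(1 - cos(t))
    """
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0.0:
        raise ValueError("rotation axis must be non-zero")
    k = [a / norm for a in axis]  # normalise the axis
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    # Cross product k x v
    kxv = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    # Dot product k . v
    kdv = sum(ki * vi for ki, vi in zip(k, v))
    return [v[i] * c + kxv[i] * s + k[i] * kdv * (1.0 - c) for i in range(3)]


# Hypothetical use: turn a viewing direction by 90 degrees about the vertical axis.
view = [1.0, 0.0, 0.0]
new_view = rodrigues_rotate(view, axis=[0.0, 0.0, 1.0], angle_rad=math.pi / 2)
```

In this example the x-axis direction is rotated onto the y-axis (up to floating-point error). A real controller would derive the axis and angle from the pose correction needed to revisit an ambiguous detection; that step is not described in the text above.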

Paper Structure

This paper contains 24 sections, 5 equations, 10 figures, 5 tables, 1 algorithm.

Figures (10)

  • Figure 1: Per-class Average Precision (AP@0.5) on the PVF-10 dataset. Our proposed Thermal+RGB fusion model consistently outperforms both the Thermal-only and RGB-only baselines across all 10 defect classes, demonstrating the robust and generalized benefit of the multi-modal approach.
  • Figure 2: Precision-recall curves on the (a) PVF-10 and (b) STHS-277 datasets. The proposed palette-aware Thermal+RGB fusion model (blue) demonstrates superior performance by maintaining higher precision across all recall levels compared to the thermal-only (orange) and RGB-only (green) baselines.
  • Figure 3: Component-wise contribution to performance on the PVF-10 dataset. The bar chart visually represents the data from the ablation study table, showing the progressive increase in mAP@0.5 (blue) and small-target recall (orange) as each key component of our system is added. This demonstrates their powerful synergistic effect.
  • Figure 4: The impact of geo de-duplication on the rate of Dup-FP. The module significantly reduces redundant alerts on both (a) PVF-10 and (b) STHS-277 datasets.
  • Figure 5: Sensitivity analysis of the DBSCAN radius parameter ($\varepsilon$) on the final Dup-FP rate for the (a) PVF-10 and (b) STHS-277 datasets. An epsilon value of 1.0 meter provides the optimal trade-off, effectively minimizing duplicate detections without incorrectly merging distinct, nearby defects.
  • ...and 5 more figures
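
The geo de-duplication step referred to in Figures 4 and 5 (DBSCAN over the haversine distance with a radius of 1.0 m) can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the `min_samples=1` default, the brute-force neighbour search, and the sample coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def dbscan_haversine(points, eps_m=1.0, min_samples=1):
    """Brute-force O(n^2) DBSCAN; returns one cluster label per point (-1 = noise)."""
    n = len(points)
    # Neighbour lists include the point itself, as in standard DBSCAN.
    neighbours = [
        [j for j in range(n) if haversine_m(points[i], points[j]) <= eps_m]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # j is a core point: keep expanding
    return labels


# Hypothetical alerts: the first two lie about 0.44 m apart, the third about 11 m away.
alerts = [(50.000000, 30.000000), (50.000004, 30.000000), (50.000100, 30.000000)]
labels = dbscan_haversine(alerts, eps_m=1.0)
# Keep one representative alert per cluster.
unique_alerts = {label: alert for label, alert in zip(labels, alerts)}
```

With `min_samples=1` every alert is a core point, so clustering reduces to merging alerts that are chained together by gaps of at most `eps_m`. A larger radius would start merging distinct neighbouring defects, which is the trade-off Figure 5 examines.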