LLM-YOLOMS: Large Language Model-based Semantic Interpretation and Fault Diagnosis for Wind Turbine Components

Yaru Li; Yanxue Wang; Meng Li; Xinming Li; Jianbo Feng

TL;DR

The paper presents LLM-YOLOMS, a hybrid framework that couples a high-precision YOLOMS detector with multimodal and domain-adapted LLMs to deliver semantic fault interpretation and maintenance guidance for wind turbine components. A lightweight key-value (KV) mapping converts visual outputs into structured textual inputs while preserving rich fault attributes, and Qwen2-VL together with a domain-tuned Llama model handles reasoning, fault analysis, and actionable maintenance recommendations. Empirical results show robust detection performance (e.g., mAP50 = 0.896; per-class APs up to 0.955) and high-quality diagnostic reports (APS ≈ 0.89; FCS ≈ 0.94), indicating better interpretability and decision support than traditional detection pipelines. The framework covers fault detection, analysis, and O&M guidance end to end, using a scalable pipeline built on UAV imagery, multi-scale data augmentation, and domain-specific fine-tuning.

Abstract

The health condition of wind turbine (WT) components is crucial for ensuring stable and reliable operation. However, existing fault detection methods are largely limited to visual recognition, producing structured outputs that lack semantic interpretability and fail to support maintenance decision-making. To address these limitations, this study proposes an integrated framework that combines YOLOMS with a large language model (LLM) for intelligent fault analysis and diagnosis. Specifically, YOLOMS employs multi-scale detection and sliding-window cropping to enhance fault feature extraction, while a lightweight key-value (KV) mapping module bridges the gap between visual outputs and textual inputs. This module converts YOLOMS detection results into structured textual representations enriched with both qualitative and quantitative attributes. A domain-tuned LLM then performs semantic reasoning to generate interpretable fault analyses and maintenance recommendations. Experiments on real-world datasets demonstrate that the proposed framework achieves a fault detection accuracy of 90.6% and generates maintenance reports with an average accuracy of 89%, thereby improving the interpretability of diagnostic results and providing practical decision support for the operation and maintenance of wind turbines.
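
To make the KV mapping idea concrete, the following is a minimal Python sketch of how one detection might be turned into key-value pairs and then into a text line for an LLM prompt. It is an illustration under stated assumptions, not the authors' implementation: the Detection class, the field names, the severity thresholds, the position rule, and the example label "crack" are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Detection:
    """One detector output (hypothetical structure, not the paper's)."""
    label: str                               # fault class name
    confidence: float                        # detector score in [0, 1]
    box: Tuple[float, float, float, float]   # (x1, y1, x2, y2) in pixels


def severity_from_area(area_ratio: float) -> str:
    """Map relative fault area to a qualitative level.

    The thresholds are illustrative assumptions only.
    """
    if area_ratio < 0.01:
        return "minor"
    if area_ratio < 0.05:
        return "moderate"
    return "severe"


def detection_to_kv(det: Detection, image_w: int, image_h: int) -> Dict[str, object]:
    """Convert one detection into quantitative and qualitative attributes."""
    x1, y1, x2, y2 = det.box
    area_ratio = ((x2 - x1) * (y2 - y1)) / (image_w * image_h)
    # Normalised horizontal centre of the box, split into thirds.
    cx = (x1 + x2) / 2 / image_w
    position = "left" if cx < 1 / 3 else "right" if cx > 2 / 3 else "center"
    return {
        "fault_type": det.label,
        "confidence": round(det.confidence, 2),
        "area_ratio": round(area_ratio, 4),
        "position": position,
        "severity": severity_from_area(area_ratio),
    }


def kv_to_text(kv: Dict[str, object]) -> str:
    """Serialise the key-value pairs as one line of structured text."""
    return "; ".join(f"{key}: {value}" for key, value in kv.items())
```

Calling kv_to_text(detection_to_kv(det, image_w, image_h)) for each detection yields one line per fault, which could then be placed in the prompt of the language model. Keeping the intermediate form as a plain dictionary means attributes can be added or renamed without touching the detector or the prompt template.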
