A Comprehensive Case Study on the Performance of Machine Learning Methods on the Classification of Solar Panel Electroluminescence Images

Xinyi Song; Kennedy Odongo; Francis G. Pascual; Yili Hong

TL;DR

This study addresses the classification of solar panel electroluminescence (EL) images into four defectiveness classes ($J=4$) for monocrystalline and polycrystalline PV modules, where the class labels are severely imbalanced. It systematically compares traditional ML models trained on features extracted by a pre-trained VGG-16 backbone against deep transfer learning models (VGG-19 and ResNet-50), assessed under multiple metrics and validated by $50$ random replications. The authors provide a comprehensive evaluation framework, including data augmentation, two-stage fine-tuning, and a detailed discussion of accuracy, balanced accuracy, MCC, and F1 measures to guide practitioners. The work demonstrates that DL methods generally excel in raw accuracy but can be limited by imbalance, and offers practical guidelines and future directions for improving minority-class performance in EL image classification.

Abstract

Photovoltaics (PV) are widely used to harvest solar energy, an important form of renewable energy. Photovoltaic arrays consist of multiple solar panels constructed from solar cells. Solar cells in the field are vulnerable to various defects, and electroluminescence (EL) imaging provides effective and non-destructive diagnostics to detect those defects. We use multiple traditional machine learning and modern deep learning models to classify EL solar cell images into different functional/defective categories. Because of the asymmetry in the number of functional vs. defective cells, an imbalanced label problem arises in the EL image data. The current literature lacks insights on which methods and metrics to use for model training and prediction. In this paper, we comprehensively compare different machine learning and deep learning methods under different performance metrics on the classification of solar cell EL images from monocrystalline and polycrystalline modules. We provide a comprehensive discussion on different metrics. Our results provide insights and guidelines for practitioners in selecting prediction methods and performance metrics.
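To make the imbalance issue concrete, the following minimal sketch computes the four metrics named in the TL;DR from a confusion matrix. It is an illustration under stated assumptions, not the authors' code: the class counts (90/4/3/3) are hypothetical, the helper names `confusion_matrix` and `summarize` are invented for this example, the F1 score is macro-averaged (the paper's averaging choice may differ), and MCC uses the standard multiclass generalization.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def summarize(cm):
    """Accuracy, balanced accuracy, macro F1, and multiclass MCC."""
    cm = cm.astype(float)
    n = cm.sum()
    tp = np.diag(cm)
    true_counts = cm.sum(axis=1)  # samples per true class
    pred_counts = cm.sum(axis=0)  # samples per predicted class

    accuracy = tp.sum() / n
    # Per-class recall and precision; undefined ratios are set to 0.
    recall = np.divide(tp, true_counts, out=np.zeros_like(tp), where=true_counts > 0)
    precision = np.divide(tp, pred_counts, out=np.zeros_like(tp), where=pred_counts > 0)
    # Balanced accuracy: mean recall over classes that actually occur.
    balanced_accuracy = recall[true_counts > 0].mean()
    # Macro F1 (assumed averaging): unweighted mean of per-class F1.
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    macro_f1 = f1.mean()
    # Multiclass MCC; defined as 0 when the denominator vanishes.
    num = tp.sum() * n - true_counts @ pred_counts
    den = np.sqrt((n**2 - pred_counts @ pred_counts) * (n**2 - true_counts @ true_counts))
    mcc = num / den if den > 0 else 0.0
    return {
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
        "macro_f1": macro_f1,
        "mcc": mcc,
    }

# Hypothetical imbalanced test set with J = 4 classes (counts are made up).
y_true = np.array([0] * 90 + [1] * 4 + [2] * 3 + [3] * 3)
# A degenerate classifier that always predicts the majority class.
y_pred = np.zeros_like(y_true)

scores = summarize(confusion_matrix(y_true, y_pred, n_classes=4))
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

In this toy setting the majority-class predictor reaches an accuracy of 0.9 while its balanced accuracy is only 0.25 and its MCC is 0, which is why accuracy alone can be misleading when functional cells far outnumber defective ones.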

Paper Structure (20 sections, 16 equations, 13 figures, 2 tables)

Problem Description
The Problem
Related Literature and Contributions of This Work
Overview
Data Collection and Preparation
Data Collection
Feature Extraction and Data Augmentation
Analysis and Interpretation
Traditional Machine Learning Models
Logistic Regression
Support Vector Machine
Random Forest
Deep Learning Models
VGG Neural Network
Residual Neural Network
...and 5 more sections

Figures (13)

Figure 1: Examples of images with different severity of defectiveness in monocrystalline and polycrystalline PV cells. Here, the "D" in the label means "defective".
Figure 2: Flowchart that illustrates how the ML and DL methods are implemented in the performance study.
Figure 3: Bar plots of defectiveness counts for monocrystalline (a) and polycrystalline (b) cells. Here, the "D" in the $x$-axis label means "defective". One can see that a class imbalance problem exists in the EL image data for both polycrystalline and monocrystalline cells.
Figure 4: Visualization of the VGG-16 architecture for feature extraction, drawn by using the neural network visualization tool in Iqbal (2018).
Figure 5: Illustration of image augmentation.
...and 8 more figures
