Deep learning-based method for defect detection in electroluminescent images of polycrystalline silicon solar cells

Yuqi Liu; Yiquan Wu; YuBin Yuan; Langyue Zhao

doi:10.1364/OE.517341

1. Introduction

In recent years, growing energy demand and the widespread consumption of non-renewable energy sources such as petroleum have led to escalating environmental pollution. Solar photovoltaic (PV) technology [1], known for its low production cost, lack of pollution, and renewable nature, has been widely adopted, alleviating the energy crisis in human society [2]. As exports of PV components continue to grow, the production of solar cells is rising. During manufacturing, solar cells may develop defects from various causes, including scratches, damage, stains, and grid breaks, among others [3,4]. Such defects significantly reduce the stability of PV power generation systems, so defect detection is a crucial step in ensuring the performance and quality of solar cells. Automated approaches based on deep learning therefore play a vital role in detecting the various defects in solar cells and in guaranteeing the safe and efficient operation of PV power stations [5].

Solar cells are typically categorized as monocrystalline silicon or polycrystalline silicon according to the manufacturing material. Monocrystalline silicon solar cells exhibit uniform background textures, while polycrystalline silicon solar cells feature numerous randomly shaped and sized crystalline particles on the surface. Defect detection in solar cells primarily involves predicting the class and location of multi-scale defects in electroluminescent (EL) images. A detection network not only classifies defects but also determines their positions and sizes through bounding boxes, enabling precise identification and localization of defects on solar cell surfaces. Based on network structure, deep learning-based solar cell defect detection networks can be broadly classified into two categories: two-stage networks, represented by Faster R-CNN [6], and one-stage networks, represented by SSD [7] and YOLO [8]. Two-stage networks predict defect positions and categories from generated region proposals, but their real-time deployment efficiency is relatively low in resource-constrained scenarios. One-stage networks classify and locate defects directly from extracted features, which reduces overall computational demands and makes them easier to deploy on edge devices and embedded systems in resource-constrained environments [9]. Among one-stage models, the YOLO family offers superior overall performance and is particularly suitable for applications that require real-time operation, efficiency, and accuracy, making it a widely used framework in object detection [10].

Among the existing YOLO series models, YOLOv5 can be deployed efficiently on resource-constrained embedded devices and edge computing platforms. Its several model variants further improve its adaptability and provide greater flexibility for industrial applications. Owing to its real-time processing capability, ease of deployment, and balance between speed and detection accuracy, YOLOv5 has become the most widely used model in industrial defect detection tasks [11]. Solar cell defects exhibit linear features and spatial continuity, and the proportion of the EL image they occupy is non-uniform. This uneven distribution produces targets of varying scales, some of which occupy only a small fraction of the image. YOLO series models still struggle to achieve high accuracy in multi-scale and small-object detection [12]. Therefore, this paper addresses the detection of minute and multi-scale defects in polycrystalline silicon solar cells and proposes a model named ASDD-Net.

The primary contributions of this paper can be summarized as follows:

1. In order to effectively address subtle defects distributed on solar cells, this paper introduces the Space to Depth (SPD) module in the feature extraction section. This module performs downsampling by altering the tensor dimensions and introduces new elements in the channel dimension to fuse features at different scales. It significantly enlarges the receptive field for small target defects, alleviating the issue of potential loss of crucial feature information.
2. To enhance the network's perception of defects with different sizes and shapes, and simultaneously improve feature fusion to emphasize defect areas, this paper designs the EC2f and HAC3 feature fusion modules. The EC2f module, incorporating multi-level feature fusion and multi-head self-attention mechanisms, better integrates information from different spatial and channel dimensions, thereby enhancing the model's global perception capabilities. To improve feature expression and distinctiveness, the HAC3 module introduces ECA-Net and SimAM, further enhancing the detection of multi-scale defects.
3. To improve the output detection heads so that they focus on the intended target objects, the ASDD-Net model introduces the MobileViT_CA module before the second detection head. Its global context modeling and channel attention enhancement improve the performance of the detection heads while preserving the importance of local features.
4. This paper compares the detection performance with other frameworks in the same series and different detection frameworks, demonstrating the superior performance of the proposed method. Furthermore, the ASDD-Net model's generalization capability is verified on two additional datasets.
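
The space-to-depth rearrangement named in the first contribution can be illustrated with a minimal sketch. It assumes a scale factor of 2 and a single-channel feature map stored as nested Python lists. The function name `space_to_depth` is hypothetical, and the sketch shows only the tensor rearrangement, not the full SPD module as integrated into ASDD-Net.

```python
def space_to_depth(feature_map, scale=2):
    """Rearrange an H x W map into scale*scale sub-maps of size (H/scale) x (W/scale).

    Every input value is kept: spatial resolution drops, channel count grows.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")

    channels = []
    for row_offset in range(scale):
        for col_offset in range(scale):
            # Take every `scale`-th pixel, starting at (row_offset, col_offset).
            sub_map = [row[col_offset::scale] for row in feature_map[row_offset::scale]]
            channels.append(sub_map)
    return channels


# Illustrative 4 x 4 input (values are arbitrary, not taken from the paper).
x = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
y = space_to_depth(x)
print(len(y), len(y[0]), len(y[0][0]))  # 4 channels, each 2 x 2
```

Unlike strided convolution or pooling, this rearrangement discards no input values; it only moves them from the spatial dimensions into the channel dimension, which is consistent with the stated goal of limiting the loss of feature information for small defects.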

The remainder of this paper is organized as follows: Section 2 introduces related work on solar cell defect detection, Section 3 presents the ASDD-Net model structure, Section 4 validates the effectiveness of ASDD-Net through ablation, comparative, and generalization experiments, and Section 5 summarizes the work and outlines future research directions.

2. Related work

Deep learning models trained on extensive datasets have gained widespread popularity, and representative convolutional neural networks are widely employed in object detection, image classification, and semantic segmentation. Surface defect detection on solar cells is increasingly being accomplished through deep learning, which has significant implications for intelligent manufacturing. However, these models are typically designed for natural scene images, and applying them directly to surface defect detection in solar cell EL images presents several challenges. Consequently, researchers need to adopt task-specific strategies, such as data processing, feature engineering, and innovative neural network architectures, to better address the complexity and uniqueness of solar cell defect detection. According to the type of industrial inspection task, deep learning-based defect detection algorithms can be categorized into three types: classification networks, detection networks, and segmentation networks.

Deitsch et al. [13] proposed two methods for the automatic detection of defects in PV cells, one based on Convolutional Neural Networks (CNN) and one on Support Vector Machines (SVM). Experimental results showed that the CNN classifier achieved high accuracy in defect detection. Pierdicca [14] employed transfer learning with the VGG-16 network to classify remote sensing images of solar cells. However, owing to the low image resolution of the self-constructed photovoltaic module electroluminescence image dataset, the accuracy of the CNN was only about 70%. Tang et al. [15] designed a CNN-based automatic classification model for EL image defects. They fed deep features extracted by the CNN into fully connected layers to classify images into four defect classes. However, the model only determines whether defects are present and cannot identify the specific defect locations or types. Sridhar et al. [16] applied data augmentation to existing PV images captured by unmanned aerial vehicles to expand the dataset. They employed a CNN model to classify samples into five fault types and a defect-free category, achieving a notably high level of accuracy. Korkmaz et al. [17] modified a pre-trained architecture to design a novel multi-scale model for detecting various defects in solar panels. This approach exhibited high robustness and classification performance.

Su et al. [18] proposed a novel object detector for photovoltaic cell defect detection, incorporating a purpose-designed bidirectional feature pyramid into the model. This allowed for the effective recognition and detection of hidden cracks, grid breaks, and black spot defects. However, the feature balance factor in the algorithm still requires manual adjustment. To detect hidden cracks and grid break defects within polycrystalline solar panels, Zhang et al. [19] designed a multi-feature region proposal fusion network structure. This network extracts region proposals from different feature layers of convolutional neural networks, but the model has high computational costs and long detection times. Xu et al. [20] introduced a new spatial pyramid pooling operation and channel attention into the YOLOv5 model to locate cracks and fragment defects in EL images. Chen et al. [21] designed a novel defect object detector that embeds a dual-channel feature pyramid into YOLOv5, significantly improving the model's ability to recognize small target defects. However, the model can detect only a limited number of defect types on solar cells. Balcıoğlu et al. [22] designed a visual defect detection model based on a new Deep Convolutional Neural Network. In the first stage, it detects solar cell samples containing defects and ranks them based on their level of damage. In the second stage, the selected samples are classified, effectively improving the detection performance for small-area defects in complex backgrounds. However, due to cost considerations, the resolution of images in their dataset is relatively low.

Han et al. [23] employed a two-stage approach to segment defects on multicrystalline silicon wafers. The first stage used a region generation network to produce images of potential defect regions. In the second stage, image blocks containing potential defects were resized appropriately and input into an improved U-Net for the segmentation of scratch and black spot defects. Pratt et al. [24] used a semantic segmentation model based on the U-Net architecture to identify and extract internal hidden crack defects and several other defects simultaneously in monocrystalline and polycrystalline photovoltaic modules. However, the subjectivity and ambiguity of manual annotation introduced a significant amount of noise and uncertainty into the labels of the solar cell images. Rahman et al. [25] proposed a multi-attention network that incorporated channel attention to extract contextual information and spatial attention to effectively suppress background noise. They combined these two attention mechanisms and integrated them into the U-Net network for the segmentation of small target defects. However, the dataset used in this study was obtained using photoluminescence (PL) imaging, resulting in relatively low-resolution images. Sohail et al. [26] combined four models (U-Net, attention U-Net, FPN, and LinkNet) through ensemble learning, which significantly improved the segmentation of deep crack and micro-crack defects.

Fig. 1 shows EL images of a monocrystalline solar module (a), a polycrystalline solar module (b), and an individual polycrystalline solar cell (c). The monocrystalline silicon solar cell exhibits a uniform background texture, whereas the polycrystalline silicon solar cell surface contains numerous randomly shaped and sized crystalline particles. The focus of defect detection in this study is the polycrystalline silicon solar cell. In Fig. 1(c), regions with lower grayscale values in the EL image of the polycrystalline silicon solar cell correspond to randomly distributed crystalline grids, forming a complex textured background in the EL image. The red-bordered area indicates a crack defect, and the blue-bordered area marks a background region that resembles a crack defect; the two exhibit similar features.

Fig. 1. EL images of PV modules. (a) Images of monocrystalline silicon solar module. (b) Images of polycrystalline silicon solar module. (c) Images of polycrystalline silicon solar cells containing micro-crack defects.

Name	Formula	Description
Precision	$Precision = \frac{TP}{TP + FP}$	Fraction of the samples predicted as defective in the test set that are actually defective
Recall	$Recall = \frac{TP}{TP + FN}$	Fraction of all truly defective samples in the test set that are detected
F1-score	$F1\text{-}score = \frac{2 \times Precision \times Recall}{Precision + Recall}$	Harmonic mean of precision and recall
AP	$AP = \int_{0}^{1} P \, dR$	Area under the P-R curve for one defect category
mAP	$mAP = \frac{\sum_{i = 1}^{N} AP_{i}}{N}$	Mean of the AP values over all N defect categories
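
The metrics in the table above can be computed as in the following sketch. The function names and the trapezoidal approximation of the P-R integral are assumptions made for illustration; the paper does not state here which interpolation scheme it uses for AP. The final check uses the precision and recall reported for the proposed model in this paper's comparison table.

```python
def precision(tp, fp):
    """TP / (TP + FP)."""
    return tp / (tp + fp)


def recall(tp, fn):
    """TP / (TP + FN)."""
    return tp / (tp + fn)


def f1_score(p, r):
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r)


def average_precision(recalls, precisions):
    """Trapezoidal area under a P-R curve (assumed scheme).

    Points must be sorted by increasing recall.
    """
    area = 0.0
    for i in range(1, len(recalls)):
        width = recalls[i] - recalls[i - 1]
        area += width * (precisions[i] + precisions[i - 1]) / 2
    return area


def mean_average_precision(ap_values):
    """Mean of the per-category AP values."""
    return sum(ap_values) / len(ap_values)


# Reported precision and recall of the proposed model (92.37%, 83.81%).
f1 = f1_score(0.9237, 0.8381)
print(f"{f1:.2%}")  # 87.88%
```

The result agrees with the 87.88% F1-score listed for the full model in the ablation table. The same function applied to the YOLOv5s baseline values (91.30%, 72.83%) gives 81.03%, which also matches the baseline row.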

Baseline	SPD	EC2f	HAC3	MobileViT_CA	mAP50	F1-score
√					83.88%	81.03%
√	√				85.84%	83.94%
√	√	√			86.36%	85.65%
√	√	√	√		87.01%	85.24%
√	√	√	√	√	88.81%	87.88%
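
The contribution of each module can be read from the ablation table as the difference between consecutive rows. The short sketch below does that arithmetic; the row labels such as "+ SPD" and the helper name `stepwise_gains` are introduced here for illustration only, and the numbers are copied from the table.

```python
# (configuration, mAP50 %, F1-score %) copied from the ablation table.
ablation = [
    ("Baseline", 83.88, 81.03),
    ("+ SPD", 85.84, 83.94),
    ("+ EC2f", 86.36, 85.65),
    ("+ HAC3", 87.01, 85.24),
    ("+ MobileViT_CA", 88.81, 87.88),
]


def stepwise_gains(rows):
    """Return (name, delta_mAP50, delta_F1) of each row relative to the previous row."""
    gains = []
    for prev, curr in zip(rows, rows[1:]):
        gains.append((curr[0], round(curr[1] - prev[1], 2), round(curr[2] - prev[2], 2)))
    return gains


gains = stepwise_gains(ablation)
for name, d_map, d_f1 in gains:
    print(f"{name:<16} mAP50 {d_map:+.2f}  F1 {d_f1:+.2f}")

total_map_gain = round(ablation[-1][1] - ablation[0][1], 2)
print(f"Total mAP50 gain: {total_map_gain:+.2f} points")
```

Every step raises mAP50, for a total of 4.93 percentage points over the baseline, while the F1-score dips by 0.41 points when HAC3 is added and recovers once MobileViT_CA is included.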

Head	Feature size	Precision	Recall	mAP50	FPS
1	80 × 80 × 128	93.90%	80.42%	88.84%	54
2	40 × 40 × 256	92.37%	83.81%	88.81%	69
3	20 × 20 × 512	91.97%	83.51%	86.94%	63

Network types	Method	Backbone	Precision	Recall	mAP50
Other	FoveaBox	ResNet50	61.34%	62.33%	64.14%
	Sparse RCNN	ResNet50	87.82%	71.23%	73.31%
	Faster R-CNN	ResNet50	61.74%	61.42%	64.29%
	Cascade R-CNN	ResNet50	58.61%	57.12%	58.12%
YOLO	YOLOv5s	CSPDarknet53	91.30%	72.83%	83.88%
	YOLOv6s	EfficientRep	86.36%	82.12%	82.46%
	YOLOv7	CSPDarknet53	79.23%	61.00%	63.40%
	YOLOv8	CSPDarknet53	81.42%	75.13%	76.97%
	YOLOXs	EfficientRep	91.81%	73.74%	84.99%
	Ours	SEDF-Net	92.37%	83.81%	88.81%

Network types	Method	Backbone	Precision	Recall	mAP50
Other	FoveaBox	ResNet50	57.84%	56.97%	57.72%
	Sparse RCNN	ResNet50	69.31%	70.60%	72.44%
	Faster R-CNN	ResNet50	71.38%	70.66%	76.31%
	Cascade R-CNN	ResNet50	72.14%	71.19%	76.48%
YOLO	YOLOv5s	CSPDarknet53	72.67%	73.52%	76.53%
	YOLOv6s	EfficientRep	73.68%	72.54%	77.27%
	YOLOv7	CSPDarknet53	68.62%	68.74%	73.96%
	YOLOv8s	CSPDarknet53	73.20%	73.27%	77.71%
	YOLOXs	EfficientRep	72.55%	73.56%	77.24%
	Ours	SEDF-Net	73.99%	73.59%	78.26%

Optics Express