A method for detecting dead fish on large water surfaces based on improved YOLOv10

Qingbin Tian, Yukang Huo, Mingyuan Yao, Haihua Wang

TL;DR

The paper tackles real-time detection of dead fish on large water surfaces, addressing small-object and dynamic-water challenges. It proposes FN-YOLO, an improved, lightweight YOLOv10-based detector that uses a FasterNet backbone, CSPStage-based neck, and an additional small-object detection head to enhance multi-scale feature fusion. Ablation studies show substantial performance gains over baselines, with FN-YOLO achieving $P=95.7\%$, $R=94.5\%$, $AP_{50}=97.5\%$, and $AP_{50-95}=60.6\%$ while maintaining a compact $2.87$M parameter count and 36 FPS, indicating strong suitability for embedded deployment. The results suggest significant practical impact for rapid, scalable dead-fish monitoring in aquaculture, enabling timely cleanup and reducing environmental and health risks.

Abstract

Dead fish frequently appear on the water surface due to various factors. If not promptly detected and removed, they can cause significant problems such as water quality deterioration, ecosystem damage, and disease transmission. Consequently, it is imperative to develop rapid and effective detection methods to mitigate these problems. Conventional methods for detecting dead fish are often constrained by manpower and time, and struggle to cope with the complexity of aquatic environments. This paper proposes an end-to-end detection model built upon an enhanced YOLOv10 framework, designed specifically to detect dead fish quickly and precisely across extensive water surfaces. Key enhancements include: (1) replacing YOLOv10's backbone network with FasterNet to reduce model complexity while maintaining high detection accuracy; (2) improving feature fusion in the neck through enhanced connectivity and by replacing the original C2f module with CSPStage modules; (3) adding a small-object detection head to improve the detection of smaller objects. Experimental results demonstrate significant improvements in precision (P), recall (R), and average precision (AP) compared to the baseline model YOLOv10n. Furthermore, our model outperforms other models in the YOLO series by significantly reducing model size and parameter count, while sustaining high inference speed and achieving the best AP performance. The model facilitates rapid and accurate detection of dead fish in large-scale aquaculture systems. Finally, through ablation experiments, we systematically analyze and assess the contribution of each model component to the overall system performance.
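
As a reading aid for the metrics named in the abstract, the following is a minimal sketch of how precision, recall, and average precision are commonly computed for a single class at a fixed IoU threshold. It is not the authors' evaluation code. The function names, the all-point interpolation scheme, and the example detections are assumptions made for illustration; the paper's tooling may interpolate differently. Under the usual COCO-style convention, $AP_{50}$ uses an IoU threshold of 0.5 and $AP_{50-95}$ averages AP over thresholds from 0.50 to 0.95 in steps of 0.05.

```python
from typing import List, Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive,
    and false-negative counts at one IoU threshold."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def average_precision(scored_hits: List[Tuple[float, bool]],
                      num_ground_truth: int) -> float:
    """All-point interpolated AP for one class (an assumed scheme).

    scored_hits: one (confidence, is_true_positive) pair per detection,
                 where is_true_positive was decided at a fixed IoU threshold.
    num_ground_truth: number of annotated objects of this class.
    """
    if num_ground_truth == 0:
        return 0.0

    # Rank detections from most to least confident.
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)

    tp = fp = 0
    precisions, recalls = [], []
    for _, is_hit in ranked:
        if is_hit:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)

    # Replace each precision with the maximum precision to its right,
    # which makes the precision-recall curve non-increasing.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the area under the stepwise curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Hypothetical detections, not data from the paper.
example = [(0.9, True), (0.8, False), (0.7, True)]
example_ap = average_precision(example, num_ground_truth=2)
```

Averaging `average_precision` over several IoU thresholds, each with its own true-positive assignment, would give an $AP_{50-95}$-style figure under the same assumptions.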

Paper Structure

This paper contains 12 sections, 8 equations, 15 figures, and 3 tables.

Figures (15)

  • Figure 1: Experimental data collection system.
  • Figure 2: Structure diagram of YOLOv10.
  • Figure 3: Illustrations of different convolution operations. (a) Standard Convolution: A filter is applied across the entire input feature map to produce an output feature map. (b) Depthwise/Group Convolution: Depthwise convolution applies a single filter per input channel, and group convolution divides input channels into groups, applying separate filters within each group. (c) Partial Convolution: The convolution operation is applied only to the unmasked regions of the input, effectively reducing memory access times.
  • Figure 4: Visualization of feature maps in an intermediate layer of a pre-trained ResNet50, with the top-left image as the input. Qualitatively, we can see the high redundancies across different channels.
  • Figure 5: Structure of FasterNet.
  • ...and 10 more figures
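
The convolution variants listed under Figure 3 differ mainly in how much computation they spend per output position. The sketch below compares their multiply-accumulate counts for a square kernel with stride 1 and same-size output. It is an illustration under stated assumptions, not the paper's code: the function names and the example feature-map size are hypothetical, and partial convolution is modelled in the FasterNet sense, where a regular convolution is applied to a subset of `c // ratio` channels and the remaining channels pass through unchanged (a channel ratio of 4 is assumed here).

```python
def standard_conv_macs(h: int, w: int, k: int, c_in: int, c_out: int) -> int:
    """Multiply-accumulates of a k x k standard convolution on an h x w map."""
    return h * w * k * k * c_in * c_out


def group_conv_macs(h: int, w: int, k: int, c_in: int, c_out: int,
                    groups: int) -> int:
    """Each output channel only sees c_in // groups input channels."""
    return h * w * k * k * (c_in // groups) * c_out


def depthwise_conv_macs(h: int, w: int, k: int, c: int) -> int:
    """One k x k filter per channel (group convolution with groups == c)."""
    return h * w * k * k * c


def partial_conv_macs(h: int, w: int, k: int, c: int, ratio: int = 4) -> int:
    """Assumed FasterNet-style partial convolution: a standard convolution
    over only c // ratio channels; the other channels are left untouched."""
    c_p = c // ratio
    return standard_conv_macs(h, w, k, c_p, c_p)


# Hypothetical layer size chosen only for illustration.
H, W, K, C = 56, 56, 3, 64

standard = standard_conv_macs(H, W, K, C, C)
grouped = group_conv_macs(H, W, K, C, C, groups=C)
depthwise = depthwise_conv_macs(H, W, K, C)
partial = partial_conv_macs(H, W, K, C, ratio=4)

# With ratio 4, the convolved slice has 1/4 of the input channels and
# 1/4 of the output channels, so the cost drops by a factor of 16.
reduction = standard // partial
```

Under these assumptions the partial convolution costs 1/16 of the standard convolution, while the depthwise variant is cheaper still but mixes no information across channels, which is why it is usually followed by a pointwise convolution.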