ALSS-YOLO: An Adaptive Lightweight Channel Split and Shuffling Network for TIR Wildlife Detection in UAV Imagery

Ang He, Xiaobo Li, Ximei Wu, Chengyue Su, Jing Chen, Sheng Xu, Xiaobin Guo

TL;DR

ALSS-YOLO is an efficient, lightweight detector optimized for TIR aerial images. Its novel adaptive lightweight channel split and shuffling (ALSS) module improves the extraction of blurry features, which is crucial for handling jitter-induced blur and overlapping targets.

Abstract

Unmanned aerial vehicles (UAVs) equipped with thermal infrared (TIR) cameras play a crucial role in combating nocturnal wildlife poaching. However, TIR images often suffer from jitter and wildlife overlap, so UAVs must be able to identify blurred and overlapping small targets. Conventional lightweight networks deployed on UAVs struggle to extract features from blurry small targets. To address this issue, we developed ALSS-YOLO, an efficient and lightweight detector optimized for TIR aerial images. First, we propose a novel Adaptive Lightweight Channel Split and Shuffling (ALSS) module. This module employs an adaptive channel split strategy to optimize feature extraction and integrates a channel shuffling mechanism to enhance information exchange between channels. This improves the extraction of blurry features, which is crucial for handling jitter-induced blur and overlapping targets. Second, we developed a Lightweight Coordinate Attention (LCA) module that employs adaptive pooling and grouped convolution to integrate feature information across dimensions. This module keeps the network lightweight while maintaining high detection precision and robustness against jitter and target overlap. Additionally, we developed a single-channel focus module that aggregates the width and height information of each channel into four-dimensional channel fusion, which improves the feature representation efficiency of infrared images. Finally, we modify the localization loss function to emphasize the loss value associated with small objects, improving localization accuracy. Extensive experiments on the BIRDSAI and ISOD TIR UAV wildlife datasets show that ALSS-YOLO achieves state-of-the-art performance. Our code is openly available at https://github.com/helloworlder8/computer_vision.
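
The channel split, channel shuffle, and single-channel focus ideas mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the fixed split ratio `alpha`, the group count, and the pixel-sampling order are all assumptions made for illustration. In particular, the focus function follows the common YOLOv5-style slicing, which is only one plausible reading of "four-dimensional channel fusion"; the abstract does not specify how ALSS-YOLO chooses the split or arranges the four channels.

```python
import numpy as np


def split_channels(x, alpha=0.5):
    """Split a (C, H, W) feature map into two branches along the channel axis.

    alpha is a hypothetical split ratio (assumption); each branch keeps
    at least one channel.
    """
    c = x.shape[0]
    c_a = max(1, min(c - 1, int(round(c * alpha))))
    return x[:c_a], x[c_a:]


def channel_shuffle(x, groups=2):
    """Interleave channels across groups (ShuffleNet-style channel shuffle)."""
    c, h, w = x.shape
    assert c % groups == 0, "channel count must be divisible by groups"
    x = x.reshape(groups, c // groups, h, w)  # (groups, C/groups, H, W)
    x = x.transpose(1, 0, 2, 3)               # swap group and channel axes
    return x.reshape(c, h, w)


def single_channel_focus(img):
    """Rearrange one (H, W) channel into (4, H/2, W/2) by sampling every
    second pixel (space-to-depth). The sampling order is an assumption."""
    h, w = img.shape
    assert h % 2 == 0 and w % 2 == 0, "height and width must be even"
    return np.stack([img[0::2, 0::2], img[1::2, 0::2],
                     img[0::2, 1::2], img[1::2, 1::2]])


# Usage: split an 8-channel map, merge the branches again, then shuffle.
features = np.arange(8 * 2 * 2, dtype=float).reshape(8, 2, 2)
branch_a, branch_b = split_channels(features, alpha=0.25)
merged = np.concatenate([branch_a, branch_b], axis=0)
shuffled = channel_shuffle(merged, groups=2)

# Usage: a 4x4 single-channel image becomes four 2x2 channels.
focused = single_channel_focus(np.arange(16, dtype=float).reshape(4, 4))
```

In a real network each branch would pass through its own layers between the split and the merge; the sketch only shows how the tensor is divided and how the shuffle mixes channels from both halves so that later layers see information from both branches.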

Paper Structure (20 sections, 34 equations, 19 figures, 9 tables)

Figures (19)

  • Figure 1: Examples of FIR wildlife images from the BIRDSAI dataset that are blurred by noise or weather conditions. (a) Blurred by noise. (b) Blurred by weather conditions.
  • Figure 2: The architecture of the ALSS-YOLO detector. CBS denotes Convolution, Batch Normalization, and SiLU activation function. The symbol “k” represents the Kernel size, “s” denotes the Stride, and “p” indicates the Padding.
  • Figure 3: ALSS module structure diagram. Part A contains a convolutional layer or an identity connection, and part B is a bottleneck structure with depthwise convolution. All convolutional layers have a stride of 1, so the input and output feature maps have the same resolution.
  • Figure 4: ALSS module structure diagram (downsampling variant). Part A contains one of three options: max pooling with stride 2, max pooling with stride 2 followed by a convolution with stride 1, or a convolution with stride 2. Part B is a bottleneck structure with depthwise convolution whose first convolution has a stride of 2. The width and height of the output feature map are half those of the input feature map, so the module downsamples its input.
  • Figure 5: Schematic diagram of the structure of the CA and LCA modules: (a) CA module. (b) LCA module.
  • ...and 14 more figures
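
The downsampling described for Figure 4 can be checked with a small NumPy sketch of stride-2 max pooling. The 2x2 kernel size and the function name are assumptions; the caption gives only the stride.

```python
import numpy as np


def max_pool_stride2(x):
    """2x2 max pooling with stride 2 on a (C, H, W) array.

    The kernel size of 2 is an assumption; H and W must be even.
    """
    c, h, w = x.shape
    assert h % 2 == 0 and w % 2 == 0, "height and width must be even"
    # Group pixels into non-overlapping 2x2 windows, then take the max of each.
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))


feature_map = np.arange(16).reshape(1, 4, 4)
pooled = max_pool_stride2(feature_map)  # shape (1, 2, 2): width and height halved
```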