Improved detection of discarded fish species through BoxAL active learning

Maria Sokolova, Pieter M. Blok, Angelo Mencarelli, Arjan Vroegop, Aloysius van Helmond, Gert Kootstra

TL;DR

This paper presents BoxAL, an active-learning framework that injects epistemic uncertainty estimation into Faster R-CNN via Monte-Carlo dropout to efficiently select informative images for annotating discarded fish on trawlers. By decomposing uncertainty into semantic, spatial, and occurrence components and aggregating them into image-level certainty, BoxAL reduces labeling needs while improving mean AP on a challenging, highly variable dataset. The approach achieves a final mAP of $39.0\pm1.6$ with certainty-based sampling, outperforming random sampling ($34.8\pm1.8$) and reaching comparable performance with $700$ labeled images, i.e., $400$ fewer than the random baseline. The work demonstrates practical gains for electronic monitoring of bycatch, offering a scalable, open-source solution to support the transition from landing obligations to documentation obligations in fisheries.

Abstract

In recent years, powerful data-driven deep-learning techniques have been developed and applied for automated catch registration. However, these methods depend on labelled data, which is time-consuming, labour-intensive and expensive to collect, and which requires expert knowledge. In this study, we present an active learning technique, named BoxAL, which includes estimation of the epistemic certainty of the Faster R-CNN object-detection model. The method selects the most uncertain images from an unlabelled pool, which are then used to train the object-detection model. To evaluate the method, we used an open-source image dataset obtained with a dedicated image-acquisition system developed for commercial trawlers targeting demersal species. We demonstrated that our approach reaches the same object-detection performance as random sampling with 400 fewer labelled images. In addition, the mean AP score was significantly higher at the last training iteration with 1100 training images: 39.0±1.6 for certainty-based sampling versus 34.8±1.8 for random sampling. We also showed that epistemic certainty is a suitable criterion for sampling images that the current iteration of the model cannot yet handle, and that the newly sampled data is more valuable for training than the remaining unlabelled data. Our software is available at https://github.com/pieterblok/boxal.
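
The selection step described above can be illustrated with a minimal Python sketch. It is not taken from the BoxAL repository. The function names are hypothetical, and two modelling choices are assumptions made only for illustration: the three certainty components of an object are combined by multiplication, and the image-level certainty is the minimum over the objects in the image (suggested by the $C_{min}$ notation in the Figure 5 caption). How images without detections are scored is also an assumption.

```python
def object_certainty(semantic, spatial, occurrence):
    """Combine the three certainty components of one detected object.

    Assumption: the components lie in [0, 1] and are multiplied.
    """
    return semantic * spatial * occurrence


def image_certainty(objects, empty_value=1.0):
    """Image-level certainty as the minimum over per-object certainties.

    `objects` is a list of (semantic, spatial, occurrence) tuples.
    `empty_value` is a hypothetical choice for images without detections.
    """
    if not objects:
        return empty_value
    return min(object_certainty(*obj) for obj in objects)


def select_least_certain(certainty_by_image, k):
    """Return the k image ids with the lowest certainty (most uncertain first)."""
    return sorted(certainty_by_image, key=certainty_by_image.get)[:k]


# Toy pool: image id -> per-object (semantic, spatial, occurrence) values.
pool = {
    "a.jpg": [(0.9, 0.9, 1.0)],
    "b.jpg": [(0.5, 0.8, 1.0), (0.9, 0.9, 0.9)],
    "c.jpg": [(0.99, 0.95, 1.0)],
}
certainties = {name: image_certainty(objs) for name, objs in pool.items()}
to_annotate = select_least_certain(certainties, k=2)
print(to_annotate)
```

In this toy pool, `b.jpg` contains one object the model is unsure about, so it is ranked first for annotation even though its other object is detected with high certainty.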

Paper Structure (15 sections, 8 equations, 6 figures, 1 table)

Figures (6)

  • Figure 1: Examples of images from the dataset https://doi.org/10.4121/16622566.v1
  • Figure 2: BoxAL architecture with the dropout layer included in the head of the model. During Monte-Carlo inference, the dropout layers are activated in the Box head ($p = 0.75$) and $n = 15$ forward passes are performed, resulting in $n$ predictions per object in the image. After all forward passes are complete, the predictions per object are used to calculate the semantic, spatial and occurrence certainties.
  • Figure 3: Iterative training procedure of BoxAL. $\mathcal{W}_0 \leftarrow \texttt{initNetwork}()$ (Faster R-CNN with ResNeXt-101 backbone pre-trained on ImageNet). $\textbf{T}_i$ is the set of images in the training set at iteration $i$; $\textbf{P}_i$ is the set of images in the unlabelled pool at iteration $i$; $\textbf{S}_i$ is the set of selected images; $\mathcal{W}_i$ are the weights of the detection network at iteration $i$; $E = E + 5$ is the epoch update, where the initial number of epochs was set to 5.
  • Figure 4: Mean performance of BoxAL active learning with minimum certainty sampling (green solid line) and of random sampling (pink solid line). The coloured areas around the lines represent the 95% confidence intervals around the means over five repetitions. The blue solid line is the performance of the Faster R-CNN model trained on the entire unlabelled pool (3005 images).
  • Figure 5: Relationship between F1 scores and certainty values for the training images sampled during ten iterations. Every point in the scatterplot corresponds to the calculated F1 score and certainty ($C_{min}$) of a sampled image. Horizontal error bars correspond to the standard deviation of the certainty; vertical error bars correspond to the standard deviation of the F1 score.
  • ...and 1 more figure
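
The iterative procedure in the Figure 3 caption can be sketched as a short Python loop. This is a hedged illustration rather than the authors' implementation: `run_active_learning`, `train_fn` and `certainty_fn` are hypothetical names, the stand-in callables do not train a real Faster R-CNN, and whether training continues from the previous weights or restarts at each iteration is an assumption left to `train_fn`. Only the bookkeeping of $\textbf{T}_i$, $\textbf{P}_i$, $\textbf{S}_i$ and the epoch update $E = E + 5$ (starting from 5) follows the caption.

```python
def run_active_learning(train_set, pool, sample_size, iterations,
                        train_fn, certainty_fn):
    """Skeleton of a certainty-based active-learning loop.

    train_fn(weights, images, epochs) -> new weights   (hypothetical callable)
    certainty_fn(weights, image) -> float              (hypothetical callable)
    """
    train_set, pool = list(train_set), list(pool)  # T_0 and P_0
    weights = None   # placeholder for W_0 <- initNetwork()
    epochs = 5       # initial number of epochs, as in the caption
    history = []     # (training-set size, epochs) per training round

    for i in range(iterations + 1):
        weights = train_fn(weights, train_set, epochs)  # W_i trained on T_i
        history.append((len(train_set), epochs))
        if i == iterations or not pool:
            break
        # S_i: the sample_size pool images with the lowest certainty.
        ranked = sorted(pool, key=lambda img: certainty_fn(weights, img))
        selected = ranked[:sample_size]
        chosen = set(selected)
        train_set.extend(selected)                          # T_{i+1} = T_i + S_i
        pool = [img for img in pool if img not in chosen]   # P_{i+1} = P_i - S_i
        epochs += 5                                         # E = E + 5
    return train_set, pool, history


# Toy run: integer "images", a counter as weights, certainty = id / 10.
final_train, final_pool, log = run_active_learning(
    train_set=[100, 101],
    pool=list(range(10)),
    sample_size=3,
    iterations=2,
    train_fn=lambda w, imgs, e: (w or 0) + 1,
    certainty_fn=lambda w, img: img / 10,
)
print(log)
```

The numbers in the toy run are illustrative values only and are not the settings used in the paper.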