NDD20: A large-scale few-shot dolphin dataset for coarse and fine-grained categorisation

Cameron Trotter, Georgia Atkinson, Matt Sharpe, Kirsten Richardson, A. Stephen McGough, Nick Wright, Ben Burville, Per Berggren

TL;DR

NDD20 introduces a large-scale, open dolphin dataset with coarse and fine-grained instance annotations for both above- and below-water imagery, addressing a gap in conservation-focused computer vision resources. The dataset enables species- and individual-level photo-id benchmarking and includes anonymized, multi-annotated masks across 44 IDs (above water) and 82 IDs (below water). Baseline results using Mask R-CNN demonstrate competitive instance segmentation performance, validating the dataset's difficulty and realism. The work aims to accelerate conservation research by providing a real-world, field-collected benchmark and encouraging future exploration of fine-grained classification tasks in marine environments.

Abstract

We introduce the Northumberland Dolphin Dataset 2020 (NDD20), a challenging image dataset annotated for both coarse and fine-grained instance segmentation and categorisation. This dataset, the first release of the NDD, was created in response to the rapid expansion of computer vision into conservation research and the production of field-deployable systems suited to extreme environmental conditions -- an area with few open-source datasets. NDD20 contains a large collection of above- and below-water images of two different dolphin species for traditional coarse and fine-grained segmentation. All data contained in NDD20 was obtained via manual collection in the North Sea around the Northumberland coastline, UK. We present experimentation using a standard deep learning network architecture trained on NDD20 and report baseline results.

Paper Structure

This paper contains 8 sections and 6 figures.

Figures (6)

  • Figure 1: The number of above water images per ID class.
  • Figure 2: Example above water images. Both images contain one mask with the following attributes: Left - object:dolphin, species:WBD, id:11. Right - object:dolphin, species:BND, id:8.
  • Figure 3: The number of below water images per ID class.
  • Figure 4: Example below water images. Both images contain one mask with the following attributes: Left - object:dolphin, id:9, out of focus:false. Right - object:dolphin, id:30, out of focus:false.
  • Figure 5: Differing mAP@IOU values for the best performing instance segmentation model using the above water data.
  • ...and 1 more figure