PanAf20K: A Large Video Dataset for Wild Ape Detection and Behaviour Recognition

Otto Brookes, Majid Mirmehdi, Colleen Stephens, Samuel Angedakin, Katherine Corogenes, Dervla Dowd, Paula Dieguez, Thurston C. Hicks, Sorrel Jones, Kevin Lee, Vera Leinert, Juan Lapuente, Maureen S. McCarthy, Amelia Meier, Mizuki Murai, Emmanuelle Normand, Virginie Vergnes, Erin G. Wessling, Roman M. Wittig, Kevin Langergraber, Nuria Maldonado, Xinyu Yang, Klaus Zuberbuhler, Christophe Boesch, Mimi Arandjelovic, Hjalmar Kuhl, Tilo Burghardt

TL;DR

PanAf20K is the largest open-access video collection of wild great apes, comprising over 7 million frames across ~20,000 camera-trap videos from 14 field sites, and is designed to support ape detection and behaviour recognition in natural habitats. The release pairs PanAf20K (multi-label behaviour annotations) with PanAf500 (fine-grained, frame-level annotations including full-body location and intra-video IDs), and benchmarks detection and multi-label recognition across a range of state-of-the-art architectures. The benchmark results show the value of in-domain pretraining and highlight the difficulty of tail classes: long-tail strategies improve recognition of rare behaviours but still leave a substantial gap on the least frequent ones. Overall, PanAf20K provides a scalable, ecologically valid platform for AI-enabled conservation analytics, supporting population assessment and behavioural studies in great apes.

Abstract

We present the PanAf20K dataset, the largest and most diverse open-access annotated video dataset of great apes in their natural environment. It comprises more than 7 million frames across ~20,000 camera trap videos of chimpanzees and gorillas collected at 14 field sites in tropical Africa as part of the Pan African Programme: The Cultured Chimpanzee. The footage is accompanied by a rich set of annotations and benchmarks, making it suitable for training and testing a variety of challenging and ecologically important computer vision tasks, including ape detection and behaviour recognition. Furthering AI analysis of camera trap information is critical given that the International Union for Conservation of Nature now lists all species in the great ape family as either Endangered or Critically Endangered. We hope the dataset can form a solid basis for engagement of the AI community to improve performance, efficiency, and result interpretation in order to support assessments of great ape presence, abundance, distribution, and behaviour, and thereby aid conservation efforts.
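
To make the multi-label behaviour recognition task and its tail-class difficulty more concrete, the following is a minimal sketch of per-class average precision on toy data. The metric choice is an assumption (this summary does not state which metric the benchmark uses), the scores and labels are invented for illustration, and the function names `average_precision` and `per_class_ap` are hypothetical. Only the class names are taken from the behaviour categories listed in the figure captions.

```python
from typing import Dict, List, Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Average precision for one class.

    Videos are ranked by descending score; precision is averaged over
    the ranks at which a positive (label == 1) video appears.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    # A class with no positives has undefined AP; report 0.0 here (assumption).
    return precision_sum / hits if hits else 0.0


def per_class_ap(
    scores: List[List[float]],
    labels: List[List[int]],
    class_names: Sequence[str],
) -> Dict[str, float]:
    """Compute AP independently per class, as is usual for multi-label tasks."""
    result = {}
    for c, name in enumerate(class_names):
        class_scores = [row[c] for row in scores]
        class_labels = [row[c] for row in labels]
        result[name] = average_precision(class_scores, class_labels)
    return result


# Toy data (invented): 4 videos, 3 behaviour classes, one row per video.
# A video may carry several labels at once, hence "multi-label".
class_names = ["feeding", "travel", "tool use"]
labels = [
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 1],  # the only "tool use" example: a tail class in this toy set
    [0, 1, 0],
]
scores = [
    [0.9, 0.8, 0.10],
    [0.2, 0.7, 0.30],
    [0.6, 0.4, 0.20],
    [0.1, 0.9, 0.05],
]

ap = per_class_ap(scores, labels, class_names)
mean_ap = sum(ap.values()) / len(ap)
print(ap, mean_ap)
```

Reporting the per-class values next to the mean matters here: in this toy set the rare class scores well below the frequent ones, and a single averaged number would hide that gap.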

Paper Structure

This paper contains 10 sections, 15 figures, and 4 tables.

Figures (15)

  • Figure 1: PanAf20K Visual Overview. We present the largest and most diverse open-access video dataset of great apes in the wild. It comprises ~20,000 videos and more than 7 million frames extracted from camera traps at 14 study sites spanning 6 African countries. Shown are 25 representative still frames from the dataset highlighting its diversity with respect to many important aspects such as behavioural activities, species, number of apes, habitat, day/night recordings, scene lighting, and more.
  • Figure 2: Manually annotated full-body location, species and behavioural action labels. Sample frames extracted from PanAf20K videos with species (row 1) and behavioural action annotations (row 2) displayed. Green bounding boxes indicate the full-body location of an ape. Species and behavioural action annotations are shown in the corresponding text.
  • Figure 3: Number of Apes & Bounding Box Size Distribution in the PanAf500 Data. The top row shows the distribution of apes across frames and videos in (a) and (b), respectively, while the distribution of bounding box sizes is shown in (c). The middle row shows still frame examples of videos containing one, two, four and eight apes (viewing from left to right). The bottom row demonstrates still frames with bounding boxes of various sizes; the colour of bounding box and associated number represent the intra-video individual IDs.
  • Figure 4: Behavioural Actions in the PanAf500 Data. Examples of each one of the nine behavioural action classes (right) and their distribution (left) across 500 videos. The total number of per-frame annotations for each behavioural action class is shown on top of the corresponding bar.
  • Figure 5: PanAf20K Behaviour Examples. Triplets of example frames for six categories (i.e., feeding, travel, camera reaction, social interaction, chimp carrying and tool use) in the PanAf20K dataset are shown. Note that camera reaction, social interaction and chimp carrying have been abbreviated to reaction, social and carrying, respectively.
  • ...and 10 more figures
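
The statistics described in the Figure 3 caption (number of apes per frame and bounding box size distribution) can be sketched from frame-level annotations as follows. This is only an illustration under stated assumptions: the record layout (`video`, `frame`, `ape_id`, `bbox` as `(x, y, width, height)` in pixels), the frame resolution, and the function names are all hypothetical and do not reflect the dataset's actual annotation format.

```python
from collections import Counter
from typing import Any, Dict, List

# Hypothetical annotation records: one entry per ape per annotated frame.
# "ape_id" stands in for an intra-video individual ID.
annotations: List[Dict[str, Any]] = [
    {"video": "v1", "frame": 0, "ape_id": 0, "bbox": (10, 20, 100, 200)},
    {"video": "v1", "frame": 0, "ape_id": 1, "bbox": (300, 40, 50, 80)},
    {"video": "v1", "frame": 1, "ape_id": 0, "bbox": (12, 22, 100, 200)},
    {"video": "v2", "frame": 0, "ape_id": 0, "bbox": (0, 0, 400, 200)},
]


def apes_per_frame(records: List[Dict[str, Any]]) -> Counter:
    """Map 'number of apes in a frame' -> 'number of frames with that count'.

    Frames without any annotation do not appear in the records, so a
    count of zero apes is never reported by this sketch.
    """
    boxes_in_frame = Counter((r["video"], r["frame"]) for r in records)
    return Counter(boxes_in_frame.values())


def bbox_area_fractions(
    records: List[Dict[str, Any]], frame_width: int, frame_height: int
) -> List[float]:
    """Bounding box area as a fraction of the frame area, one value per box."""
    frame_area = frame_width * frame_height
    return [(r["bbox"][2] * r["bbox"][3]) / frame_area for r in records]


count_histogram = apes_per_frame(annotations)
# Assumed frame resolution of 400x400 pixels, chosen only for this toy example.
area_fractions = bbox_area_fractions(annotations, frame_width=400, frame_height=400)
print(dict(count_histogram), area_fractions)
```

Normalising box area by frame area keeps the size distribution comparable across cameras with different resolutions, which is one plausible reason to report relative rather than absolute box sizes.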