Table of Contents
Fetching ...

AlphaChimp: Tracking and Behavior Recognition of Chimpanzees

Xiaoxuan Ma, Yutang Lin, Yuan Xu, Stephan P. Kaufhold, Jack Terwilliger, Andres Meza, Yixin Zhu, Federico Rossano, Yizhou Wang

TL;DR

AlphaChimp, an end-to-end approach that simultaneously detects chimpanzee positions and estimates behavior categories from videos, significantly outperforms existing methods in behavior recognition and achieves approximately 10% higher tracking accuracy and a 20% improvement in behavior recognition compared to state-of-the-art methods.

Abstract

Understanding non-human primate behavior is crucial for improving animal welfare, modeling social behavior, and gaining insights into both distinctly human and shared behaviors. Despite recent advances in computer vision, automated analysis of primate behavior remains challenging due to the complexity of their social interactions and the lack of specialized algorithms. Existing methods often struggle with the nuanced behaviors and frequent occlusions characteristic of primate social dynamics. This study aims to develop an effective method for automated detection, tracking, and recognition of chimpanzee behaviors in video footage. Here we show that our proposed method, AlphaChimp, an end-to-end approach that simultaneously detects chimpanzee positions and estimates behavior categories from videos, significantly outperforms existing methods in behavior recognition. AlphaChimp achieves approximately 10% higher tracking accuracy and a 20% improvement in behavior recognition compared to state-of-the-art methods, particularly excelling in the recognition of social behaviors. This superior performance stems from AlphaChimp's innovative architecture, which integrates temporal feature fusion with a Transformer-based self-attention mechanism, enabling more effective capture and interpretation of complex social interactions among chimpanzees. Our approach bridges the gap between computer vision and primatology, enhancing technical capabilities and deepening our understanding of primate communication and sociality. We release our code and models and hope this will facilitate future research in animal social dynamics. This work contributes to ethology, cognitive science, and artificial intelligence, offering new perspectives on social intelligence.

AlphaChimp: Tracking and Behavior Recognition of Chimpanzees

TL;DR

AlphaChimp, an end-to-end approach that simultaneously detects chimpanzee positions and estimates behavior categories from videos, significantly outperforms existing methods in behavior recognition and achieves approximately 10% higher tracking accuracy and a 20% improvement in behavior recognition compared to state-of-the-art methods.

Abstract

Understanding non-human primate behavior is crucial for improving animal welfare, modeling social behavior, and gaining insights into both distinctly human and shared behaviors. Despite recent advances in computer vision, automated analysis of primate behavior remains challenging due to the complexity of their social interactions and the lack of specialized algorithms. Existing methods often struggle with the nuanced behaviors and frequent occlusions characteristic of primate social dynamics. This study aims to develop an effective method for automated detection, tracking, and recognition of chimpanzee behaviors in video footage. Here we show that our proposed method, AlphaChimp, an end-to-end approach that simultaneously detects chimpanzee positions and estimates behavior categories from videos, significantly outperforms existing methods in behavior recognition. AlphaChimp achieves approximately 10% higher tracking accuracy and a 20% improvement in behavior recognition compared to state-of-the-art methods, particularly excelling in the recognition of social behaviors. This superior performance stems from AlphaChimp's innovative architecture, which integrates temporal feature fusion with a Transformer-based self-attention mechanism, enabling more effective capture and interpretation of complex social interactions among chimpanzees. Our approach bridges the gap between computer vision and primatology, enhancing technical capabilities and deepening our understanding of primate communication and sociality. We release our code and models and hope this will facilitate future research in animal social dynamics. This work contributes to ethology, cognitive science, and artificial intelligence, offering new perspectives on social intelligence.

Paper Structure

This paper contains 39 sections, 26 figures, 12 tables.

Figures (26)

  • Figure 1: Sample frames and annotations from the ChimpACT dataset. We present three video sequences where an infant chimpanzee, named Azibo, is focused. While we also annotate visibility for both the bounding box and the keypoint, these are omitted here for clarity.
  • Figure 2: (a) Kinship of the observed chimpanzee group. Rectangles and ellipses represent males and females, respectively, with arrows flowing from the parents to the child. A missing entry arrow indicates that one of the parents is unknown. Their vertical position relative to the time axis indicates the year of birth. (b) Ethogram with annotated behaviors.
  • Figure 3: Semi-naturalistic habitats at Leipzig Zoo. (a) The aerial view of Leipzig Zoo zoo_all, including both indoor and outdoor enclosure. (b) An example scene inside the indoor enclosure zoo_in. Photo used under CC BY-SA 4.0. (c) An example scene of the outdoor enclosure.
  • Figure 4: (a) Log-scale distribution of annotations per individual (b) Log-scale distribution of annotations per behavior.
  • Figure 5: Keypoint definitions for chimpanzee.
  • ...and 21 more figures