Detection and Identification of Penguins Using Appearance and Motion Features

Kasumi Seko; Hiroki Kinoshita; Raj Rajeshwar Malinda; Hiroaki Kawashima

TL;DR

This study proposes a framework that enhances both detection and identification performance by integrating appearance and motion features in YOLO11, and introduces a tracklet-based contrastive learning approach applied after tracking.

Abstract

In animal facilities, continuous surveillance of penguins is essential yet technically challenging due to their homogeneous visual characteristics, rapid and frequent posture changes, and substantial environmental noise such as water reflections. In this study, we propose a framework that enhances both detection and identification performance by integrating appearance and motion features. For detection, we adapted YOLO11 to process consecutive frames to overcome the lack of temporal consistency in single-frame detectors. This approach leverages motion cues to detect targets even when distinct visual features are obscured. Our evaluation shows that fine-tuning the model with two-frame inputs improves mAP@0.5 from 0.922 to 0.933, outperforming the baseline, and successfully recovers individuals that are indistinguishable in static images. For identification, we introduce a tracklet-based contrastive learning approach applied after tracking. Through qualitative visualization, we demonstrate that the method produces coherent feature embeddings, bringing samples from the same individual closer in the feature space, suggesting the potential for mitigating ID switching.
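The two ideas in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation. The function names `stack_frames` and `tracklet_contrastive_loss` are hypothetical. Combining frames by channel concatenation is an assumption, as is using a supervised-contrastive-style loss with tracklet IDs as labels. The temperature value is also an assumption. The abstract does not specify any of these details.

```python
import numpy as np


def stack_frames(prev_frame, curr_frame, use_difference=False):
    """Build a two-frame detector input of shape (H, W, 6).

    Assumption: frames are combined by channel concatenation; the paper's
    actual input configuration may differ.
    """
    prev = prev_frame.astype(np.float32) / 255.0
    curr = curr_frame.astype(np.float32) / 255.0
    # Optionally replace the previous frame with the absolute
    # inter-frame difference, which highlights moving regions.
    second = np.abs(curr - prev) if use_difference else prev
    return np.concatenate([curr, second], axis=-1)


def tracklet_contrastive_loss(embeddings, tracklet_ids, temperature=0.1):
    """Contrastive loss that treats crops from the same tracklet as positives.

    Assumption: a supervised-contrastive-style formulation; hypothetical,
    not taken from the paper.
    """
    ids = np.asarray(tracklet_ids)
    n = len(ids)
    # Cosine similarity between L2-normalised embeddings.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = (z @ z.T) / temperature

    not_self = ~np.eye(n, dtype=bool)
    positives = (ids[:, None] == ids[None, :]) & not_self
    has_pos = positives.any(axis=1)
    if not has_pos.any():
        return 0.0  # no anchor has a positive partner

    # Log-softmax over all other samples (self-similarity is masked out).
    sim = np.where(not_self, sim, -np.inf)
    row_max = sim.max(axis=1, keepdims=True)
    log_norm = row_max + np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - log_norm

    # Average log-probability of the positives for each anchor.
    pos_sum = np.where(positives, log_prob, 0.0).sum(axis=1)
    per_anchor = -pos_sum[has_pos] / positives.sum(axis=1)[has_pos]
    return float(per_anchor.mean())
```

In this sketch the loss decreases as embeddings from the same tracklet move closer together and away from other tracklets, which is the behaviour the abstract describes qualitatively. A detector consuming the six-channel input would also need its first convolution adapted to six input channels, and that step is not shown here.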

Paper Structure (24 sections, 5 figures, 5 tables)

Introduction
Related work
Detection methods utilizing temporal information
Application to animal video analysis
Motion-aware penguin detection
Input configurations
Initialization of model parameters
Experiment settings
Evaluation with RGB image input
Effectiveness of initialization methods
Influence of reference period length
Evaluation with inter-frame difference input
Impact of initialization on difference inputs
Qualitative evaluation
Detection of targets with poor visual features
...and 9 more sections

Figures (5)

Figure 1: Architecture of the proposed detection method
Figure 2: Detection results during swimming. Individuals difficult to distinguish in still images (baseline, left) are detected by utilizing video information (proposed method, right).
Figure 3: Detection results in background regions unseen during training. Moving individuals were detected in both methods (top), whereas stationary individuals were not detected (bottom), illustrating the contribution of motion cues.
Figure 4: Visualization of feature embeddings using t-SNE
Figure 5: Grad-CAM visualization examples for IDs 15 and 21
