Fourier-based Action Recognition for Wildlife Behavior Quantification with Event Cameras
Friedhelm Hamann, Suman Ghosh, Ignacio Juarez Martinez, Tom Hart, Alex Kacelnik, Guillermo Gallego
TL;DR
This work tackles robust action recognition for oscillatory wildlife behavior using event cameras. It proposes a lightweight Fourier-based pipeline that summarizes event data into a signed event rate and analyzes the Fourier spectrum of that signal with an energy-band classifier and two full-spectrum neural classifiers. The approach achieves competitive results (F1 around 0.54–0.56) with orders of magnitude fewer parameters than a 2D CNN, and it can reach higher accuracy on selected regions of interest (ROIs). These properties make it interpretable and well suited to online, low-power deployment. The study also demonstrates robustness under challenging environmental conditions and discusses extensions to other oscillatory phenomena and to vibration monitoring.
Abstract
Event cameras are novel bio-inspired vision sensors that measure pixel-wise brightness changes asynchronously instead of capturing images at a fixed frame rate. They offer promising advantages, namely a high dynamic range, low latency, and minimal motion blur. Modern computer vision algorithms often rely on artificial neural networks, which require image-like representations of the data and cannot fully exploit the characteristics of event data. We propose approaches to action recognition based on the Fourier Transform. The approaches are intended to recognize oscillating motion patterns commonly present in nature. In particular, we apply our approaches to a recent dataset of breeding penguins annotated for "ecstatic display", a behavior in which the observed penguins flap their wings at a certain frequency. We find that our approaches are both simple and effective: they produce results only slightly below those of a deep neural network (DNN) while using five orders of magnitude fewer parameters. They work well despite the uncontrolled, diverse data present in the dataset. We hope this work opens a new perspective on event-based processing and action recognition.
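
To make the idea concrete, the following is a minimal sketch of a Fourier-based detector in the spirit of the pipeline described above: events are summed by polarity into a signed rate, the rate is transformed with a discrete Fourier transform, and the fraction of spectral energy inside a frequency band is compared against a threshold. It is not the authors' implementation. The function names, the 20 ms bin width, the 2–6 Hz band, and the 0.5 threshold are all illustrative assumptions, and the events are synthetic.

```python
import cmath
import math


def signed_event_rate(events, t_start, t_end, bin_s):
    """Sum event polarities (+1 / -1) into fixed-width time bins."""
    n_bins = int(round((t_end - t_start) / bin_s))
    rate = [0.0] * n_bins
    for t, polarity in events:
        idx = int((t - t_start) / bin_s)
        if 0 <= idx < n_bins:
            rate[idx] += polarity
    return rate


def power_spectrum(signal, bin_s):
    """One-sided power spectrum of the mean-removed signal (naive DFT)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        freqs.append(k / (n * bin_s))  # frequency of DFT bin k in Hz
        power.append(abs(coeff) ** 2)
    return freqs, power


def band_energy_ratio(freqs, power, f_lo, f_hi):
    """Fraction of total spectral energy that lies in [f_lo, f_hi]."""
    total = sum(power)
    if total == 0.0:
        return 0.0
    in_band = sum(p for f, p in zip(freqs, power) if f_lo <= f <= f_hi)
    return in_band / total


def is_oscillating(events, t_start, t_end, bin_s=0.02,
                   band=(2.0, 6.0), threshold=0.5):
    """Hypothetical energy-band rule; band and threshold are placeholders."""
    rate = signed_event_rate(events, t_start, t_end, bin_s)
    freqs, power = power_spectrum(rate, bin_s)
    return band_energy_ratio(freqs, power, *band) > threshold


def synthetic_events(freq_hz, duration_s, step_s=0.001):
    """Toy data: one event per millisecond, polarity follows a sine's sign."""
    n = int(round(duration_s / step_s))
    return [
        (i * step_s,
         1 if math.sin(2 * math.pi * freq_hz * i * step_s) >= 0 else -1)
        for i in range(n)
    ]


if __name__ == "__main__":
    in_band = synthetic_events(freq_hz=4.0, duration_s=2.0)
    out_of_band = synthetic_events(freq_hz=10.0, duration_s=2.0)
    print(is_oscillating(in_band, 0.0, 2.0))
    print(is_oscillating(out_of_band, 0.0, 2.0))
```

The energy-band rule has only a handful of tunable values (bin width, band edges, threshold), which illustrates why such a classifier can be far smaller than a neural network. In practice, a library FFT would replace the naive DFT loop used here to keep the sketch dependency-free.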
