Fourier-based Action Recognition for Wildlife Behavior Quantification with Event Cameras

Friedhelm Hamann, Suman Ghosh, Ignacio Juarez Martinez, Tom Hart, Alex Kacelnik, Guillermo Gallego

TL;DR

This work tackles robust action recognition for oscillatory wildlife behavior using event cameras. It proposes a lightweight Fourier-based pipeline that summarizes event data into a signed rate and analyzes its Fourier spectrum with an energy-band classifier and two full-spectrum neural classifiers. The approach achieves competitive results (F1 around 0.54–0.56) with orders of magnitude fewer parameters than a 2D CNN, and can reach higher accuracy on select ROIs, highlighting strong interpretability and suitability for online, low-power deployment. The study also demonstrates robustness under challenging environmental conditions and discusses extensions to other oscillatory phenomena and vibration monitoring.

Abstract

Event cameras are novel bio-inspired vision sensors that measure pixel-wise brightness changes asynchronously instead of images at a given frame rate. They offer promising advantages, namely a high dynamic range, low latency, and minimal motion blur. Modern computer vision algorithms often rely on artificial neural network approaches, which require image-like representations of the data and cannot fully exploit the characteristics of event data. We propose approaches to action recognition based on the Fourier Transform. The approaches are intended to recognize oscillating motion patterns commonly present in nature. In particular, we apply our approaches to a recent dataset of breeding penguins annotated for "ecstatic display", a behavior where the observed penguins flap their wings at a certain frequency. We find that our approaches are both simple and effective, producing slightly lower results than a deep neural network (DNN) while relying just on a tiny fraction of the parameters compared to the DNN (five orders of magnitude fewer parameters). They work well despite the uncontrolled, diverse data present in the dataset. We hope this work opens a new perspective on event-based processing and action recognition.

Paper Structure

This paper contains 25 sections, 4 equations, 10 figures, 7 tables.

Figures (10)

  • Figure 1: Using Fourier Analysis for action recognition with event cameras. The figure shows a picture of a penguin flapping its wings (Left) and the corresponding behavior acquired by an event camera (Right). The rate at which the event data is produced by the penguin shows a clear oscillatory character. We leverage this observation to build simple and effective classifiers in the Fourier domain of the event data.
  • Figure 2: System overview. The event data in a time window is summarized into a lower dimensional signal (signed event rate) $r[k]$, which is transformed into the Fourier domain, $R[f]$. Classification criteria can be established based on the particular properties of $R[f]$. For example, if the energy in a specific frequency band is higher than a threshold, we assume an ecstatic display is present in the data (Option 1). Likewise, it is possible to train a neural network on the classification task (Option 2).
  • Figure 3: The architectures of the classification networks. Here, $x$ refers to the size of the input vector.
  • Figure 4: Visualization of the annotated regions of interest (ROIs, i.e., penguin nests). Left: annotated bounding boxes around the individual penguin nests. Right: events colored in red and blue (according to polarity) over a white canvas. Data from [Hamann24cvpr].
  • Figure 5: Examples of ecstatic displays (EDs) recorded as events and as grayscale images under different illumination conditions [Hamann24cvpr]. The high dynamic range of events is a clear advantage in this application.
  • ...and 5 more figures
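
The pipeline described in the Figure 2 caption (events, then signed event rate $r[k]$, then spectrum $R[f]$, then a band-energy threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the bin width, the band limits, the threshold, and the choice to normalize the band energy by the total energy are all assumptions made for illustration. A naive DFT from the standard library is used to keep the example self-contained; in practice an FFT routine such as `numpy.fft.rfft` would take its place.

```python
import cmath
import math


def signed_event_rate(timestamps, polarities, t_start, bin_width, n_bins):
    """Summarize events into r[k]: (positive minus negative events) per second in bin k."""
    counts = [0.0] * n_bins
    for t, p in zip(timestamps, polarities):
        k = int((t - t_start) // bin_width)
        if 0 <= k < n_bins:  # ignore events outside the time window
            counts[k] += 1.0 if p > 0 else -1.0
    return [c / bin_width for c in counts]


def power_spectrum(r, bin_width):
    """Return (freqs, |R[f]|^2) for non-negative frequencies using a naive DFT."""
    n = len(r)
    mean = sum(r) / n
    x = [v - mean for v in r]  # remove the DC offset before transforming
    freqs, power = [], []
    for m in range(n // 2 + 1):
        coeff = sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
        freqs.append(m / (n * bin_width))
        power.append(abs(coeff) ** 2)
    return freqs, power


def band_energy_fraction(r, bin_width, f_lo, f_hi):
    """Fraction of spectral energy inside [f_lo, f_hi] (normalization is an assumption)."""
    freqs, power = power_spectrum(r, bin_width)
    total = sum(power)
    if total == 0.0:
        return 0.0
    in_band = sum(p for f, p in zip(freqs, power) if f_lo <= f <= f_hi)
    return in_band / total


def is_oscillatory(r, bin_width, f_lo, f_hi, threshold):
    """Hypothetical Option-1-style rule: flag the window if the band energy is high enough."""
    return band_energy_fraction(r, bin_width, f_lo, f_hi) > threshold
```

A call such as `is_oscillatory(r, 0.02, 3.0, 5.0, 0.5)` would flag a window whose signed rate is dominated by a 3 to 5 Hz oscillation; these numbers are placeholders, not values taken from the paper. A rule of this kind has only a handful of tunable quantities (band limits and a threshold), which is consistent with the small parameter count emphasized in the abstract.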