Interaction-via-Actions: Cattle Interaction Detection with Joint Learning of Action-Interaction Latent Space

Ren Nakagawa; Yang Yang; Risa Shinoda; Hiroaki Santo; Kenji Oyama; Fumio Okura; Takenao Ohkawa

TL;DR

This paper addresses automatic cattle interaction detection from single images despite scarce interaction data. It introduces CattleAct, which decomposes each interaction into two individual actions and learns a joint action-interaction latent space using pre-training, contrastive fine-tuning, and alignment losses, supplemented by skeleton-aware augmentation. It also proposes a practical multimodal system integrating video and GPS for production pastures and demonstrates improved interaction recognition over baselines on a real-world dataset. The results show that aligning action and interaction representations enhances robustness to occlusion and enables scalable, cost-effective livestock monitoring with potential estrus detection benefits.

Abstract

This paper introduces a method and application for automatically detecting behavioral interactions between grazing cattle from a single image, a capability essential for smart livestock management in the cattle industry, for example in detecting estrus. Although interaction detection for humans has been actively studied, cattle interaction detection poses a non-trivial challenge: no comprehensive behavioral dataset that includes interactions exists, because interactions between grazing cattle are rare events. We therefore propose CattleAct, a data-efficient method that detects interactions by decomposing them into combinations of actions by individual cattle. Specifically, we first learn an action latent space from a large-scale cattle action dataset. We then embed rare interactions by fine-tuning the pre-trained latent space with contrastive learning, thereby constructing a unified latent space of actions and interactions. On top of the proposed method, we develop a practical working system integrating video and GPS inputs. Experiments on a commercial-scale pasture demonstrate that our method detects interactions more accurately than the baselines. Our implementation is available at https://github.com/rakawanegan/CattleAct.
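
As a rough illustration of the contrastive fine-tuning idea described in the abstract, the sketch below builds an interaction embedding from two individual action embeddings and scores a batch with a supervised contrastive loss. It is not taken from the CattleAct repository: the function names `embed_pair` and `supervised_contrastive_loss`, the concatenate-then-project composition, the temperature value, and the specific loss form are all assumptions chosen for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)


def embed_pair(action_a, action_b, projection):
    """Hypothetical composition: concatenate two per-animal action embeddings
    and project them into a shared space (an assumption, not the paper's design)."""
    pair = np.concatenate([action_a, action_b], axis=-1)
    return l2_normalize(pair @ projection)


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss: samples sharing a label are pulled together,
    all other samples in the batch are pushed apart."""
    z = l2_normalize(np.asarray(embeddings, dtype=np.float64))
    labels = np.asarray(labels)
    n = z.shape[0]
    not_self = ~np.eye(n, dtype=bool)
    positives = (labels[:, None] == labels[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0
    if not has_pos.any():
        return 0.0  # no anchor has a positive partner in this batch

    sim = z @ z.T / temperature
    # Log-softmax over every sample except the anchor itself (numerically stable).
    sim_masked = np.where(not_self, sim, -np.inf)
    row_max = sim_masked.max(axis=1, keepdims=True)
    log_denom = row_max + np.log(np.exp(sim_masked - row_max).sum(axis=1, keepdims=True))
    log_prob = sim - log_denom

    # Average log-probability of the positives, per anchor that has any.
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[has_pos] / pos_count[has_pos]).mean())


# Toy batch: two tight clusters in 2-D.
clustered = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]])
loss_matching = supervised_contrastive_loss(clustered, [0, 0, 1, 1])
loss_mismatched = supervised_contrastive_loss(clustered, [0, 1, 0, 1])
print(loss_matching < loss_mismatched)
```

The loss is small when labels agree with the cluster structure and large when they do not, which is the property a fine-tuning stage of this kind would rely on to place rare interaction samples near related embeddings.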
