SILVI: Simple Interface for Labeling Video Interactions

Ozan Kanbertay; Richard Vogg; Elif Karakoc; Peter M. Kappeler; Claudia Fichtel; Alexander S. Ecker

TL;DR

SILVI tackles the challenge of annotating spatio-temporal interactions in wildlife video data, where existing tools label either behaviors or localization but not both. It introduces an open-source, Electron-based interface that supports simultaneous action and interaction labeling, track correction, and CSV export for training computer vision models. The tool was demonstrated on red-fronted lemurs across multi-view videos, showing that it scales to long video datasets and handles comprehensive metadata. By linking ethogram-based labeling with precise spatial localization, SILVI enables more robust social-behavior analyses and potential applications beyond animal studies. The open-source design invites extension to automated tracking and ID integration.

Abstract

Computer vision methods are increasingly used for the automated analysis of large volumes of video data collected through camera traps, drones, or direct observations of animals in the wild. While recent advances have focused primarily on detecting individual actions, much less work has addressed the detection and annotation of interactions, a crucial aspect for understanding social and individualized animal behavior. Existing open-source annotation tools support either behavioral labeling without localization of individuals, or localization without the capacity to capture interactions. To bridge this gap, we present SILVI, an open-source labeling software that integrates both functionalities. SILVI enables researchers to annotate behaviors and interactions directly within video data, generating structured outputs suitable for training and validating computer vision models. By linking behavioral ecology with computer vision, SILVI facilitates the development of automated approaches for fine-grained behavioral analyses. Although developed primarily in the context of animal behavior, SILVI could be useful more broadly, for example for annotating human interactions in videos from which dynamic scene graphs need to be extracted. The software, along with documentation and download instructions, is available at: https://gitlab.gwdg.de/kanbertay/interaction-labelling-app.
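
To illustrate what a structured interaction annotation might look like, the following Python sketch parses a small CSV and groups per-frame rows into dynamic scene graph edges of the form (subject, action, object). The column names (`frame`, `subject_id`, `action`, `object_id`, `x`, `y`, `w`, `h`), the helper functions, and the sample rows are assumptions made for this example only; they are not SILVI's actual export schema, which is described in the software documentation.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export: one row per frame and interaction.
# Column names and values are illustrative assumptions, not SILVI's real schema.
SAMPLE_CSV = """frame,subject_id,action,object_id,x,y,w,h
10,1,grooming,2,100,120,50,60
11,1,grooming,2,102,121,50,60
12,1,grooming,2,104,122,50,60
20,2,approaching,3,300,200,40,45
21,2,approaching,3,298,201,40,45
"""


def load_rows(text):
    """Parse CSV text into dicts with integer frames, IDs, and a box tuple."""
    rows = []
    for r in csv.DictReader(io.StringIO(text)):
        rows.append({
            "frame": int(r["frame"]),
            "subject": int(r["subject_id"]),
            "action": r["action"],
            "object": int(r["object_id"]),
            # Bounding box of the subject as (x, y, width, height).
            "box": tuple(int(r[k]) for k in ("x", "y", "w", "h")),
        })
    return rows


def to_edges(rows):
    """Map each (subject, action, object) triple to its (first, last) frame."""
    frames = defaultdict(list)
    for r in rows:
        frames[(r["subject"], r["action"], r["object"])].append(r["frame"])
    return {key: (min(f), max(f)) for key, f in frames.items()}


edges = to_edges(load_rows(SAMPLE_CSV))
for (subj, action, obj), (start, end) in sorted(edges.items()):
    print(f"{subj} -{action}-> {obj}: frames {start}-{end}")
```

This sketch collapses every occurrence of a triple into a single frame span, which is only appropriate when each interaction occurs as one contiguous bout; repeated bouts of the same interaction would need to be split at gaps in the frame sequence.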
