OmniFall: A Unified Staged-to-Wild Benchmark for Human Fall Detection

David Schneider, Zdravko Marinov, Rafael Baur, Zeyun Zhong, Rodi Düger, Rainer Stiefelhagen

TL;DR

OmniFall tackles the lack of generalizable fall-detection benchmarks by unifying eight staged datasets under a common ten-class taxonomy and introducing OOPS-Fall for staged-to-wild evaluation. The approach evaluates frozen backbones (I3D, VideoMAE variants) to expose domain shifts across datasets and quantify generalization to real-world falls. Key contributions include the OmniFall dataset, the OOPS-Fall benchmark, standardized cross-subject and cross-view splits, and a thorough analysis of how backbones differ in out-of-distribution performance. The results reveal a substantial gap between controlled and uncontrolled environments, underscoring the need for diverse benchmarks and domain-robust representations to enable reliable real-world fall-detection systems.

Abstract

Current video-based fall detection research mostly relies on small, staged datasets with significant domain biases concerning background, lighting, and camera setup, resulting in unknown real-world performance. We introduce OmniFall, unifying eight public fall detection datasets (roughly 14 h of recordings, roughly 42 h of multiview data, 101 subjects, 29 camera views) under a consistent ten-class taxonomy with standardized evaluation protocols. Our benchmark provides complete video segmentation labels and enables fair cross-dataset comparison that was previously impossible with incompatible annotation schemes. For real-world evaluation, we curate OOPS-Fall from genuine accident videos and establish a staged-to-wild protocol measuring generalization from controlled to uncontrolled environments. Experiments with frozen pre-trained backbones such as I3D or VideoMAE reveal significant performance gaps between in-distribution and in-the-wild scenarios, highlighting critical challenges in developing robust fall detection systems. OmniFall dataset: https://huggingface.co/datasets/simplexsigil2/omnifall . Code: https://github.com/simplexsigil/omnifall-experiments
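
The staged-to-wild idea can be illustrated with a small sketch: fit a classifier on features from staged data, then compare its score on held-out staged data against its score on shifted "wild" data. Everything below is an assumption made for illustration and not the authors' pipeline: the features are synthetic Gaussians rather than I3D or VideoMAE outputs, a nearest-centroid classifier stands in for whatever head is trained on the frozen features, and balanced accuracy is used as one plausible metric.

```python
import numpy as np

NUM_CLASSES = 10  # size of the shared taxonomy; class indices are placeholders


def fit_centroids(features, labels, num_classes=NUM_CLASSES):
    """Mean feature vector per class (rows of absent classes stay NaN)."""
    centroids = np.full((num_classes, features.shape[1]), np.nan)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            centroids[c] = features[mask].mean(axis=0)
    return centroids


def predict(centroids, features):
    """Assign each clip feature to the nearest class centroid (Euclidean)."""
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return np.nanargmin(dists, axis=1)


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall over the classes present in y_true."""
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))


# Synthetic stand-in for frozen-backbone features (hypothetical data).
rng = np.random.default_rng(0)
class_means = 3.0 * rng.normal(size=(NUM_CLASSES, 16))


def sample(n_per_class, noise, shift=0.0):
    """Draw features around the class means; `shift` mimics a domain shift."""
    labels = np.repeat(np.arange(NUM_CLASSES), n_per_class)
    feats = class_means[labels] + shift + noise * rng.normal(size=(labels.size, 16))
    return feats, labels


train_x, train_y = sample(50, noise=1.0)            # staged training split
staged_x, staged_y = sample(20, noise=1.0)          # in-distribution test split
wild_x, wild_y = sample(20, noise=3.0, shift=1.5)   # shifted "wild" test split

centroids = fit_centroids(train_x, train_y)
staged_acc = balanced_accuracy(staged_y, predict(centroids, staged_x))
wild_acc = balanced_accuracy(wild_y, predict(centroids, wild_x))
print(f"staged: {staged_acc:.3f}  wild: {wild_acc:.3f}")
```

The point of the sketch is the evaluation structure: the classifier never sees the wild split during fitting, so the difference between the two scores reflects the domain shift alone.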

Paper Structure

This paper contains 24 sections, 6 figures, 4 tables.

Figures (6)

  • Figure 1: Examples from selected fall detection datasets.
  • Figure 2: Our annotations and MS-TCN++ [farha2019ms] action segmentation results.
  • Figure 3: Segmented share of each label within datasets and total single view duration.
  • Figure 4: Pairwise Fréchet Video Distances calculated on features from I3D, VideoMAE, and VideoMAE pretrained on Kinetics400, visualised as heat‑maps. Darker colors indicate higher similarity (smaller distances) between datasets. Numerical values are listed in the supplementary.
  • Figure 5: Low‑dimensional clip embeddings, colour‑coded by dataset and activity. Rows correspond to feature type, columns show T-SNE and H-NNE clustering. Full-size plots in supplementary.
  • ...and 1 more figure
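
For readers unfamiliar with the distance named in the Figure 4 caption, the following is a minimal sketch of a Fréchet distance between two sets of feature vectors, using the standard closed form for two Gaussian fits: $\lVert \mu_a - \mu_b \rVert^2 + \mathrm{Tr}\big(\Sigma_a + \Sigma_b - 2(\Sigma_a \Sigma_b)^{1/2}\big)$. It assumes features of shape (clips, dimensions) with at least two dimensions have already been extracted; the function names and the random inputs are hypothetical, and the exact procedure used for the figure may differ.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Fréchet distance between Gaussian fits of two (n_clips, dim) feature sets."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^{1/2}) equals the sum of square roots of the eigenvalues
    # of cov_a @ cov_b, which are real and non-negative up to numerical noise.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    sqrt_trace = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_a - mu_b) ** 2)
    return float(mean_term + np.trace(cov_a) + np.trace(cov_b) - 2.0 * sqrt_trace)


def pairwise_frechet(feature_sets):
    """Symmetric matrix of distances for a dict {dataset_name: features}."""
    names = list(feature_sets)
    matrix = np.zeros((len(names), len(names)))
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            if i < j:
                d = frechet_distance(feature_sets[a], feature_sets[b])
                matrix[i, j] = matrix[j, i] = d
    return names, matrix


# Hypothetical features: dataset "B" is dataset "A" shifted by 2 in every dimension.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 8))
names, matrix = pairwise_frechet({"A": base, "B": base + 2.0})
print(names)
print(matrix.round(3))
```

Because the second set is a pure translation of the first, the covariance terms cancel and the distance reduces to the squared mean offset, which makes the example easy to check by hand.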