FineBio: A Fine-Grained Video Dataset of Biological Experiments with Hierarchical Annotation
Takuma Yagi, Misaki Ohashi, Yifei Huang, Ryosuke Furuta, Shungo Adachi, Toutai Mitsuyama, Yoichi Sato
TL;DR
FineBio addresses the need for accurate, reproducible documentation of biological experiments by providing a fine-grained, multi-view video dataset with hierarchical annotations across steps, atomic operations, object locations, and manipulation states. The authors collect 226 trials over 14.5 hours from 32 participants across seven protocols, yielding 3.5K steps, 50K atomic operations, and 72K bounding boxes, with frames sampled to capture challenging hand-object interactions. Baseline experiments on step segmentation, atomic operation detection, object detection, and manipulated/affected object detection reveal strong performance at higher levels but notable difficulties in boundary precision and fine-grained state reasoning, underscoring the need for multi-granularity modeling. The dataset and code, available at https://github.com/aistairc/FineBio, aim to catalyze progress in structured activity understanding and laboratory automation. The authors acknowledge the limitations of using mock experiments and propose real-material datasets as a future direction.
Abstract
In the development of science, accurate and reproducible documentation of the experimental process is crucial. Automatic recognition of the actions in experiments from videos would help experimenters by complementing the recording of experiments. Towards this goal, we propose FineBio, a new fine-grained video dataset of people performing biological experiments. The dataset consists of multi-view videos of 32 participants performing mock biological experiments with a total duration of 14.5 hours. Each experiment forms a hierarchical structure, where a protocol consists of several steps, each further decomposed into a set of atomic operations. What makes biological experiments unique is that, while they require strict adherence to the steps described in each protocol, there is freedom in the order of atomic operations. We provide hierarchical annotation on protocols, steps, atomic operations, object locations, and their manipulation states, which poses new challenges for structured activity understanding and hand-object interaction recognition. To identify the challenges of activity understanding in biological experiments, we introduce baseline models and results on four different tasks: (i) step segmentation, (ii) atomic operation detection, (iii) object detection, and (iv) manipulated/affected object detection. The dataset and code are available at https://github.com/aistairc/FineBio.
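
The hierarchy described above (protocol, steps, atomic operations) can be pictured as a nested record. The following is a minimal Python sketch for illustration only: the class names, field names, and example labels are assumptions and do not reflect the actual annotation format in the FineBio repository. It also illustrates the point that two trials may share the same step sequence while ordering atomic operations differently within a step.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AtomicOperation:
    # Hypothetical fields: a verb plus the manipulated and affected objects.
    verb: str
    manipulated: str
    affected: str


@dataclass
class Step:
    # Hypothetical step label with its atomic operations in observed order.
    name: str
    operations: List[AtomicOperation] = field(default_factory=list)


@dataclass
class Trial:
    # One recorded run of a protocol by one participant.
    protocol: str
    steps: List[Step] = field(default_factory=list)


def step_sequence(trial: Trial) -> List[str]:
    """Return the ordered step names, which the protocol fixes."""
    return [step.name for step in trial.steps]


def same_operations_any_order(a: Step, b: Step) -> bool:
    """True if two steps contain the same operations, ignoring order."""
    def key(op: AtomicOperation):
        return (op.verb, op.manipulated, op.affected)
    return sorted(map(key, a.operations)) == sorted(map(key, b.operations))


# Two made-up trials: identical step order, different operation order.
open_tube = AtomicOperation("open", "hand", "tube")
take_pipette = AtomicOperation("take", "hand", "pipette")

trial_a = Trial("mock_protocol", [Step("add_reagent", [open_tube, take_pipette])])
trial_b = Trial("mock_protocol", [Step("add_reagent", [take_pipette, open_tube])])

steps_match = step_sequence(trial_a) == step_sequence(trial_b)
ops_match = same_operations_any_order(trial_a.steps[0], trial_b.steps[0])
ops_same_order = trial_a.steps[0].operations == trial_b.steps[0].operations
```

In this sketch the step sequence is compared as an ordered list while atomic operations are compared as an unordered collection, mirroring the strict-steps, free-operations property stated in the abstract.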
