A Self-Supervised Method for Body Part Segmentation and Keypoint Detection of Rat Images
László Kopácsi, Áron Fóthi, András Lőrincz
TL;DR
The paper tackles body-part segmentation and keypoint detection for rats under heavy occlusion without relying on manual annotations. It introduces a self-supervised pipeline that first generates automatic annotations from a stationary-camera video using foreground-background segmentation, medial-axis-based features, and watershed-based segmentation, and then applies extensive augmentation to simulate occlusions. Two Mask R-CNN-based models are trained on the generated labels to perform instance segmentation, keypoint detection, and body-part segmentation. They improve substantially over a computer-vision-based (CV) baseline (e.g., APs rise from 53.22/48.91/9.38 to 61.92/77.53/28.87) and remain robust to occlusions. The work has practical value for automated animal-behavior analysis and outlines directions for extending the approach to video-based tracking and to more advanced architectures such as DETR.
Abstract
Recognition of individual components and keypoint detection, supported by instance segmentation, are crucial for analyzing the behavior of agents in a scene. Such systems could be used for surveillance, self-driving cars, and medical research, where behavior analysis of laboratory animals is used to confirm the aftereffects of a given medicine. Methods that solve these tasks usually require a large amount of high-quality hand-annotated data, which takes time and money to produce. In this paper, we propose a method that alleviates the need for manual labeling of laboratory rats. First, we generate initial annotations with a computer-vision-based approach; then, using extensive augmentation, we train a deep neural network on the generated data. The final system is capable of instance segmentation, keypoint detection, and body part segmentation even when the objects are heavily occluded.
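
To make the role of augmentation concrete, the following is a minimal sketch of one way occlusions could be simulated from automatically generated masks: a masked crop of one animal is pasted over a frame, and the existing instance masks are reduced to their still-visible pixels. It is an illustration under stated assumptions, not the authors' implementation; the function name `paste_occluder` and its parameters are hypothetical, and the actual augmentation used in the paper may differ.

```python
import numpy as np


def paste_occluder(image, masks, patch, patch_mask, top, left):
    """Paste a masked crop onto a frame and update the instance masks.

    Hypothetical helper (not from the paper).
    image:      (H, W, 3) uint8 frame
    masks:      list of (H, W) bool instance masks already in the frame
    patch:      (h, w, 3) uint8 crop of another animal
    patch_mask: (h, w) bool mask of the animal inside the crop
    Returns the composited frame and the visible masks, with the pasted
    instance appended as the last mask.
    """
    h, w = patch_mask.shape
    H, W = image.shape[:2]
    if top < 0 or left < 0 or top + h > H or left + w > W:
        raise ValueError("patch must lie fully inside the image")

    out = image.copy()
    # Basic slicing returns a view, so writing to `region` modifies `out`.
    region = out[top:top + h, left:left + w]
    region[patch_mask] = patch[patch_mask]

    # Full-frame mask of the pasted (occluding) instance.
    pasted = np.zeros((H, W), dtype=bool)
    pasted[top:top + h, left:left + w] = patch_mask

    # Existing instances keep only the pixels that are not covered.
    visible = [m & ~pasted for m in masks]
    return out, visible + [pasted]
```

Training on such composites exposes the network to partially hidden instances whose labels are still known exactly, because both the occluder and the occluded masks come from the automatic annotation step.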
