Pose Splatter: A 3D Gaussian Splatting Model for Quantifying Animal Pose and Appearance
Jack Goffinet, Youngjo Min, Carlo Tomasi, David E. Carlson
TL;DR
Pose Splatter is a scalable, annotation-free framework for reconstructing and quantifying full 3D animal pose and appearance using shape carving and 3D Gaussian splatting. It replaces per-frame optimization and manual labeling with a feed-forward pipeline: shape carving produces a voxel prior, a stacked U-Net refines it, and the result is rendered as Gaussian splats. The pipeline recovers accurate geometry for mice, rats, and zebra finches from sparse camera views. A rotation-invariant visual embedding derived from spherical harmonics provides a compact descriptor for downstream behavioral analyses, and experiments show better cross-view generalization and better capture of subtle movements than keypoint baselines. By reducing annotation and computation bottlenecks, the approach enables high-resolution, longitudinal behavioral studies, with practical implications for mapping genotype and neural activity to micro-behavior.
Abstract
Accurate and scalable quantification of animal pose and appearance is crucial for studying behavior. Current 3D pose estimation techniques, such as keypoint- and mesh-based methods, often face challenges including limited representational detail, labor-intensive annotation requirements, and expensive per-frame optimization. These limitations hinder the study of subtle movements and can make large-scale analyses impractical. We propose Pose Splatter, a novel framework leveraging shape carving and 3D Gaussian splatting to model the complete pose and appearance of laboratory animals without prior knowledge of animal geometry, per-frame optimization, or manual annotations. We also propose a novel rotation-invariant visual embedding technique for encoding pose and appearance, designed to be a plug-in replacement for 3D keypoint data in downstream behavioral analyses. Experiments on datasets of mice, rats, and zebra finches show that Pose Splatter learns accurate 3D animal geometries. Notably, Pose Splatter represents subtle variations in pose, yields low-dimensional pose embeddings that human evaluators judge better than those of state-of-the-art methods, and generalizes to unseen data. By eliminating annotation and per-frame optimization bottlenecks, Pose Splatter enables the analysis of large-scale, longitudinal behavior needed to map genotype, neural activity, and micro-behavior at unprecedented resolution.
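To make the shape-carving step concrete, the following is a minimal sketch of generic silhouette-based carving (a visual hull), not the authors' implementation. It assumes calibrated cameras given as 3x4 projection matrices and one binary foreground mask per view; the function name `carve_voxels` and the toy cameras are hypothetical and chosen only for illustration.

```python
import numpy as np

def carve_voxels(points, projections, masks):
    """Return a boolean array marking points that fall inside every silhouette.

    points:      (N, 3) voxel centers in world coordinates.
    projections: list of (3, 4) camera projection matrices (assumed known).
    masks:       list of (H, W) boolean silhouettes, one per camera.
    """
    homog = np.hstack([points, np.ones((len(points), 1))])  # (N, 4)
    keep = np.ones(len(points), dtype=bool)
    for P, mask in zip(projections, masks):
        uvw = homog @ P.T                      # (N, 3) homogeneous pixel coords
        depth = uvw[:, 2]
        in_front = depth > 1e-9                # discard points behind the camera
        safe_depth = np.where(in_front, depth, 1.0)
        u = np.round(uvw[:, 0] / safe_depth).astype(int)  # pixel column
        v = np.round(uvw[:, 1] / safe_depth).astype(int)  # pixel row
        h, w = mask.shape
        inside = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        hit = np.zeros(len(points), dtype=bool)
        hit[inside] = mask[v[inside], u[inside]]
        keep &= hit                            # a voxel must be seen in all views
    return keep

# Toy setup (illustrative assumption): a 4x4x4 grid and two axis-aligned,
# orthographic-style cameras whose third row yields a constant depth of 1.
r = np.arange(4)
grid = np.stack(np.meshgrid(r, r, r, indexing="ij"), axis=-1).reshape(-1, 3).astype(float)
cam_z = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=float)  # u = x, v = y
cam_x = np.array([[0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=float)  # u = z, v = y
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True                          # 2x2 silhouette in both views
occupied = carve_voxels(grid, [cam_z, cam_x], [mask, mask])
print(int(occupied.sum()))                     # 8 voxels survive the carving
```

With only a few views the carved volume overestimates the true shape, which is consistent with the abstract's description of carving as a starting point that the rest of the pipeline then refines.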
