ZipGait: Bridging Skeleton and Silhouette with Diffusion Model for Advancing Gait Recognition
Fanxu Min, Qing Cai, Shaoxiang Guo, Yang Yu, Hao Fan, Junyu Dong
TL;DR
ZipGait bridges skeleton and silhouette information for gait recognition by reconstructing dense body shapes from sparse skeletons using a diffusion-based DiffGait module. A two-stage Perception Gait Integration (PGI) then fuses the reconstructed silhouettes with skeleton cues to produce robust hybrid gait representations, enabling a lightweight yet effective model-based framework. Across four public benchmarks, ZipGait achieves state-of-the-art performance in both intra- and cross-domain settings and yields notable plug-and-play improvements when embedded into existing skeleton-based methods. The approach demonstrates the potential of cross-modal gait modeling to close the gap with appearance-based methods while maintaining efficiency and flexibility for real-world deployment.
Abstract
Current gait recognition research predominantly focuses on extracting appearance features effectively, but its performance is severely compromised by the vulnerability of silhouettes in unconstrained scenes. Consequently, numerous studies have explored how to harness information from various models, particularly by fully exploiting the intrinsic information of skeleton sequences. While these model-based methods have achieved strong performance, a large gap remains relative to appearance-based methods, which implies the potential value of bridging silhouettes and skeletons. In this work, we make the first attempt to reconstruct dense body shapes from discrete skeleton distributions via a diffusion model, demonstrating a new approach that improves model-based methods by connecting cross-modal features rather than focusing solely on intrinsic features. To realize this idea, we propose a novel gait diffusion model named DiffGait, designed with four specific adaptations suited to gait recognition. Furthermore, to effectively utilize the reconstructed silhouettes and skeletons, we introduce Perception Gait Integration (PGI), which integrates different gait features through a two-stage process. Incorporating these modifications leads to an efficient model-based gait recognition framework called ZipGait. In extensive experiments on four public benchmarks, ZipGait outperforms state-of-the-art methods by a large margin under both cross-domain and intra-domain settings, while also delivering significant plug-and-play performance improvements.
