GaitGS: Temporal Feature Learning in Granularity and Span Dimension for Gait Recognition
Haijun Xiong, Yunze Deng, Bin Feng, Xinggang Wang, Wenyu Liu
TL;DR
GaitGS targets a limitation of existing gait recognition methods: they do not jointly model temporal information across multiple granularity levels and temporal spans. It introduces the Multi-Granularity Feature Extractor (MGFE) to capture micro-motion at a fine level and macro-motion at a coarse level, and the Multi-Span Feature Extractor (MSFE) to extract local and global temporal cues. These are complemented by Prior Information Embedding Generation (PIEG) and a transformer-based Global-information Capture Module (GCM) with grouped-convolution positional encoding. The approach achieves state-of-the-art results on CASIA-B and OU-MVLP, demonstrating robustness to variations in speed and appearance and offering strong cross-view performance. The work integrates temporal cues along both the granularity and span dimensions, and the authors state that source code will be released to support reproducibility.
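To make the two dimensions concrete, the following minimal sketch illustrates what "granularity" and "span" can mean for a sequence of per-frame features. It is not the authors' MGFE or MSFE, which are learned modules; fixed average and max pooling stand in for them here, and the names `pool_units`, `local_max`, and `global_max` are hypothetical.

```python
from typing import List

Frame = List[float]  # one feature vector per silhouette frame (assumed layout)


def pool_units(frames: List[Frame], unit_size: int) -> List[Frame]:
    """Average non-overlapping groups of `unit_size` frames.

    unit_size == 1 keeps frame-level (fine) features; larger values
    give unit-level (coarse) features. Hypothetical stand-in, not MGFE.
    """
    if unit_size < 1:
        raise ValueError("unit_size must be >= 1")
    units = []
    for start in range(0, len(frames), unit_size):
        chunk = frames[start:start + unit_size]
        dim = len(chunk[0])
        units.append([sum(f[d] for f in chunk) / len(chunk) for d in range(dim)])
    return units


def local_max(frames: List[Frame], window: int) -> List[Frame]:
    """Sliding-window max over time (stride 1, no padding): a local span."""
    dim = len(frames[0])
    out = []
    for start in range(0, len(frames) - window + 1):
        chunk = frames[start:start + window]
        out.append([max(f[d] for f in chunk) for d in range(dim)])
    return out


def global_max(frames: List[Frame]) -> Frame:
    """Max over the whole sequence: a global span."""
    dim = len(frames[0])
    return [max(f[d] for f in frames) for d in range(dim)]


# Toy sequence: 4 frames, 2 feature channels each.
frames = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]]
coarse = pool_units(frames, unit_size=2)   # two unit-level features
local = local_max(frames, window=3)        # two local-span features
global_feat = global_max(frames)           # one global-span feature
```

In this toy setting, granularity controls how many frames are merged before further processing, while span controls how much of the sequence each output summarizes.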
Abstract
Gait recognition, a growing field in biometric recognition technology, uses distinct walking patterns to identify individuals accurately. However, existing methods make limited use of temporal information. To reach the full potential of gait recognition, we advocate considering temporal features at varying granularities and spans. This paper introduces a novel framework, GaitGS, which aggregates temporal features simultaneously in both the granularity and span dimensions. Specifically, the Multi-Granularity Feature Extractor (MGFE) is designed to capture micro-motion and macro-motion information at fine and coarse levels respectively, while the Multi-Span Feature Extractor (MSFE) generates local and global temporal representations. In extensive experiments on two datasets, CASIA-B and OU-MVLP, our method demonstrates state-of-the-art performance, achieving Rank-1 accuracy of 98.2%, 96.5%, and 89.7% on CASIA-B under different conditions, and 97.6% on OU-MVLP. The source code will be available at https://github.com/Haijun-Xiong/GaitGS.
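For readers unfamiliar with the metric, Rank-1 accuracy is the fraction of probe sequences whose nearest gallery sequence has the same identity. The sketch below assumes Euclidean distance on fixed-length feature vectors and uses a hypothetical helper name, `rank1_accuracy`; benchmark protocol details, such as how probe and gallery views are paired, are omitted.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, identity label)


def rank1_accuracy(probes: List[Sample], gallery: List[Sample]) -> float:
    """Fraction of probes whose nearest gallery sample shares their identity.

    Assumes Euclidean distance; the distance metric is an illustrative choice.
    """
    if not probes or not gallery:
        raise ValueError("probes and gallery must be non-empty")
    hits = 0
    for p_feat, p_id in probes:
        # Nearest gallery sample by Euclidean distance.
        _, nearest_id = min(gallery, key=lambda g: math.dist(p_feat, g[0]))
        hits += int(nearest_id == p_id)
    return hits / len(probes)


# Toy data: two gallery identities, three probes (the last is misidentified).
gallery = [([0.0, 0.0], "A"), ([10.0, 10.0], "B")]
probes = [([1.0, 0.0], "A"), ([9.0, 9.0], "B"), ([6.0, 6.0], "A")]
accuracy = rank1_accuracy(probes, gallery)
```

Here two of the three probes match their nearest gallery identity, so the toy accuracy is 2/3.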
