GaitPoint+: A Gait Recognition Network Incorporating Point Cloud Analysis and Recycling

Huantao Ren, Jiajing Chen, Senem Velipasalar

TL;DR

GaitPoint+ addresses the robustness gap in gait recognition under appearance changes by fusing silhouette-based features with skeleton information modeled as a 3D point cloud. It uses a lightweight PointNet-based skeleton module and introduces Recycling Max-Pooling (RMP) to reclaim discarded points, coupled with a multi-term loss that combines triplet objectives and a refinement component; the overall objective is $L = L_{c\_tp} + L_{p\_tp} + L_{g\_tp} + L_{rmp}$. Empirical results on CASIA-B show consistent improvements across state-of-the-art silhouette baselines, particularly in challenging BG and CL scenarios, with additional gains when RMP is applied; OUMVLP experiments indicate the approach generalizes with dataset-dependent effects. The work demonstrates that 3D point-cloud processing of skeletal keypoints can be efficiently integrated with CNN-based silhouette methods to yield more discriminative and robust gait representations, paving the way for broader use of lightweight point-cloud modules in biometric recognition.

Abstract

Gait is a behavioral biometric modality that can be used to recognize individuals from a distance by the way they walk. Most existing gait recognition approaches rely on either silhouettes or skeletons, while their joint use is underexplored. Features from silhouettes and skeletons can provide complementary information for more robust recognition against appearance changes or pose estimation errors. To exploit the benefits of both silhouette and skeleton features, we propose a new gait recognition network, referred to as GaitPoint+. Our approach models skeleton key points as a 3D point cloud and employs a computational complexity-conscious 3D point processing approach to extract skeleton features, which are then combined with silhouette features for improved accuracy. Since silhouette- or CNN-based methods already require a considerable amount of computational resources, it is preferable that the key point learning module be fast and lightweight. We present a detailed analysis of the utilization of every human key point after traditional max-pooling, and show that while elbow and ankle points are used most commonly, many useful points are discarded by max-pooling. We therefore present a method that recycles some of the discarded points with a Recycling Max-Pooling module during the processing of skeleton point clouds, and achieves a further performance improvement. We provide a comprehensive set of experimental results showing that (i) incorporating skeleton features obtained by a point-based 3D point cloud processing approach boosts the performance of three different state-of-the-art silhouette- and CNN-based baselines; and (ii) recycling the discarded points increases the accuracy further. Ablation studies are also provided to show the effectiveness and contribution of the different components of our approach.
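The abstract's observation that max-pooling discards most points, and that some of them can be recycled, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the random `features` array stands in for learned per-point features, the function names are hypothetical, and the recycling rule shown (a second max-pool restricted to points that won no channel in the first) is a simplifying assumption about how the Recycling Max-Pooling module behaves.

```python
import numpy as np

def max_pool_with_usage(features):
    """Max-pool over the point axis.

    features: (n_points, n_channels) array of per-point features.
    Returns the pooled (n_channels,) vector and the sorted indices of the
    points that supplied the maximum of at least one channel.
    """
    winners = features.argmax(axis=0)   # winning point index per channel
    pooled = features.max(axis=0)
    return pooled, np.unique(winners)

def recycling_max_pool(features):
    """Hypothetical two-stage pooling: F1 from all points, F2 from discarded ones.

    Assumes n_points > n_channels, so at least one point is always discarded.
    """
    n_points = features.shape[0]
    f1, used = max_pool_with_usage(features)
    discarded = np.setdiff1d(np.arange(n_points), used)
    f2, recycled_local = max_pool_with_usage(features[discarded])
    recycled = discarded[recycled_local]  # map back to original point indices
    return f1, f2, used, recycled

# Placeholder input: 1020 points (60 frames x 17 key points), 256 channels.
rng = np.random.default_rng(0)
features = rng.standard_normal((1020, 256))
f1, f2, used, recycled = recycling_max_pool(features)
print(f"used by first pooling: {used.size} of {features.shape[0]} points")
print(f"recycled by second pooling: {recycled.size} points")
```

Because each channel selects exactly one winning point, the first pooling can use at most as many points as there are channels, which is why most of the input is discarded regardless of training.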

Paper Structure

This paper contains 24 sections, 5 equations, 4 figures, 8 tables.

Figures (4)

  • Figure 1: An individual's silhouette from different angles when walking normally, carrying a bag, and wearing a coat.
  • Figure 2: The network structure of GaitPoint+ with the Recycling Max-Pooling (RMP) module. $L_{p\_tp}$, $L_{c\_tp}$ and $L_{g\_tp}$ are the triplet losses for the permutation-invariant, convolutional and gait features, respectively. $F_1$ and $F_2$ are the permutation-invariant features obtained by max-pooling from the original key points and the discarded key points, respectively. $L_{p\_ce1}$ and $L_{p\_ce2}$ are the cross-entropy losses for the permutation-invariant features $F_1$ and $F_2$, respectively. $L_r$ is the refinement loss. FC denotes a fully connected layer; the two FC layers share parameters.
  • Figure 3: Box plots of the number of human key points used when PointNet is combined with three silhouette-based models. The total number of input key points is 1020 (60 frames with 17 key points in each frame). About 75 points are used by all models before training. This number increases to 325 when the networks are well trained.
  • Figure 4: The 17 human key points on a skeleton. The yellow joints are the five most used points kept after the first max-pooling.
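
The counts reported in the captions of Figures 3 and 4 (1020 input points, and the most used joints) suggest a simple bookkeeping step, sketched below in NumPy. This is an illustrative assumption rather than the authors' analysis code: the features are random placeholders, `joint_usage` is a hypothetical helper, and the frame-major indexing (point index = frame × 17 + joint) is assumed.

```python
import numpy as np

N_FRAMES, N_JOINTS = 60, 17
N_POINTS = N_FRAMES * N_JOINTS  # 60 * 17 = 1020 input key points

def joint_usage(winners, n_joints=N_JOINTS):
    """Count, per joint, how many distinct points survived max-pooling.

    winners: per-channel index of the point that supplied the maximum.
    Assumes frame-major layout, so joint id = point index mod n_joints.
    """
    used = np.unique(winners)
    return np.bincount(used % n_joints, minlength=n_joints)

# Placeholder per-point features; a trained network would supply these.
rng = np.random.default_rng(0)
features = rng.standard_normal((N_POINTS, 256))
winners = features.argmax(axis=0)

counts = joint_usage(winners)
top5 = np.argsort(counts)[::-1][:5]  # indices of the five most used joints
print("points used:", counts.sum(), "of", N_POINTS)
print("top-5 joint indices:", top5)
```

With random features the ranking carries no meaning; the sketch only shows how per-joint usage could be tallied from the pooling winners.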