GateAttentionPose: Enhancing Pose Estimation with Agent Attention and Improved Gated Convolutions

Liang Feng; Zhixuan Shen; Lihua Wen; Shiyao Li; Ming Xu

TL;DR

GateAttentionPose adapts the UniRepLKNet architecture for pose estimation with two changes. An Agent Attention module replaces the large-kernel convolutions, significantly improving computational efficiency while preserving global context modeling. A Gate-Enhanced Feedforward Block (GEFB) strengthens feature extraction and processing, particularly in complex scenes.

Abstract

This paper introduces GateAttentionPose, an approach that enhances the UniRepLKNet architecture for pose estimation. We present two key contributions: the Agent Attention module and the Gate-Enhanced Feedforward Block (GEFB). The Agent Attention module replaces large-kernel convolutions, significantly improving computational efficiency while preserving global context modeling. The GEFB augments feature extraction and processing capabilities, particularly in complex scenes. Extensive evaluations on the COCO and MPII datasets show that GateAttentionPose achieves results superior or comparable to existing state-of-the-art methods, including the original UniRepLKNet, with improved efficiency. Our approach offers a robust solution for pose estimation across diverse applications, including autonomous driving, human motion capture, and virtual reality.
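
To make the efficiency argument concrete, the following is a minimal NumPy sketch of the general agent-attention idea: a small set of agent tokens first aggregates information from all keys and values, and each query then attends only to those agents. It is an illustration under stated assumptions, not the GateAttentionPose module itself. The function names, the single-head layout, and the choice to form agents by average-pooling the queries are all assumptions of this sketch, and the abstract does not specify how the module is wired into UniRepLKNet.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def agent_attention(q, k, v, num_agents):
    """Single-head attention routed through a small set of agent tokens.

    q, k, v: arrays of shape (n_tokens, dim).
    num_agents: number of agent tokens. It must divide n_tokens here
    because this sketch forms agents by average-pooling the queries
    (an assumption, not taken from the paper).
    """
    n, d = q.shape
    if n % num_agents != 0:
        raise ValueError("num_agents must divide the number of tokens")
    scale = 1.0 / np.sqrt(d)

    # Agent tokens: average-pool contiguous groups of queries.
    agents = q.reshape(num_agents, n // num_agents, d).mean(axis=1)

    # Step 1 (aggregation): agents attend to all keys and values.
    agent_values = softmax(agents @ k.T * scale) @ v  # (num_agents, dim)

    # Step 2 (broadcast): every query attends to the agents only.
    return softmax(q @ agents.T * scale) @ agent_values  # (n_tokens, dim)


# Hypothetical usage: 64 tokens of width 32, routed through 8 agents.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((64, 32))
out = agent_attention(tokens, tokens, tokens, num_agents=8)
```

With n tokens and m agents, both attention maps have n × m entries, so the cost grows linearly in n for a fixed m rather than quadratically as in full self-attention, while every output token can still draw on every input token through the agents.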

Paper Structure (17 sections, 5 equations, 2 figures, 3 tables)

Introduction
Related Work
Pose Estimation Approaches
Challenges in Crowded Scene Pose Estimation
Innovations in Attention Mechanisms and Convolutional Techniques
Methodology
Overall Architecture
GLACE Module Optimization
Advanced Feature Extraction Backbone
Multi-Scale Feature Integration and Upsampling
Decoder and Loss Function
Experiments
COCO Benchmark
MPII Benchmark
Ablation Studies
...and 2 more sections

Figures (2)

Figure 1: The comparison of GateAttentionPose and advanced methods on the COCO test-dev2017 set regarding model size and precision. The size of each bubble represents the input size of the model.
Figure 2: The overall network architecture of our GateAttentionPose, as well as the (a) Downsample Block (DSB), the (b) SENet Block (SEBlock), and the (c) Gate-Enhanced Feedforward Block (GEFB).
