RAGent: Physics-Aware Agentic Reasoning for Training-Free mmWave Human Activity Recognition

Mingda Han, Huanqi Yang, Zehua Sun, Wenhao Li, Yanni Yang, Guoming Zhang, Yetong Cao, Weitao Xu, Pengfei Hu

Abstract

Millimeter-wave (mmWave) radar enables privacy-preserving human activity recognition (HAR), yet real-world deployment remains hindered by costly annotation and poor transferability under domain shift. Although prior efforts partially alleviate these challenges, most still require retraining or adaptation for each new deployment setting. This keeps mmWave HAR in a repeated collect-tune-redeploy cycle, making scalable real-world deployment difficult. In this paper, we present RAGent, a deployment-time training-free framework for mmWave HAR that reformulates recognition as evidence-grounded inference over reusable radar knowledge rather than deployment-specific model optimization. Offline, RAGent constructs a reusable radar knowledge base through constrained cross-modal supervision, where a Vision-Language Model (VLM) transfers activity semantics from synchronized videos to paired radar segments without manual radar annotation. At deployment time, RAGent recognizes activities from radar alone by retrieving physically comparable precedents in an explicit kinematic space and resolving the final label through structured multi-role reasoning. The reasoning protocol is further refined offline through zero-gradient self-evolution. Extensive experiments on a self-collected dataset show that RAGent achieves 93.39% accuracy without per-domain retraining or target-domain adaptation, while generalizing robustly across domains.
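To make the retrieval step concrete: RAGent matches a query radar segment against stored precedents in an explicit kinematic feature space. The sketch below is a minimal nearest-neighbor illustration of that idea, not the paper's implementation; the feature dimensionality, the Euclidean metric, and the function names are assumptions for illustration only.

```python
import numpy as np

def retrieve_precedents(query_feat, kb_feats, kb_labels, k=5):
    """Illustrative sketch: retrieve the k knowledge-base precedents whose
    kinematic feature vectors are closest to the query segment.

    query_feat: (D,) kinematic feature vector of the radar query
    kb_feats:   (N, D) feature matrix of the stored precedents
    kb_labels:  list of N activity labels
    Returns a list of (label, distance) pairs, nearest first.
    """
    # Euclidean distance from the query to every stored precedent.
    dists = np.linalg.norm(kb_feats - query_feat, axis=1)
    # Indices of the k smallest distances.
    idx = np.argsort(dists)[:k]
    return [(kb_labels[i], float(dists[i])) for i in idx]
```

In the full framework the retrieved precedents are not voted on directly; they serve as evidence for the structured multi-role reasoning stage that resolves the final label.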

Paper Structure

This paper contains 48 sections, 10 equations, 13 figures, 4 tables, 1 algorithm.

Figures (13)

  • Figure 1: Paradigm shift in mmWave HAR. RAGent moves beyond conventional methods by grounding recognition in retrieved precedents and verifiable kinematic evidence through agentic reasoning.
  • Figure 2: Two key challenges in training-free mmWave HAR.
  • Figure 3: RAGent overview. The framework follows an offline-to-online workflow with three stages: (1) Knowledge-Base Construction, (2) Physics-Driven Numerical Retrieval, and (3) Council-of-Experts Reasoning.
  • Figure 4: Radar-video temporal synchronization. The optimal offset is estimated by maximizing the cross-correlation between the radar and video motion-energy envelopes.
  • Figure 5: Activity segmentation from continuous mmWave streams.
  • ...and 8 more figures
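Figure 4 describes radar-video synchronization as finding the offset that maximizes the cross-correlation between the two motion-energy envelopes. A minimal sketch of that procedure, assuming both envelopes are 1-D arrays resampled to a common rate (the function name and signature are illustrative, not from the paper):

```python
import numpy as np

def estimate_offset(radar_energy, video_energy, fps):
    """Illustrative sketch: estimate the temporal offset (seconds) that best
    aligns the radar and video motion-energy envelopes, per Figure 4.

    Both inputs are 1-D arrays sampled at the same rate (fps).
    A positive result means the radar envelope lags the video envelope.
    """
    # Zero-mean the envelopes so the correlation measures shape, not level.
    r = radar_energy - radar_energy.mean()
    v = video_energy - video_energy.mean()
    # Full cross-correlation over every possible lag.
    corr = np.correlate(r, v, mode="full")
    # Convert the peak index to a signed lag in samples:
    # indices run over lags -(len(v)-1) .. len(r)-1.
    lag = int(np.argmax(corr)) - (len(v) - 1)
    return lag / fps
```

The correlation peak picks the shift at which bursts of motion in both modalities coincide, which is robust to the two sensors measuring motion energy on different scales.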