RAG-HAR: Retrieval Augmented Generation-based Human Activity Recognition
Nirhoshan Sivaroopan, Hansi Karunarathna, Chamara Madarasingha, Anura Jayasumana, Kanchana Thilakarathna
TL;DR
RAG-HAR is a training-free HAR framework that combines retrieval-augmented reasoning with large language models (LLMs). By converting windowed time-series statistics into text embeddings and indexing them in a vector database, it retrieves semantically similar examples to ground LLM predictions, achieving state-of-the-art results across six benchmarks without dataset-specific training. The authors further improve performance with prompt optimization and LLM-based activity descriptors, which enable better retrieval and reasoning, and they demonstrate robust open-set recognition and meaningful labeling of unseen activities. This approach offers a scalable, cost-effective alternative to traditional deep learning-based HAR, with practical deployment advantages in pervasive sensing environments.
Abstract
Human Activity Recognition (HAR) underpins applications in healthcare, rehabilitation, fitness tracking, and smart environments, yet existing deep learning approaches demand dataset-specific training, large labeled corpora, and significant computational resources. We introduce RAG-HAR, a training-free retrieval-augmented framework that leverages large language models (LLMs) for HAR. RAG-HAR computes lightweight statistical descriptors, retrieves semantically similar samples from a vector database, and uses this contextual evidence to guide LLM-based activity identification. We further enhance RAG-HAR by applying prompt optimization and by introducing an LLM-based activity descriptor that generates context-enriched vector databases, which deliver accurate and highly relevant contextual information. With these mechanisms, RAG-HAR achieves state-of-the-art performance across six diverse HAR benchmarks. Most importantly, RAG-HAR attains these improvements without requiring model training or fine-tuning, emphasizing its robustness and practical applicability. RAG-HAR also moves beyond known behaviors, enabling the recognition and meaningful labeling of multiple unseen human activities.
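
The describe, retrieve, and prompt steps outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. All names (`describe_window`, `retrieve`, `build_prompt`) and the toy accelerometer values are hypothetical, and it makes two simplifying assumptions: cosine similarity over raw per-axis statistics stands in for the text-embedding vector database, and the final LLM call is omitted, so the sketch stops at the prompt that would be sent.

```python
import math
from statistics import mean, pstdev


def describe_window(window):
    """Return per-axis (mean, std, min, max) for a window of (x, y, z) samples."""
    feats = []
    for axis in zip(*window):  # transpose: one tuple of readings per axis
        feats.extend([mean(axis), pstdev(axis), min(axis), max(axis)])
    return feats


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query, index, k=3):
    """Return the k (descriptor, label) entries most similar to the query."""
    ranked = sorted(index, key=lambda item: cosine(query, item[0]), reverse=True)
    return ranked[:k]


def build_prompt(query, neighbours):
    """Assemble retrieved examples and the query into a text prompt for an LLM."""
    lines = ["Labeled examples (per-axis mean, std, min, max):"]
    for desc, label in neighbours:
        lines.append(f"- {label}: {[round(v, 2) for v in desc]}")
    lines.append(f"Query window: {[round(v, 2) for v in query]}")
    lines.append("Which activity is the query window most likely to be?")
    return "\n".join(lines)


# Toy accelerometer windows (made-up values, assumed to be in m/s^2).
walking = [(0.1, 9.8, 0.3), (0.9, 10.6, -0.2), (-0.7, 9.1, 0.4), (0.8, 10.4, -0.3)]
sitting = [(0.0, 9.8, 0.1), (0.01, 9.81, 0.1), (0.0, 9.79, 0.11), (0.01, 9.8, 0.1)]
index = [(describe_window(walking), "walking"), (describe_window(sitting), "sitting")]

query = describe_window(
    [(0.2, 9.7, 0.2), (0.8, 10.5, -0.1), (-0.6, 9.2, 0.3), (0.7, 10.3, -0.2)]
)
neighbours = retrieve(query, index, k=1)
prompt = build_prompt(query, neighbours)
print(prompt)
```

Because the labeled examples sit in an index rather than in model weights, adding a new activity in this sketch only means appending another `(descriptor, label)` pair, which mirrors the training-free property the abstract emphasizes.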
