An LLM-Powered Agent for Physiological Data Analysis: A Case Study on PPG-based Heart Rate Estimation
Mohammad Feli, Iman Azimi, Pasi Liljeberg, Amir M. Rahmani
TL;DR
The paper addresses the challenge of extracting reliable health insights from physiological time-series data with large language models (LLMs), which are constrained by token limits and weak numerical reasoning. It introduces an OpenCHA-based LLM-powered agent whose orchestrator connects a user interface, data sources, and analytical tools, using Tree of Thought prompting with GPT-3.5-turbo to deliver analytically grounded results. In a case study on HR estimation from PPG, the agent substantially outperforms the OpenAI GPT-4o-mini and GPT-4o baselines, achieving $MAE=2.83$, $RMSE=5.47$, $MAPE=0.04$, and $MAD=1.90$, with far fewer outliers and tighter agreement with the ECG ground truth. The work demonstrates a practical, reproducible pathway for integrating LLMs with established physiological analysis pipelines to enable context-aware, reliable remote health monitoring, and it provides public code for further research.
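As a reading aid for the reported error metrics, the following is a minimal sketch of how MAE, RMSE, and MAPE are conventionally computed for paired HR values. The function name `hr_error_metrics` and the sample values are hypothetical. MAPE is returned as a fraction, on the assumption that this matches the scale of the reported 0.04. MAD is left out because its exact definition is not given in this summary.

```python
import math


def hr_error_metrics(estimated, reference):
    """Return (MAE, RMSE, MAPE) for paired HR values.

    `estimated` holds HR estimates (e.g. from PPG) and `reference` holds the
    matching ground-truth values (e.g. from ECG). MAPE is a fraction, not a
    percentage (an assumption about the reporting convention).
    """
    if len(estimated) != len(reference) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(r == 0 for r in reference):
        raise ValueError("reference values must be non-zero for MAPE")

    errors = [e - r for e, r in zip(estimated, reference)]
    n = len(errors)

    mae = sum(abs(x) for x in errors) / n
    rmse = math.sqrt(sum(x * x for x in errors) / n)
    mape = sum(abs(x) / abs(r) for x, r in zip(errors, reference)) / n
    return mae, rmse, mape


# Hypothetical values, for illustration only (not data from the paper).
mae, rmse, mape = hr_error_metrics([72.0, 80.0, 65.0], [70.0, 80.0, 60.0])
print(f"MAE={mae:.2f}  RMSE={rmse:.2f}  MAPE={mape:.3f}")
```

RMSE penalizes large errors more heavily than MAE, so a wide gap between the two usually points to a small number of outlying estimates.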
Abstract
Large language models (LLMs) are revolutionizing healthcare by improving diagnosis, patient care, and decision support through interactive communication. More recently, they have been applied to physiological time series, such as wearable data, to extract health insights. Existing methods embed raw numerical sequences directly into prompts, which exceeds token limits and increases computational costs. Other studies have integrated features extracted from time series into textual prompts or applied multimodal approaches. However, these methods often produce generic and unreliable outputs because LLMs have limited analytical rigor and interpret continuous waveforms inefficiently. In this paper, we develop an LLM-powered agent for physiological time-series analysis that aims to bridge the gap between LLMs and well-established analytical tools. Built on OpenCHA, an open-source LLM-powered framework, and powered by OpenAI's GPT-3.5-turbo model, our agent features an orchestrator that integrates user interaction, data sources, and analytical tools to generate accurate health insights. To evaluate its effectiveness, we implement a case study on heart rate (HR) estimation from photoplethysmogram (PPG) signals using a dataset of PPG and electrocardiogram (ECG) recordings from a remote health monitoring study. The agent's performance is benchmarked against OpenAI GPT-4o-mini and GPT-4o, with ECG serving as the gold standard for HR estimation. Results demonstrate that our agent significantly outperforms the benchmark models, achieving lower error rates and more reliable HR estimates. The agent implementation is publicly available on GitHub.
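To make the case-study task concrete, the sketch below shows one simple way an analytical tool could turn a PPG segment into an HR value: detect pulse peaks and convert the mean peak-to-peak interval into beats per minute. This is an illustrative peak-interval approach on a clean synthetic signal, not the pipeline used by the agent. The function name `estimate_hr_bpm`, the mean-based threshold, and the refractory period `min_rr_s` are assumptions, and real recordings would normally need filtering and quality checks first.

```python
import math


def estimate_hr_bpm(ppg, fs, min_rr_s=0.3):
    """Estimate mean HR (beats per minute) from a PPG segment.

    ppg: list of samples; fs: sampling rate in Hz.
    min_rr_s: hypothetical refractory period that rejects peaks occurring
    too soon after the previous one.
    """
    baseline = sum(ppg) / len(ppg)
    peaks = []
    for i in range(1, len(ppg) - 1):
        # A peak is a local maximum that rises above the segment mean.
        is_local_max = ppg[i - 1] < ppg[i] >= ppg[i + 1]
        if is_local_max and ppg[i] > baseline:
            if not peaks or (i - peaks[-1]) / fs >= min_rr_s:
                peaks.append(i)
    if len(peaks) < 2:
        raise ValueError("need at least two peaks to estimate HR")

    # Mean interval between consecutive peaks, in seconds.
    intervals = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    return 60.0 / (sum(intervals) / len(intervals))


# Synthetic 10 s signal pulsing at 1.2 Hz (72 beats per minute), sampled at 50 Hz.
fs = 50
signal = [math.sin(2 * math.pi * 1.2 * n / fs) for n in range(10 * fs)]
print(f"Estimated HR: {estimate_hr_bpm(signal, fs):.1f} bpm")
```

In an agent setting, the LLM would call a tool like this and reason over the returned number, rather than reading the raw samples from the prompt.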
