ContextAgent: Context-Aware Proactive LLM Agents with Open-World Sensory Perceptions

Bufang Yang; Lilin Xu; Liekang Zeng; Kaiwei Liu; Siyang Jiang; Wenrui Lu; Hongkai Chen; Xiaofan Jiang; Guoliang Xing; Zhenyu Yan

TL;DR

ContextAgent addresses the need for open-world, context-aware proactive AI by using multimodal sensory data from wearable devices to infer user intent and autonomously initiate tool-based assistance. It introduces a two-stage framework: proactive-oriented context extraction, followed by a context-aware reasoner that generates thought traces and a proactive score $P_S$, and plans a tool chain when $P_S$ crosses a user-defined threshold. The authors also provide ContextAgentBench, a 1,000-sample benchmark spanning nine daily-life scenarios and twenty tools, and show that ContextAgent achieves state-of-the-art proactive prediction and tool-calling performance across multiple LLMs, including smaller models. The work highlights the value of combining sensory-perception data with persona context to build unobtrusive, user-centric AI assistants, and it offers a pathway toward broader, human-centered proactive AI deployments.

Abstract

Recent advances in Large Language Models (LLMs) have propelled intelligent agents from reactive responses to proactive support. While promising, existing proactive agents either rely exclusively on observations from enclosed environments (e.g., desktop UIs) with direct LLM inference or employ rule-based proactive notifications, leading to suboptimal user intent understanding and limited functionality for proactive service. In this paper, we introduce ContextAgent, the first context-aware proactive agent that incorporates extensive sensory contexts surrounding humans to enhance the proactivity of LLM agents. ContextAgent first extracts multi-dimensional contexts from massive sensory perceptions on wearables (e.g., video and audio) to understand user intentions. ContextAgent then leverages the sensory contexts and personas from historical data to predict the necessity for proactive services. When proactive assistance is needed, ContextAgent further automatically calls the necessary tools to assist users unobtrusively. To evaluate this new task, we curate ContextAgentBench, the first benchmark for evaluating context-aware proactive LLM agents, covering 1,000 samples across nine daily scenarios and twenty tools. Experiments on ContextAgentBench show that ContextAgent outperforms baselines by achieving up to 8.5% and 6.0% higher accuracy in proactive predictions and tool calling, respectively. We hope our research can inspire the development of more advanced, human-centric, proactive AI assistants. The code and dataset are publicly available at https://github.com/openaiotlab/ContextAgent.
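
The gating step described above (predict whether proactive service is needed, then call tools only if it is) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names ReasonerOutput and decide, the integer score scale, the tool name, and the choice of ">=" for "crossing" the threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ReasonerOutput:
    """Hypothetical container for what the reasoner is said to produce."""
    thought: str                      # thought trace explaining the decision
    proactive_score: int              # assumed integer scale; higher = more need
    tool_chain: list[str] = field(default_factory=list)  # planned tool names


def decide(output: ReasonerOutput, threshold: int) -> list[str]:
    """Return the tools to run, or an empty list to stay silent.

    Assumption: the score "crosses" the threshold when it is >= threshold.
    """
    if output.proactive_score >= threshold:
        return list(output.tool_chain)
    return []


# Hypothetical example: the same reasoner output under two user thresholds.
example = ReasonerOutput(
    thought="User is waiting at a bus stop and keeps checking the time.",
    proactive_score=4,
    tool_chain=["get_bus_schedule"],  # illustrative tool name only
)
tools_for_eager_user = decide(example, threshold=3)     # tools are called
tools_for_reserved_user = decide(example, threshold=5)  # agent stays silent
```

Making the threshold a per-user parameter is what lets the same prediction lead to assistance for one user and silence for another, which matches the abstract's goal of assisting unobtrusively.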
