CAMPHOR: Collaborative Agents for Multi-input Planning and High-Order Reasoning On Device
Yicheng Fu, Raviteja Anantha, Jianpeng Cheng
TL;DR
CAMPHOR addresses the latency and privacy constraints of deploying language-enabled assistants on devices by introducing a hierarchical, on-device, multi-agent framework that coordinates specialized agents for personal context retrieval, device information, and task completion. The system uses dynamic prompt construction and prompt compression to fit memory budgets while maintaining high reasoning accuracy, and it validates the approach with the CAMPHOR dataset of device-grounded trajectories. Fine-tuned small language models outperform closed-source LLM baselines on task-completion metrics and eliminate server-device communication costs, underscoring the practical viability of private, on-device reasoning. The work offers a concrete dataset, a scalable architecture, and targeted compression techniques that together enable responsive, privacy-preserving mobile assistants with strong task performance.
Abstract
While server-side Large Language Models (LLMs) demonstrate proficiency in function calling and complex reasoning, deploying Small Language Models (SLMs) directly on devices offers opportunities to improve latency and privacy but also introduces unique challenges for accuracy and memory. We introduce CAMPHOR, an on-device SLM multi-agent framework designed to handle multiple user inputs and reason over personal context locally, so that private data stays on the device. CAMPHOR employs a hierarchical architecture in which a high-order reasoning agent decomposes complex tasks and coordinates expert agents responsible for personal context retrieval, tool interaction, and dynamic plan generation. By sharing parameters across agents and applying prompt compression, we significantly reduce model size, latency, and memory usage. To validate our approach, we present a novel dataset capturing multi-agent task trajectories centered on personalized mobile assistant use cases. Our experiments show that fine-tuned SLM agents not only surpass closed-source LLMs in task completion F1 by 35% but also eliminate the need for server-device communication, all while enhancing privacy.
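The hierarchical design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `SharedModel`, `ExpertAgent`, `HighOrderAgent`, and the fixed plan in `decompose` are all hypothetical names and simplifications introduced here. The sketch assumes that parameter sharing means every agent calls the same underlying model and differs only in its instruction prompt; the model itself is a stub that echoes its prompt rather than running inference.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


class SharedModel:
    """Hypothetical stand-in for a single on-device SLM (no real inference)."""

    def __init__(self) -> None:
        self.calls = 0  # counts how many times any agent invoked the model

    def generate(self, prompt: str) -> str:
        self.calls += 1
        return f"[output for: {prompt}]"


@dataclass
class ExpertAgent:
    name: str
    instruction: str
    model: SharedModel  # the same object for every agent (assumed parameter sharing)

    def run(self, subtask: str) -> str:
        # Each expert differs only in the instruction prepended to the subtask.
        prompt = f"{self.instruction}\nTask: {subtask}"
        return self.model.generate(prompt)


@dataclass
class HighOrderAgent:
    model: SharedModel
    experts: Dict[str, ExpertAgent]

    def decompose(self, request: str) -> List[Tuple[str, str]]:
        # Fixed plan for illustration only: send the request to every expert
        # in registration order. A real planner would generate this dynamically.
        return [(name, request) for name in self.experts]

    def handle(self, request: str) -> List[Tuple[str, str]]:
        # Dispatch each planned subtask to its expert and collect the outputs.
        return [(name, self.experts[name].run(sub))
                for name, sub in self.decompose(request)]


if __name__ == "__main__":
    model = SharedModel()
    names = ("personal_context", "tool", "planner")
    experts = {n: ExpertAgent(n, f"You are the {n} agent.", model) for n in names}
    assistant = HighOrderAgent(model, experts)
    for name, output in assistant.handle("Message Alex my ETA"):
        print(name, "->", output)
```

Because every agent holds a reference to the same `SharedModel` instance, adding an expert adds only a prompt, not a second set of weights. That is the memory argument the abstract makes, shown here in its simplest form.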
