UFO$^3$: Weaving the Digital Agent Galaxy

Chaoyun Zhang, Liqun Li, He Huang, Chiming Ni, Bo Qiao, Si Qin, Yu Kang, Minghua Ma, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang

TL;DR

UFO$^3$ addresses the fragmentation of intelligent agents by unifying heterogeneous devices into a single cross-device orchestration fabric. It introduces TaskConstellations, a mutable DAG-based representation that decomposes user intents into TaskStars and TaskStarLines, enabling asynchronous, adaptive execution across desktops, servers, mobile, and edge devices. A central ConstellationAgent performs LLM-driven planning and replanning, while the Constellation Orchestrator executes tasks with safe locking, batched edits, and strong invariants, all coordinated via the Agent Interaction Protocol (AIP). The framework is instantiated with device-agent templates (LinuxAgent, WindowsAgent, Android) and evaluated on NebulaBench, demonstrating substantial parallelism (average width ~1.72), robust fault handling, and a 31% end-to-end latency reduction compared to sequential baselines, illustrating a scalable path toward a shared, cross-device memory and memory-enabled collaboration across a growing ecosystem of agents. The work highlights a practical route to a Digital Agent Galaxy where reasoning, execution, and memory can be coordinated across diverse platforms, enabling reliable, intelligent cross-device automation at scale.

Abstract

Large language model (LLM)-powered agents are transforming digital devices from passive tools into proactive intelligent collaborators. However, most existing frameworks remain confined to a single OS or device, making cross-device workflows brittle and largely manual. We present UFO$^3$, a system that unifies heterogeneous endpoints (desktops, servers, mobile devices, and edge devices) into a single orchestration fabric. UFO$^3$ models each user request as a mutable TaskConstellation: a distributed DAG of atomic subtasks (TaskStars) with explicit control and data dependencies (TaskStarLines). The TaskConstellation continuously evolves as results stream in from distributed devices, enabling asynchronous execution, adaptive recovery, and dynamic optimization. A Constellation Orchestrator executes tasks safely and asynchronously while applying dynamic DAG updates, and the Agent Interaction Protocol (AIP) provides persistent, low-latency channels for reliable task dispatch and result streaming. These designs dissolve the traditional boundaries between devices and platforms, allowing agents to collaborate seamlessly and amplify their collective intelligence. We evaluate UFO$^3$ on NebulaBench, a benchmark of 55 cross-device tasks across 5 machines and 10 categories. UFO$^3$ achieves 83.3% subtask completion and 70.9% task success, exposes parallelism with an average width of 1.72, and reduces end-to-end latency by 31% relative to a sequential baseline. Fault-injection experiments demonstrate graceful degradation and recovery under transient and permanent agent failures. These results show that UFO$^3$ achieves accurate, efficient, and resilient task orchestration across heterogeneous devices, uniting isolated agents into a coherent, adaptive computing fabric that extends across the landscape of ubiquitous computing.
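
To make the TaskConstellation idea concrete, the following is a minimal sketch of a mutable DAG whose nodes play the role of TaskStars and whose edges play the role of TaskStarLines. It is an illustration only, not the UFO$^3$ implementation: the class fields, the status values, the method names (`add_star`, `add_line`, `ready`), and the device and task names are all assumptions introduced here. The sketch shows two properties the abstract describes: edits to the graph must keep it acyclic, and any subtask whose upstream dependencies are complete can be dispatched in parallel.

```python
from dataclasses import dataclass, field


@dataclass
class TaskStar:
    """An atomic subtask bound to one device (fields are hypothetical)."""
    name: str
    device: str
    status: str = "pending"  # assumed states: pending, running, done, failed


@dataclass
class TaskConstellation:
    """A mutable DAG: stars are nodes, star lines are dependency edges."""
    stars: dict = field(default_factory=dict)
    lines: set = field(default_factory=set)  # (upstream, downstream) pairs

    def add_star(self, star):
        self.stars[star.name] = star

    def add_line(self, upstream, downstream):
        """Add a dependency edge, rejecting any edit that breaks acyclicity."""
        if upstream not in self.stars or downstream not in self.stars:
            raise KeyError("both endpoints must already be in the constellation")
        self.lines.add((upstream, downstream))
        if self._has_cycle():
            self.lines.discard((upstream, downstream))
            raise ValueError("edge would create a cycle")

    def _has_cycle(self):
        # Kahn's algorithm: if not every node can be peeled off, a cycle exists.
        indegree = {name: 0 for name in self.stars}
        for _, dst in self.lines:
            indegree[dst] += 1
        queue = [name for name, deg in indegree.items() if deg == 0]
        visited = 0
        while queue:
            node = queue.pop()
            visited += 1
            for src, dst in self.lines:
                if src == node:
                    indegree[dst] -= 1
                    if indegree[dst] == 0:
                        queue.append(dst)
        return visited != len(self.stars)

    def ready(self):
        """Names of pending stars whose upstream stars are all done."""
        blocked = {dst for src, dst in self.lines
                   if self.stars[src].status != "done"}
        return sorted(name for name, star in self.stars.items()
                      if star.status == "pending" and name not in blocked)


# Hypothetical request: gather data on two devices, then merge the results.
c = TaskConstellation()
c.add_star(TaskStar("collect_logs", "linux-server"))
c.add_star(TaskStar("export_sheet", "windows-desktop"))
c.add_star(TaskStar("write_report", "windows-desktop"))
c.add_line("collect_logs", "write_report")
c.add_line("export_sheet", "write_report")
print(c.ready())  # ['collect_logs', 'export_sheet']
```

In this toy example the two data-gathering subtasks have no dependency on each other, so both are ready at once and could run on different devices concurrently; `write_report` becomes ready only after both are marked done. A real orchestrator would additionally need locking around concurrent edits and a policy for failed subtasks, which this sketch omits.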

Paper Structure

This paper contains 102 sections, 5 equations, 26 figures, 5 tables, 1 algorithm.

Figures (26)

  • Figure 1: UFO$^3$: Weaving the Digital Agent Galaxy. A single natural-language intent is decomposed into a dynamically evolved Constellation (DAG) executed across heterogeneous devices. Demo video available at: https://www.youtube.com/watch?v=NGrVWGcJL8o.
  • Figure 2: Layered architecture of UFO$^3$.
  • Figure 3: Example of a TaskConstellation illustrating both sequential and parallel dependencies.
  • Figure 4: An overview of the ConstellationAgent.
  • Figure 5: Lifecycle state transitions of the ConstellationAgent.
  • ...and 21 more figures