Learning to Wait: Synchronizing Agents with the Physical World
Yifei She, Ping Zhang, He Liu, Yanmin Jia, Yang Jing, Zijun Liu, Peng Sun, Xiangbin Li, Xiaohe Hu
TL;DR
The paper tackles the Temporal Gap between an agent's actions and their delayed feedback in real-world, asynchronous environments. It argues for an Agent-side approach: by extending Code-as-Action to the temporal domain and drawing on semantic priors and In-Context Learning, an LLM agent can predict precise waiting durations and synchronize its internal clock with the physical world, avoiding costly environment-side polling. The approach is validated in a simulated Kubernetes-like setting with Gamma-distributed latencies, where Inter-Episode History Feedback lets the LLMs progressively calibrate their timing. Results show that LLMs can learn to align their Cognitive Timeline with environmental latency, though the learning dynamics depend on the model, suggesting that temporal awareness is a learnable capability essential for autonomous, self-evolving agents in open-ended environments.
Abstract
Real-world agentic tasks, unlike synchronous Markov Decision Processes (MDPs), often involve non-blocking actions with variable latencies, creating a fundamental \textit{Temporal Gap} between action initiation and completion. Existing environment-side solutions, such as blocking wrappers or frequent polling, either limit scalability or dilute the agent's context window with redundant observations. In this work, we propose an \textbf{Agent-side Approach} that empowers Large Language Models (LLMs) to actively align their \textit{Cognitive Timeline} with the physical world. By extending the Code-as-Action paradigm to the temporal domain, agents use semantic priors and In-Context Learning (ICL) to predict precise waiting durations (\texttt{time.sleep(t)}), effectively synchronizing with asynchronous environments without exhaustive checking. Experiments in a simulated Kubernetes cluster demonstrate that agents can precisely calibrate their internal clocks to minimize both query overhead and execution latency, validating that temporal awareness is a learnable capability essential for autonomous evolution in open-ended environments.
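To make the trade-off between polling and a predicted wait concrete, the following is a minimal sketch, not the paper's implementation. It assumes a simulated clock (a real agent would call time.sleep(t) instead of adding to a counter), Gamma-distributed completion latencies with hypothetical shape and scale values, and a running average of past latencies as a crude stand-in for the LLM's in-context prediction. All function names and parameters are illustrative.

```python
import random


def sample_latency(rng, shape=4.0, scale=5.0):
    """Draw one simulated completion latency in seconds.

    The Gamma shape/scale values are assumptions chosen for illustration.
    """
    return rng.gammavariate(shape, scale)


def polling_cost(latency, interval=1.0):
    """Environment-side baseline: check status every `interval` seconds.

    Returns (number of status checks, simulated time at which completion is seen).
    """
    checks = 0
    elapsed = 0.0
    while True:
        elapsed += interval  # stands in for time.sleep(interval)
        checks += 1
        if elapsed >= latency:
            return checks, elapsed


def predicted_wait_cost(latency, predicted, interval=1.0):
    """Agent-side wait: one long sleep of `predicted` seconds, then one check.

    If the task is still running after the first check, fall back to
    short polls. Returns (number of status checks, simulated finish time).
    """
    elapsed = predicted  # stands in for time.sleep(predicted)
    checks = 1
    while elapsed < latency:
        elapsed += interval
        checks += 1
    return checks, elapsed


def run_episodes(n_episodes=50, initial_guess=5.0, seed=0):
    """Compare total status checks across episodes.

    The predictor is the mean of latencies observed in earlier episodes,
    a simple placeholder for history feedback given to an LLM.
    """
    rng = random.Random(seed)
    history = []
    total_poll_checks = 0
    total_pred_checks = 0
    for _ in range(n_episodes):
        latency = sample_latency(rng)
        predicted = sum(history) / len(history) if history else initial_guess
        total_poll_checks += polling_cost(latency)[0]
        total_pred_checks += predicted_wait_cost(latency, predicted)[0]
        history.append(latency)  # feedback carried into the next episode
    return total_poll_checks, total_pred_checks


if __name__ == "__main__":
    poll, pred = run_episodes()
    print(f"polling checks: {poll}, predicted-wait checks: {pred}")
```

In this sketch, overestimating the wait costs extra latency while underestimating it costs extra status checks, which is the tension between execution latency and query overhead that the abstract describes.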
