"What Are You Doing?": Effects of Intermediate Feedback from Agentic LLM In-Car Assistants During Multi-Step Processing
Johannes Kirmayr, Raphael Wennmacher, Khanh Huynh, Lukas Stappen, Elisabeth André, Florian Alt
TL;DR
The paper examines how agentic LLM-based in-car assistants should communicate during long-running, multi-step tasks. A controlled mixed-methods study (N=45) compares Planning & Results (PR) intermediate feedback with No Intermediate (NI) final-only delivery, in both stationary and driving contexts. Quantitatively, PR improves perceived speed, user experience, and trust while reducing task load, with effect sizes such as $d_z=1.01$ for perceived speed and $d_z=0.38$ for trust. Qualitative data reveal a preference for adaptive verbosity: high initial transparency, reduced as the system proves reliable, and re-expanded for novel or high-stakes tasks. The findings translate into design implications for feedback timing and content: (i) frequent, content-rich intermediate updates during long computations; (ii) adaptive verbosity gated by demonstrated reliability; and (iii) situational controls that balance transparency against distraction in dual-task driving and other primary-task contexts.
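For readers unfamiliar with the notation, $d_z$ conventionally denotes the standardized mean difference for paired (within-subject) measurements. The following is a sketch of that conventional definition, assuming the study uses it; the symbols $D_i$, $\bar{D}$, $s_D$, $x_i^{\mathrm{PR}}$, and $x_i^{\mathrm{NI}}$ are illustrative and do not come from the paper.

$$
D_i = x_i^{\mathrm{PR}} - x_i^{\mathrm{NI}}, \qquad
\bar{D} = \frac{1}{N}\sum_{i=1}^{N} D_i, \qquad
s_D = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(D_i - \bar{D}\right)^2}, \qquad
d_z = \frac{\bar{D}}{s_D}
$$

Under Cohen's commonly cited benchmarks (0.2 small, 0.5 medium, 0.8 large), $d_z=1.01$ would count as a large effect and $d_z=0.38$ as small to medium, although those thresholds are rough conventions rather than strict cutoffs.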
Abstract
Agentic AI assistants that autonomously perform multi-step tasks raise open questions for user experience: how should such systems communicate progress and reasoning during extended operations, especially in attention-critical contexts such as driving? We investigate feedback timing and verbosity in agentic LLM-based in-car assistants through a controlled, mixed-methods study (N=45) comparing feedback on planned steps and intermediate results against silent operation with a final-only response. Using a dual-task paradigm with an in-car voice assistant, we found that intermediate feedback significantly improved perceived speed, trust, and user experience while reducing task load, and these effects held across varying task complexities and interaction contexts. Interviews further revealed user preferences for an adaptive approach: high initial transparency to establish trust, followed by progressively reduced verbosity as the system proves reliable, with adjustments based on task stakes and situational context. We translate our empirical findings into design implications for feedback timing and verbosity in agentic assistants, balancing transparency and efficiency.
