Agentic Reasoning and Refinement through Semantic Interaction
Xuxin Tang, Rehema Abulikemu, Eric Krokos, Kirsten Whitley, Xuan Wang, Chris North
TL;DR
The paper tackles the challenge of incorporating sequential semantic interactions into sensemaking report refinement by introducing VIS-ReAct, a two-agent LLM framework. A primary LLM analysis agent ingests newly added semantic interactions to infer user intent and generate a contextual refinement plan, which a secondary LLM refinement agent then uses to update the report. The framework follows a four-step workflow: convert the workspace to text, extract interactions, analyze intent, and refine content. Quantitative and qualitative results show that VIS-ReAct achieves better targeted refinement, semantic fidelity, and transparent inference than the baselines, across diverse interaction types and granularities. The work advances human–AI collaboration in sensemaking by making the refinement process more interpretable and controllable, with potential applications beyond reporting to broader interactive writing and decision-support tasks.
Abstract
Writing a sensemaking report is an iterative process that often requires multiple rounds of refinement. While Large Language Models (LLMs) have shown promise in generating initial reports based on human visual workspace representations, they struggle to precisely incorporate sequential semantic interactions during the refinement process. We introduce VIS-ReAct, a framework that reasons about newly added semantic interactions in visual workspaces to steer the LLM for report refinement. VIS-ReAct is a two-agent framework: a primary LLM analysis agent interprets new semantic interactions to infer user intentions and generate a refinement plan, followed by an LLM refinement agent that updates the report accordingly. In a case study, VIS-ReAct outperforms both a baseline and VIS-ReAct (without LLM analysis) on targeted refinement, semantic fidelity, and transparent inference. Results demonstrate that VIS-ReAct better handles various interaction types and granularities while enhancing the transparency of human-LLM collaboration.
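
The two-agent flow can be illustrated with a minimal Python sketch that follows the four steps named in the TL;DR (convert the workspace to text, extract interactions, analyze intent, refine content). This is not the authors' implementation. The `Workspace` class, all function names, the prompt wording, and the `LLM` callable type are hypothetical placeholders chosen for illustration; any real system would substitute its own workspace model and LLM client.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical type: any function that maps a prompt string to a completion string.
LLM = Callable[[str], str]


@dataclass
class Workspace:
    """Hypothetical workspace model: document id -> text, plus an interaction log."""
    documents: Dict[str, str]
    interactions: List[str] = field(default_factory=list)


def workspace_to_text(ws: Workspace) -> str:
    """Step 1: serialize the visual workspace into plain text."""
    return "\n".join(f"[{doc_id}] {text}" for doc_id, text in sorted(ws.documents.items()))


def extract_new_interactions(ws: Workspace, seen: int) -> List[str]:
    """Step 2: keep only the interactions added since the last refinement."""
    return ws.interactions[seen:]


def analyze_intent(llm: LLM, workspace_text: str, new_interactions: List[str]) -> str:
    """Step 3: the analysis agent infers user intent and returns a refinement plan."""
    prompt = (
        "Workspace:\n" + workspace_text
        + "\n\nNew interactions:\n" + "\n".join(new_interactions)
        + "\n\nInfer the user's intent and write a refinement plan."
    )
    return llm(prompt)


def refine_report(llm: LLM, report: str, plan: str) -> str:
    """Step 4: the refinement agent updates the report according to the plan."""
    prompt = (
        "Current report:\n" + report
        + "\n\nRefinement plan:\n" + plan
        + "\n\nUpdate the report accordingly."
    )
    return llm(prompt)


def refine_step(
    ws: Workspace, report: str, seen: int, analysis_llm: LLM, refinement_llm: LLM
) -> Tuple[str, str, int]:
    """Run one refinement round; return (report, plan, interactions consumed so far)."""
    new_interactions = extract_new_interactions(ws, seen)
    if not new_interactions:
        # Nothing new to reason about: leave the report untouched.
        return report, "", seen
    plan = analyze_intent(analysis_llm, workspace_to_text(ws), new_interactions)
    updated = refine_report(refinement_llm, report, plan)
    return updated, plan, len(ws.interactions)
```

Returning the plan alongside the updated report is a design choice made for this sketch: it keeps the analysis agent's inferred intent visible to the user, in the spirit of the transparent inference the abstract describes. Passing only the interactions added since the previous round is one simple way to model the "newly added" semantic interactions.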
