GazeNoter: Co-Piloted AR Note-Taking via Gaze Selection of LLM Suggestions to Match Users' Intentions
Hsin-Ruey Tsai, Shih-Kang Chiu, Bryan Wang
TL;DR
GazeNoter is an AI-copiloted AR note-taking system in which users select LLM-generated suggestions by gaze, enabling both within-context and beyond-context notes with minimal distraction. It integrates LLM-driven extraction, derivation, and organization with an eye-tracking ring input on an AR headset to support real-time, low-load note-taking during speeches and walking meetings. Across two user studies, GazeNoter outperformed manual typing and auto-generated notes in quantity, quality, and usability, with AR providing advantages in terms of distraction and social acceptance. The work demonstrates a practical path for real-time, user-in-the-loop AI in XR, with potential benefits for meeting capture, Q&A preparation, and on-the-fly idea capture.
Abstract
Note-taking is critical during speeches and discussions. It serves not only later summarization and organization but also real-time needs: reminding users of their questions and opinions in question-and-answer sessions, and supporting timely contributions in discussions. Manually typing notes on a smartphone can be distracting and can increase users' cognitive load. While large language models (LLMs) are used to automatically generate summaries and highlights, content generated by artificial intelligence (AI) may not match users' intentions without user input or interaction. Therefore, we propose GazeNoter, an AI-copiloted augmented reality (AR) system that allows users to swiftly select among diverse LLM-generated suggestions via gaze on an AR headset for real-time note-taking. GazeNoter leverages the AR headset as a medium for users to swiftly adjust the LLM output to match their intentions, forming a user-in-the-loop AI system for both within-context and beyond-context notes. We conducted two user studies to verify the usability of GazeNoter: one while attending speeches in a static sitting condition, and one during walking meetings and discussions in a mobile walking condition.
