Less or More: Towards Glanceable Explanations for LLM Recommendations Using Ultra-Small Devices

Xinru Wang, Mengjie Yu, Hannah Nguyen, Michael Iuzzolino, Tianyi Wang, Peiqi Tang, Natasha Lynova, Co Tran, Ting Zhang, Naveen Sendhilnathan, Hrvoje Benko, Haijun Xia, Tanya Jonker

TL;DR

The paper tackles the challenge of delivering glanceable LLM explanations for action recommendations on ultra-small devices by proposing spatially structured explanations and temporally adaptive presentation based on confidence. It introduces a Socratic Models–based pipeline that converts verbose explanations into four contextual components (activity, object, location, goal) and estimates a calibrated confidence score $c_{hybrid}$ that controls the adaptive display. A user study with 44 participants shows that always-on structured explanations reduce reading time and cognitive load and increase acceptance, but can reduce perceived detail and naturalness; adaptive explanations add user control but may hinder interaction because of the extra toggling they require. The study yields design implications for content selection, timing, and personalization, highlighting a trade-off between glanceability and explanation depth. Overall, the work provides a foundation for designing responsible, glanceable AI explanations on ultra-small devices and points to future work in multimodal, personalized, and real-time explanations.

Abstract

Large Language Models (LLMs) have shown remarkable potential in recommending everyday actions as personal AI assistants, while Explainable AI (XAI) techniques are being increasingly utilized to help users understand why a recommendation is given. Personal AI assistants today are often located on ultra-small devices such as smartwatches, which have limited screen space. The verbosity of LLM-generated explanations, however, makes it challenging to deliver glanceable LLM explanations on such ultra-small devices. To address this, we explored 1) spatially structuring an LLM's explanation text using defined contextual components during prompting and 2) presenting temporally adaptive explanations to users based on confidence levels. We conducted a user study to understand how these approaches impacted user experiences when interacting with LLM recommendations and explanations on ultra-small devices. The results showed that structured explanations reduced users' time to action and cognitive load when reading an explanation. Always-on structured explanations increased users' acceptance of AI recommendations. However, users were less satisfied with structured explanations compared to unstructured ones due to their lack of sufficient, readable details. Additionally, adaptively presenting structured explanations was less effective at improving user perceptions of the AI compared to the always-on structured explanations. Together with users' interview feedback, the results led to design implications to be mindful of when personalizing the content and timing of LLM explanations that are displayed on ultra-small devices.
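The two ideas in the abstract can be made concrete with a small Python sketch. It is illustrative only and is not the authors' implementation: the class names, the `render` format, the `show_explanation` helper, the 0.5 threshold, and the `show_when_low` switch are all assumptions introduced here, and the example values are made up. Only the four component labels (activity, object, location, goal) and the use of a confidence level to gate the display come from the paper's summary.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Component:
    label: str         # one of "activity", "object", "location", "goal"
    text: str          # short phrase shown on the watch face
    confidence: float  # assumed to lie in [0, 1]


@dataclass
class StructuredExplanation:
    recommendation: str
    components: List[Component]

    def render(self) -> str:
        # One short line per component instead of a free-form paragraph.
        return "\n".join(f"[{c.label}] {c.text}" for c in self.components)


def show_explanation(confidence: float,
                     threshold: float = 0.5,
                     show_when_low: bool = True) -> bool:
    """Hypothetical adaptive-display rule.

    The threshold and the direction of the rule (show on low vs. high
    confidence) are assumptions, exposed as parameters on purpose.
    """
    is_low = confidence < threshold
    return is_low if show_when_low else not is_low


# Made-up example values, for illustration only.
explanation = StructuredExplanation(
    recommendation="Start a timer",
    components=[
        Component("activity", "cooking", 0.9),
        Component("object", "pan", 0.8),
        Component("location", "kitchen", 0.95),
        Component("goal", "track cooking time", 0.6),
    ],
)
rendered = explanation.render()
```

Keeping each component as its own short line is what makes the output scannable on a small screen; the gating function is kept separate so the display policy can change without touching the content.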

Paper Structure

This paper contains 44 sections, 7 figures, 1 table.

Figures (7)

  • Figure 1: Our LLM pipeline, which was based on Socratic Models, used pre-trained vision-language models to generate a linguistic summary of a video input (detected objects and user physical actions) for downstream processing with an LLM (GPT-4). To obtain the LLM self-explanations, the LLM first summarized all possible contexts (i.e., the [activity] the user is doing, the [object] the user is interacting with, and the [location] the user is in) and then inferred the short-term [goal] that the user may want to achieve. Based on the inferred goal, the LLM then provided a digital action recommendation. Throughout the process, we calculated confidence levels of the output from pre-trained vision-language models. The LLM was then prompted for confidence levels for each contextual component (i.e., [activity], [object], [location], and [goal]) and the recommendation.
  • Figure 2: The prompt template included an input description, explanation instructions with few-shot in-context examples (blue text in italics), output formatting instructions, and the target input (pink text in italics, changed for every query).
  • Figure 3: The interface that participants saw on the desktop computer. On the left, they could watch the 30-second Ego4D video. On the right, they could choose whether to accept or dismiss the AI’s recommendation on the smartwatch UI.
  • Figure 4: The smartwatch UI designs for the four conditions in the user study. (d)-(f) illustrate different variations of the adaptive structured explanation condition.
  • Figure 5: Participants’ time to action and acceptance rate for each AI explanation condition. The error bars represent standard errors. (*: $p$ < 0.05; **: $p$ < 0.01; ***: $p$ < 0.001)
  • ...and 2 more figures