Leveraging Cognitive States for Adaptive Scaffolding of Understanding in Explanatory Tasks in HRI

André Groß, Birte Richter, Bjarne Thomzik, Britta Wrede

TL;DR

This paper investigates adaptive verbal scaffolding in human-robot interaction using the SHIFT model to tailor explanations to the human's cognitive state. SHIFT uses gaze, performance, and interaction history to select strategies such as negation and hesitation; the study compares this adaptive scaffolding with a baseline that uses only affirmations. The results show that adaptive scaffolding tends to increase processing costs but reduces errors by about 23% overall, with significant improvements in several cognitive states and task contexts, though some states and tasks show limited or negative benefits. These findings highlight the potential and limitations of state-aware explanations for personalizing robot explainability and guide future refinements in task complexity, content adaptation, and learning-based strategy selection.

Abstract

Understanding how scaffolding strategies influence human understanding in human-robot interaction is important for developing effective assistive systems. This empirical study investigates two linguistic scaffolding strategies: negation, which de-biases the user from potential errors but increases processing costs, and hesitation, which serves to ameliorate processing costs. In the adaptive strategy, the user's state, in terms of current understanding and processing capacity, was estimated via a scoring scheme based on task performance, the prior scaffolding strategy, and current eye gaze behavior. In the study, the adaptive strategy of providing negations and hesitations was compared with a non-adaptive strategy of providing only affirmations. The adaptive scaffolding strategy was generated using the computational model SHIFT. Our findings indicate that adaptive scaffolding with SHIFT tends to (1) increase processing costs, as reflected in longer reaction times, but (2) improve task understanding, as evidenced by an error rate that was almost 23% lower. We assessed the efficiency of SHIFT's selected scaffolding strategies across different cognitive states and found that in three out of five states the error rate was lower than in the baseline condition. We discuss how these results align with the assumptions of the SHIFT model and highlight areas for refinement. Moreover, we demonstrate how scaffolding strategies such as negation and hesitation contribute to more effective human-robot explanatory dialogues.
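
The idea of estimating the user state from task performance, the prior strategy, and gaze, and then choosing a verbal strategy, can be illustrated with a minimal Python sketch. This is not the SHIFT model: the `Observation` fields, the weights, the thresholds, and the function name `select_strategy` are all hypothetical and chosen only to make the control flow concrete.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical per-step signals; the two floats are normalized to [0, 1]."""
    task_success: float   # recent task performance (1.0 = no errors)
    gaze_on_task: float   # share of gaze time on task-relevant areas
    last_strategy: str    # strategy used in the previous step


def select_strategy(obs: Observation) -> str:
    """Pick a verbal scaffolding strategy from illustrative scores.

    Weights and thresholds are placeholders, not values from the paper.
    """
    # Assumed understanding score: mostly performance, partly gaze.
    understanding = 0.7 * obs.task_success + 0.3 * obs.gaze_on_task

    # Assumption: a negation in the previous step leaves less spare
    # processing capacity, since negations raise processing costs.
    penalty = 0.3 if obs.last_strategy == "negation" else 0.0
    capacity = obs.gaze_on_task - penalty

    if capacity < 0.4:
        return "hesitation"   # give the listener time to process
    if understanding < 0.6:
        return "negation"     # steer the listener away from a likely error
    return "affirmation"


if __name__ == "__main__":
    print(select_strategy(Observation(0.2, 0.9, "affirmation")))
```

In this sketch, low estimated capacity takes priority over low estimated understanding; that ordering is a design choice of the example, not a claim about how SHIFT resolves the two.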

Paper Structure

This paper contains 19 sections, 6 figures, 2 tables.

Figures (6)

  • Figure 1: HRI study design: The NAO robot provides verbal instructions to guide humans in completing tasks on a touchscreen. The explanation generation is based on human monitoring.
  • Figure 2: Visualization of tasks with three target stimuli: the main object to be manipulated in the center and two objects (tool 1, tool 2) visible in the upper corners and associated with corresponding actions (action 1, action 2) from the task table (tab:tasks).
  • Figure 3: Overview of the task sequence, including: (1) the overarching goal of medication preparation, (2) verbal presentation of the patient's medical history and action instructions, (3-4) selection of the appropriate tool, (5) initiation of the interaction task and execution of the correct gesture, and (6) verbal feedback on task completion. Reaction times for both subtasks are measured at points (3) and (5), based on the first interaction with the touchscreen.
  • Figure 4: Processing costs, measured as the time in seconds until the first interaction with the touchscreen, averaged over the selection and interaction tasks. Shown for the experiment run with SHIFT and with affirmations only (baseline, BL).
  • Figure 5: Evaluation of task understanding by task failure rate. Left: Comparison of task failures as error rates for SHIFT and the baseline. In the baseline condition, the verbal instruction is always an affirmation. With SHIFT, the strategies are selected based on the observed cognitive state of the human. Right: Change in the total number of task failures over time. The patient (iteration) number indicates the repetition within a task; each task is repeated 4 times. The numbers at the bottom indicate the correct tool to be selected for task completion, serving as the target of discourse. The order of these targets follows specific patterns, including alternating (2, 1, 2, 1), paired (1, 1, 2, 2), hugging (1, 2, 2, 1), biased (1, 1, 1, 2), and converging (2, 2, 1, 2) arrangements. Each pattern varies in how the tool selections are distributed and repeated across iterations.
  • ...and 1 more figure
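
The target-order patterns named in the Figure 5 caption can be written down directly, which makes it easier to see how they differ. The sketch below encodes the five sequences as given in the caption; the names `TARGET_ORDERS` and `count_switches` are illustrative, and counting target switches is only one assumed way to compare the patterns, not a measure taken from the paper.

```python
# Correct tool (1 or 2) per iteration, as listed in the Figure 5 caption.
TARGET_ORDERS = {
    "alternating": (2, 1, 2, 1),
    "paired": (1, 1, 2, 2),
    "hugging": (1, 2, 2, 1),
    "biased": (1, 1, 1, 2),
    "converging": (2, 2, 1, 2),
}


def count_switches(order):
    """Count how often the target tool changes between consecutive iterations."""
    return sum(a != b for a, b in zip(order, order[1:]))


if __name__ == "__main__":
    for name, order in TARGET_ORDERS.items():
        print(f"{name:12s} {order} switches={count_switches(order)}")
```

Under this measure the alternating pattern changes target at every step, while the paired and biased patterns change only once.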