Multimodal Feedback for Task Guidance in Augmented Reality

Hu Guo, Lily Patel, Rohan Gupt

TL;DR

This work investigates augmenting OST-AR guidance with wrist-based vibrotactile haptics to address depth-perception challenges in handheld tool tasks. It introduces a six-motor wristband, integrates it with an OST-AR pipeline, and conducts two pre-registered experiments (N=21 and N=27) showing that multimodal feedback improves depth accuracy and usability over visual-only or haptic-only conditions, albeit with a modest increase in task duration. The results yield design guidelines favoring pull-style, brief haptic signals and suggest multimodal guidance to mitigate occlusion and lighting limitations in AR, with implications for education, simulation, and cross-modal interfaces. Overall, wrist-based haptics can supplement OST-AR to enhance precision, reduce visual workload, and support high-precision manual tasks in AR environments.

Abstract

Optical see-through augmented reality (OST-AR) overlays digital targets and annotations on the physical world, offering promising guidance for hands-on tasks such as medical needle insertion or assembly. Recent work on OST-AR depth perception shows that target opacity and tool visualization significantly affect accuracy and usability; opaque targets and rendering the real instrument reduce depth errors, whereas transparent targets and absent tools impair performance. However, reliance on visual overlays may overload attention and leaves little room for depth cues when occlusion or lighting hampers perception. To address these limitations, we explore multimodal feedback that combines OST-AR with wrist-based vibrotactile haptics. The past two years have seen rapid advances in haptic technology. Researchers have investigated skin-stretch and vibrotactile cues for conveying spatial information to blind users, wearable ring actuators that support precise pinching in AR, cross-modal audio-haptic cursors that enable eyes-free object selection, and wrist-worn feedback for teleoperated surgery that improves force awareness at the cost of longer task times. Studies comparing pull versus push vibrotactile metaphors found that pull cues yield faster gesture completion and lower cognitive load. These findings motivate revisiting OST-AR guidance with a fresh perspective on wrist-based haptics. We design a custom wristband with six vibromotors delivering directional and state cues, integrate it with a handheld tool and OST-AR, and assess its impact on cue recognition and depth guidance. Through a formative study and two experiments (N=21 and N=27), we show that participants accurately identify haptic patterns under cognitive load and that multimodal feedback improves spatial precision and usability compared with visual-only or haptic-only conditions.
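
As an illustration of the pull versus push metaphors mentioned above, the following is a minimal sketch of how a positional error could be turned into a brief directional cue or a state cue. It is not the authors' implementation: the motor layout (one motor per signed axis), the names `MOTORS`, `Cue`, and `select_cue`, and the tolerance and pulse-length values are all assumptions made for this example. The abstract states only that the wristband has six vibromotors delivering directional and state cues.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# ASSUMPTION: one motor per signed axis. The source does not describe the
# actual placement of the six vibromotors on the wristband.
MOTORS = {
    ("x", 1): "right", ("x", -1): "left",
    ("y", 1): "up", ("y", -1): "down",
    ("z", 1): "forward", ("z", -1): "backward",
}


@dataclass(frozen=True)
class Cue:
    motor: Optional[str]  # None means "no directional motor" (state cue)
    duration_ms: int      # brief pulse length (hypothetical value)
    on_target: bool       # True when the tool is within tolerance


def select_cue(
    error_mm: Tuple[float, float, float],
    metaphor: str = "pull",
    tolerance_mm: float = 2.0,  # hypothetical threshold
    pulse_ms: int = 150,        # hypothetical pulse length
) -> Cue:
    """Pick a cue from error_mm = target position minus tool-tip position."""
    if metaphor not in ("pull", "push"):
        raise ValueError("metaphor must be 'pull' or 'push'")

    axes = ("x", "y", "z")
    # Cue only the axis with the largest remaining error.
    idx = max(range(3), key=lambda i: abs(error_mm[i]))
    if abs(error_mm[idx]) <= tolerance_mm:
        return Cue(motor=None, duration_ms=pulse_ms, on_target=True)

    sign = 1 if error_mm[idx] > 0 else -1
    # Pull: vibrate on the side the hand should move toward.
    # Push: vibrate on the opposite side, as if pushing the hand away.
    if metaphor == "push":
        sign = -sign
    return Cue(motor=MOTORS[(axes[idx], sign)], duration_ms=pulse_ms, on_target=False)
```

With this sketch, a tool tip that is 6 mm short of the target along the depth axis triggers the "forward" motor under the pull metaphor and the "backward" motor under the push metaphor. Once every axis is within tolerance, only the state cue is issued.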

Paper Structure

This paper contains 32 sections, 6 figures.

Figures (6)

  • Figure 1: Cue identification accuracy by type (mean $\pm$ SD). Vertical cues underperform horizontal ones; the state cue is near ceiling [Weber2008VibrotactileGuidance, Smith2023PullPush].
  • Figure 2: Response time by cue type (means). No reliable differences across cue classes; participants respond uniformly after learning the mapping.
  • Figure 3: Alignment error by condition (mean $\pm$ SD). Multimodal significantly reduces depth error vs. single-modality baselines [Yang2020DepthPerception].
  • Figure 4: Completion time by condition. Multimodal introduces a small time premium consistent with increased confirmation checks [Wang2025Teleoperation].
  • Figure 5: System Usability Scale (SUS) by condition. Multimodal is rated most usable.
  • ...and 1 more figure
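
For readers unfamiliar with the System Usability Scale reported in Figure 5, the sketch below shows the standard SUS scoring rule: ten items rated 1 to 5, with odd-numbered items contributing (rating - 1) and even-numbered items contributing (5 - rating), and the sum scaled by 2.5 to a 0-100 range. The function name `sus_score` is chosen for this example, and the source does not describe how the authors computed their scores.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Return the 0-100 SUS score for one participant's ten item ratings."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each rating must be an integer from 1 to 5")

    total = 0
    for item, rating in enumerate(responses, start=1):
        if item % 2 == 1:
            total += rating - 1  # odd items are positively worded
        else:
            total += 5 - rating  # even items are negatively worded
    return total * 2.5
```

A participant who answers 3 on every item scores 50.0. Per-condition values such as those in Figure 5 would then be the mean of these per-participant scores.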