The Midas Touch in Gaze vs. Hand Pointing: Modality-Specific Failure Modes and Implications for XR Interfaces

Mohammad Dastgheib; Fatemeh Pourmahdian

Abstract

Extended Reality (XR) interfaces impose both ergonomic and cognitive demands, yet current systems often force a binary choice between hand-based input, which can produce fatigue, and gaze-based input, which is vulnerable to the Midas Touch problem and precision limitations. We introduce the xr-adaptive-modality-2025 platform, a web-based open-source framework for studying whether modality-specific adaptive interventions can improve XR-relevant pointing performance and reduce workload relative to static unimodal interaction. The platform combines physiologically informed gaze simulation, an ISO 9241-9 multidirectional tapping task, and two modality-specific adaptive interventions: gaze declutter and hand target-width inflation. We evaluated the system in a 2 x 2 x 2 within-subjects design manipulating Modality (Hand vs. Gaze), UI Mode (Static vs. Adaptive), and Pressure (Yes vs. No). Results from N=69 participants show that hand yielded higher throughput than gaze (5.17 vs. 4.73 bits/s), lower error (1.8% vs. 19.1%), and lower NASA-TLX workload. Crucially, error profiles differed sharply by modality: gaze errors were predominantly slips (99.2%), whereas hand errors were predominantly misses (95.7%), consistent with the Midas Touch account. Of the two adaptive interventions, only gaze declutter executed in this dataset; it modestly reduced timeouts but not slips. Hand width inflation was not evaluable due to a UI integration bug. These findings reveal modality-specific failure modes with direct implications for adaptive policy design, and establish the platform as a reproducible infrastructure for future studies.

Paper Structure (39 sections, 10 figures)

Introduction
Background and Related Work
Sensorimotor Implications of Hand and Gaze Input
Signal Processing and Cognitive Load Framing
Adaptive Intervention Mechanisms
Differentiation from Prior Gaze+Hand Work
Methods
Apparatus
Display Calibration and Reliability Measures
Gaze Simulation
Input Modality Implementations
Task and Stimuli
Experimental Design
Counterbalancing: The Williams Design
Participants
...and 24 more sections

Figures (10)

Figure 1: Psychophysically-grounded gaze proxy pipeline. Panel A (left): The three-stage simulation: raw mouse input is differentiated to estimate angular velocity; velocity above 120 deg/s triggers saccadic suppression (cursor freezes, then jumps to new position); velocity below 30 deg/s triggers the fixation transform (Gaussian jitter, $\sigma \approx 0.5$ mm, plus first-order lag smoothing). Panel B (center): Output signal examples---fixation mode (B1) produces continuous spatial noise consistent with microsaccade statistics; saccade mode (B2) produces a cursor freeze and ballistic jump, reproducing the perceptual blind phase of real saccades. Panel C (right): The selection model, with a dwell-confirm tolerance ring (target radius + 10 px) that accommodates fixation jitter while keeping sensitivity well-defined.
Figure 2: Task UI overview. ISO 9241-9 multi-directional tapping task: participants select highlighted targets on a central canvas. A side HUD shows modality and block-level feedback.
Figure 3: Primary performance by modality. (A) Throughput (bits/s): hand produced higher throughput than gaze. (B) Error rate (%): gaze showed substantially higher error rate than hand. Error bars show 95% CI.
Figure 4: Error type composition by modality. Gaze errors are predominantly slips (accidental activations); hand errors are predominantly misses.
Figure 5: NASA-TLX overall workload (0--100) by modality. Gaze imposed higher subjective workload than hand. Error bars show 95% CI.
...and 5 more figures
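
The velocity thresholds and tolerance ring described in the Figure 1 caption can be illustrated with a minimal Python sketch. This is not the platform's code: the function names, the `alpha` smoothing coefficient, the handling of velocities between 30 and 120 deg/s, and the order in which lag and jitter are applied are all assumptions made for illustration. Only the 120 deg/s and 30 deg/s thresholds and the 10 px ring come from the caption.

```python
import math
import random

# Values taken from the Figure 1 caption.
SACCADE_MIN_DEG_S = 120.0   # above this: saccadic suppression
FIXATION_MAX_DEG_S = 30.0   # below this: fixation transform
DWELL_TOLERANCE_PX = 10.0   # ring = target radius + 10 px


def classify_velocity(deg_per_s: float) -> str:
    """Map an angular velocity estimate to a simulation mode.

    The caption does not say what happens between the two thresholds,
    so "intermediate" is a hypothetical placeholder label.
    """
    if deg_per_s > SACCADE_MIN_DEG_S:
        return "saccade"
    if deg_per_s < FIXATION_MAX_DEG_S:
        return "fixation"
    return "intermediate"


def fixation_transform(prev, raw, sigma, alpha, rng):
    """First-order lag toward the raw position, then Gaussian jitter.

    `sigma` is in the same units as the coordinates; `alpha` in (0, 1]
    is a hypothetical smoothing coefficient. Applying the lag before
    the jitter is an assumption, not something the caption specifies.
    """
    lagged_x = prev[0] + alpha * (raw[0] - prev[0])
    lagged_y = prev[1] + alpha * (raw[1] - prev[1])
    return (lagged_x + rng.gauss(0.0, sigma),
            lagged_y + rng.gauss(0.0, sigma))


def within_dwell_ring(cursor, target_center, target_radius_px):
    """True if the cursor lies inside the target radius + 10 px ring."""
    distance = math.hypot(cursor[0] - target_center[0],
                          cursor[1] - target_center[1])
    return distance <= target_radius_px + DWELL_TOLERANCE_PX
```

A caller would classify each velocity sample, apply `fixation_transform` only in fixation mode, and test `within_dwell_ring` while accumulating dwell time. Passing a seeded `random.Random` instance as `rng` keeps the jitter reproducible across runs.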
