Understanding How Blind Users Handle Object Recognition Errors: Strategies and Challenges

Jonggi Hong, Hernisa Kacorri

TL;DR

This work investigates how blind and low-vision users handle errors in camera-based object recognition. Using URCam, a fine-tuned teachable recognizer, the study combines remote interviews and a hands-on error-identification task with 12 participants to examine their strategies, confidence, and time spent when dealing with misrecognitions. Key findings show that participants independently review photo quality, frequently rely on contextual cues, and identify roughly half of the errors; repeating the task did not significantly increase the proportion of errors identified. The results offer design implications for accessible interfaces that support error awareness and mitigation in object recognition systems, aiming to close the gap between benchmark accuracy and real-world usability.

Abstract

Object recognition technologies hold the potential to support blind and low-vision people in navigating the world around them. However, the gap between benchmark performances and practical usability remains a significant challenge. This paper presents a study aimed at understanding blind users' interaction with object recognition systems for identifying and avoiding errors. Leveraging a pre-existing object recognition system, URCam, fine-tuned for our experiment, we conducted a user study involving 12 blind and low-vision participants. Through in-depth interviews and hands-on error identification tasks, we gained insights into users' experiences, challenges, and strategies for identifying errors in camera-based assistive technologies and object recognition systems. During interviews, many participants preferred independent error review, while expressing apprehension toward misrecognitions. In the error identification task, participants varied viewpoints, backgrounds, and object sizes in their images to avoid and overcome errors. Even after repeating the task, participants identified only half of the errors, and the proportion of errors identified did not significantly differ from their first attempts. Based on these insights, we offer implications for designing accessible interfaces tailored to the needs of blind and low-vision users in identifying object recognition errors.

Paper Structure (34 sections, 8 figures, 3 tables)

Figures (8)

  • Figure 1: Object stimuli in our study, from Kacorri et al. [kacorri2017people]: baking soda, caramel coffee, Cheetos, chewy bars, chicken broth, Coca-Cola, diced tomatoes, Diet Coke, dill, Fritos, LaCroix apricot, LaCroix mango, Lay's, oregano, roast coffee.
  • Figure 2: A series of screenshots from URCam, as deployed in the study, in which participants P1 and P2 experienced correct, incorrect, and uncertain predictions, the last communicated via a "Don't know" message.
  • Figure 3: Participants' experience in taking photos.
  • Figure 4: Camera-based assistive apps the participants have used regularly.
  • Figure 5: Participants' responses about frequency of encountered errors and verification of the output from the apps.
  • ...and 3 more figures