FetchAid: Making Parcel Lockers More Accessible to Blind and Low Vision People With Deep-learning Enhanced Touchscreen Guidance, Error-Recovery Mechanism, and AR-based Search Support

Zhitong Guan, Zeyu Xiong, Mingming Fan

TL;DR

FetchAid tackles BLV access barriers in parcel locker use by integrating on-device object detection for touchscreen guidance, OCR-based door localization, and AR-based navigation to the open compartment. The two-phase system delivers real-time, context-aware voice feedback and safety cues, supported by an error-recovery mechanism that helps users recover from incorrect screen actions. Technical evaluations show strong object-detection robustness with data augmentation, high OCR accuracy across angles, and reliable AR navigation, while a user study with 12 BLV participants demonstrates significant improvements in task success and reductions in perceived workload. The work advances accessible interaction design for public touchscreen devices and suggests pathways for generalizing to other machines and open-source development. It provides a practical, cost-effective approach to reduce frustration and increase efficiency for BLV users in everyday parcel retrieval tasks.

Abstract

Parcel lockers have become an increasingly prevalent last-mile delivery method. Yet a recent study revealed the accessibility challenges they pose for blind and low-vision (BLV) people. Informed by that study, we designed FetchAid, a standalone intelligent mobile app that assists BLV users in operating a parcel locker in real time by integrating computer vision and augmented reality (AR) technologies. FetchAid first uses a deep network to detect the user's fingertip and the relevant buttons on the parcel locker's touchscreen, guiding the user to reveal and scan the QR code that opens the target compartment door. It then guides the user to reach the door safely with AR-based, context-aware audio feedback. Moreover, FetchAid provides an error-recovery mechanism and real-time feedback to keep the user on track. In a study with 12 BLV participants, FetchAid substantially improved task accomplishment and efficiency and reduced frustration and overall effort, regardless of participants' vision conditions and previous experience.

Paper Structure

This paper contains 44 sections, 11 figures, 5 tables.

Figures (11)

  • Figure 1: A standard HiveBox parcel locker in China. Each parcel locker is modular and consists of several cabinets to the left and right of the central touchscreen interface. Each cabinet consists of two columns, and each column contains 11 compartments. The dimensions of each compartment are standardized.
  • Figure 2: Parcel lockers worldwide.
  • Figure 3: Task flow of the HiveBox parcel locker touchscreen. The user starts on the Homepage. To fetch a package from a HiveBox, the user first taps the "Fetch Package" button on the Homepage, which reveals the QR Code Page. The user then scans the QR code to unlock the door. If the user taps the wrong button, the QR code is not revealed; a Distraction Page appears instead (e.g., an advertisement and a back button), and the user must press the "Back" button to return to the previous page. After the QR code is scanned for user authentication, the compartment location is shown on the Door Open Page.
  • Figure 4: FetchAid system overview: The system assists BLV in two phases, Touchscreen Interaction and Open Door Searching. In the Touchscreen Interaction Phase, FetchAid detects the overview page, tracks the user's fingertip (bounded by the orange-label box), and guides the user to tap "Fetch Package" (bounded by the violet-label box). If the user accidentally taps "Deliver Package" (blue box), FetchAid directs them to the "Back" button (cyan box). After tapping "Fetch Package," a QR code (red box) appears, is scanned, and a target door opens, initiating the Open Door Searching Phase. FetchAid then provides real-time voice feedback based on the user's estimated position from ARKit, guiding them to the open door.
  • Figure 5: The object detection model in FetchAid for identifying graphical elements in the UI. An EfficientNet-V2-based network outputs the bounding boxes of the important graphical elements in the UI and of the user's pointing finger; these are used to generate voice feedback instructing the user to tap the desired button.
  • ...and 6 more figures
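
The touchscreen task flow described in the Figure 3 caption, together with the error-recovery path in Figure 4, can be read as a small state machine. The sketch below is only an illustration of that reading, not the authors' implementation. The action names (`tap_fetch_package`, `tap_other_button`, `tap_back`, `scan_qr`), the `NEXT_TARGET` table, and the assumption that the Distraction Page is reached only from the Homepage are all hypothetical.

```python
from enum import Enum, auto


class Page(Enum):
    """Touchscreen pages named in the Figure 3 caption."""
    HOME = auto()
    QR_CODE = auto()
    DISTRACTION = auto()
    DOOR_OPEN = auto()


# Hypothetical action names; the captions name only the buttons and the QR scan.
# Assumption: the Distraction Page is reached from the Homepage, so "Back"
# returns there.
TRANSITIONS = {
    (Page.HOME, "tap_fetch_package"): Page.QR_CODE,
    (Page.HOME, "tap_other_button"): Page.DISTRACTION,
    (Page.DISTRACTION, "tap_back"): Page.HOME,
    (Page.QR_CODE, "scan_qr"): Page.DOOR_OPEN,
}

# Illustrative mapping from the current page to the element the user should be
# guided toward next (None once the door is open and the touchscreen phase ends).
NEXT_TARGET = {
    Page.HOME: "Fetch Package",
    Page.DISTRACTION: "Back",
    Page.QR_CODE: "QR code",
    Page.DOOR_OPEN: None,
}


def step(page, action):
    """Return the next page; actions with no listed transition keep the page."""
    return TRANSITIONS.get((page, action), page)


def run(actions, start=Page.HOME):
    """Apply a sequence of actions and return the final page."""
    page = start
    for action in actions:
        page = step(page, action)
    return page
```

For example, `run(["tap_other_button", "tap_back", "tap_fetch_package", "scan_qr"])` ends on `Page.DOOR_OPEN`: a wrong tap leads to the Distraction Page, where the next guidance target is "Back", after which the normal flow resumes.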