Helping Blind People Grasp: Enhancing a Tactile Bracelet with an Automated Hand Navigation System
Marcin Furtak, Florian Pätzold, Tim Kietzmann, Silke M. Kärcher, Peter König
TL;DR
This work presents an AI-driven automated hand navigation system (HANS) that augments a tactile bracelet to help visually impaired users grasp target objects without an external operator. Using two YOLOv5 detectors (one for objects, one for hands), a StrongSORT tracker, and monocular depth estimation, the system translates visual cues into vibration-based guidance, enabling autonomous navigation in cluttered environments and around obstacles. Across grasping, multi-object tracking, and depth-navigation tasks, the approach achieves robust object localization, high success rates (notably 75% overall with expert users), and positive subjective feedback from both blindfolded and blind participants, including a real-world cafe test. The results suggest a viable path toward independent daily use. The identified bottlenecks are speed and hardware scalability, and the clear next steps are software and hardware optimizations and a natural language interface for target specification.
Abstract
Grasping constitutes a critical challenge for visually impaired people. To address this problem, we developed a tactile bracelet that assists in grasping by guiding the user's hand to a target object using vibration commands. Here we present the fully automated system built around the bracelet, which can confidently detect and track target and distractor objects and reliably guide the user's hand. We validate our approach in three tasks that resemble complex, everyday use cases. In the grasping task, participants grasp varying target objects on a table, guided by the automated hand navigation system. In the multiple objects task, participants grasp objects from the same class, demonstrating our system's ability to track one specific object without targeting surrounding distractor objects. Finally, in the depth navigation task, participants grasp one specific target object while avoiding an obstacle along the way, showcasing the potential of our system's depth estimates for navigating even complex scenarios. Additionally, we demonstrate that the system can aid users in the real world by testing it in a less structured environment with a blind participant. Overall, our results demonstrate that the system, by translating AI-processed visual input into a low-data-rate stream of actionable signals, enables autonomous behavior in everyday environments, thus potentially increasing the quality of life of visually impaired people.
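
To illustrate what "translating visual input into a low-data-rate stream of actionable signals" could look like, the following is a minimal Python sketch that reduces two bounding boxes (hand and target) to a single directional command per frame. It is an assumption-laden illustration, not the authors' implementation: the `Box` type, the `guidance_command` function, the command names, and the pixel `tolerance` are all hypothetical, and the detection, tracking, and depth-estimation steps are assumed to have already produced the boxes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned bounding box in image pixels.

    Image convention assumed here: x grows to the right, y grows downward.
    """
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def center(self):
        # Midpoint of the box along each axis.
        return ((self.x_min + self.x_max) / 2, (self.y_min + self.y_max) / 2)


def guidance_command(hand: Box, target: Box, tolerance: float = 20.0) -> str:
    """Reduce the hand/target geometry to one of five illustrative commands.

    The tolerance (in pixels) is an assumed value: once the hand center is
    within it on both axes, the hand is treated as aligned with the target.
    """
    hand_x, hand_y = hand.center
    target_x, target_y = target.center
    dx = target_x - hand_x  # positive: target is to the right of the hand
    dy = target_y - hand_y  # positive: target is below the hand in the image

    if abs(dx) <= tolerance and abs(dy) <= tolerance:
        return "forward"  # aligned: move toward the object to grasp it

    # Correct the larger offset first, so only one signal is sent per frame.
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "down" if dy > 0 else "up"
```

In this sketch each camera frame collapses to one of five symbols, which is the sense in which a rich visual scene becomes a signal simple enough to convey through vibration. A real system would additionally need to handle missing detections and obstacle avoidance based on depth, which are omitted here.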
