ShelfHelp: Empowering Humans to Perform Vision-Independent Manipulation Tasks with a Socially Assistive Robotic Cane

Shivendra Agrawal; Suresh Nayak; Ashutosh Naik; Bradley Hayes

TL;DR

This work tackles independent grocery shopping for people with visual impairments by enabling a robotic cane to locate products on a shelf and verbally guide the user's hand to them, without relying on the user's vision. ShelfHelp introduces a two-stage visual product locator and a manipulation guidance system with two planners (continuous and discrete) that verbally guides users toward grasping items in dense shelf settings. A pilot study with novice users shows the discrete planner matches human performance in guide time and command count, with the continuous planner offering useful affirmations; both improve independence and privacy compared with staff-based assistance. The system operates offline and scales to large product catalogs, addressing privacy and accessibility concerns in real-world grocery environments.

Abstract

The ability to shop independently, especially in grocery stores, is important for maintaining a high quality of life. This can be particularly challenging for people with visual impairments (PVI). Stores carry thousands of products, with approximately 30,000 new products introduced each year in the US market alone, presenting a challenge even for modern computer vision solutions. Through this work, we present a proof-of-concept socially assistive robotic system we call ShelfHelp, and propose novel technical solutions for extending instrumented canes, traditionally meant for navigation tasks, with additional capabilities in the shopping domain. ShelfHelp includes a novel visual product locator algorithm designed for use in grocery stores and a novel planner that autonomously issues verbal manipulation guidance commands to guide the user during product retrieval. Through a human subjects study with novice users, we show the system's success in locating desired products and providing effective manipulation guidance to retrieve them. We compare two autonomous verbal guidance modes that achieve performance comparable to a human assistance baseline, and present encouraging findings that validate our system's efficiency and effectiveness, supported by positive subjective ratings of competence, intelligence, and ease of use.

Paper Structure (26 sections, 1 equation, 13 figures, 1 algorithm)

Introduction
Related Work
Manipulation guidance
Product identification
Grocery assistant systems
System Design
Hardware System
Software System
Alignment
Product Detection
Data Association
Spatial Scoring
Planning
Continuous Planner
Dataset from Human Demonstrations
...and 11 more sections

Figures (13)

Figure 1: ShelfHelp includes a robotic cane equipped with RealSense D455 and T265 cameras. The system is powered through a laptop in a backpack. Left: The system used as a navigational device. It uses audio and haptic feedback for navigation guidance. Right: The system used as a manipulation device. It uses audio for manipulation guidance.
Figure 2: System Diagram. Alignment, perception, planning, and verbal conveyance are executed on a backpack-worn laptop, while all the sensing is mounted on the cane.
Figure 3: Our product search algorithm can reliably locate desired products on a grocery shelf. Regions with a high likelihood of containing any product are proposed in the first stage. The features of these regions are then compared against the target product image. Our data association solution is used to identify whether detections from incoming camera frames are new or re-detections of existing products. The above image shows our algorithm operating within an actual grocery store, where the product classification aspect of this work has been tested and validated. The data association and manipulation assistance components were validated within a lab-based study.
Figure 5: The spatial scoring system clusters all found instances spatially and prefers the cluster closest to the current hand pose. Ties are broken arbitrarily.
Figure 6: (Left to right) A sample of discrete commands; the movement (in meters) each command caused; the MDP and solution definition. We train a model of human hand movement from demonstrations, which informs the transition probabilities $T$. $S$ is the state space, $A$ is the discrete set of verbal actions, and $R$ is the reward function. A policy is learned offline and can be reused across reaching tasks.
...and 8 more figures
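
The spatial scoring step described in the Figure 5 caption can be illustrated with a minimal Python sketch. The caption states only that found instances are clustered spatially and that the cluster closest to the hand is preferred; the clustering rule (points within a fixed radius are linked), the 0.15 m radius, the use of cluster centroids, and all function names below are assumptions made for illustration, not the authors' implementation.

```python
import math

def cluster_points(points, radius):
    """Group 3D points so that any two points within `radius` share a cluster.

    Assumed clustering rule (simple single-linkage); the source does not
    specify which clustering method is used.
    """
    clusters = []
    for p in points:
        near, far = [], []
        for c in clusters:
            # A cluster is "near" if any of its members is within radius of p.
            is_near = any(math.dist(p, q) <= radius for q in c)
            (near if is_near else far).append(c)
        # Merge p with every cluster it touches; keep the rest unchanged.
        merged = [p] + [q for c in near for q in c]
        clusters = far + [merged]
    return clusters

def centroid(cluster):
    """Mean position of a cluster of equally sized coordinate tuples."""
    n = len(cluster)
    return tuple(sum(axis) / n for axis in zip(*cluster))

def pick_target_cluster(detections, hand, radius=0.15):
    """Return the cluster whose centroid is nearest to the hand position.

    `min` keeps the first of several equally distant clusters, which is one
    way to break ties arbitrarily. Returns None if there are no detections.
    """
    clusters = cluster_points(detections, radius)
    if not clusters:
        return None
    return min(clusters, key=lambda c: math.dist(centroid(c), hand))
```

For example, with two groups of detections about one meter apart, a hand position near the second group selects that group, so the user is guided toward the nearest reachable instances of the product.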
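
The MDP formulation in the Figure 6 caption (states $S$, verbal actions $A$, transition probabilities $T$, reward $R$, policy learned offline) can be made concrete with a toy example. The sketch below is hypothetical throughout: the grid discretization, the eight command names, the hand-written noise model (the paper learns $T$ from human demonstrations), the reward of -1 per command, and the choice of value iteration as the solver are all assumptions, since the source does not specify them.

```python
import itertools

N = 4  # assumed grid half-width; state = hand offset from target, in cells
STATES = list(itertools.product(range(-N, N + 1), repeat=2))
GOAL = (0, 0)

# Assumed verbal commands: (unit direction, nominal step size in cells).
ACTIONS = {
    "left": ((-1, 0), 2), "a little left": ((-1, 0), 1),
    "right": ((1, 0), 2), "a little right": ((1, 0), 1),
    "up": ((0, 1), 2), "a little up": ((0, 1), 1),
    "down": ((0, -1), 2), "a little down": ((0, -1), 1),
}
# Assumed noise: the hand under- or overshoots by one cell with prob. 0.15.
NOISE = ((-1, 0.15), (0, 0.70), (1, 0.15))

def clip(v):
    """Keep a coordinate inside the grid."""
    return max(-N, min(N, v))

def transitions(state, action):
    """Return {next_state: probability} for T(state, action)."""
    (ux, uy), size = ACTIONS[action]
    out = {}
    for extra, prob in NOISE:
        step = size + extra
        nxt = (clip(state[0] + ux * step), clip(state[1] + uy * step))
        out[nxt] = out.get(nxt, 0.0) + prob
    return out

def q_value(state, action, values, gamma):
    """Expected return of one command: cost of -1 plus discounted value."""
    return sum(p * (-1.0 + gamma * values[n])
               for n, p in transitions(state, action).items())

def value_iteration(gamma=0.95, tol=1e-6):
    """Solve the toy MDP offline; the goal state is terminal with value 0."""
    values = {s: 0.0 for s in STATES}
    while True:
        delta = 0.0
        for s in STATES:
            if s == GOAL:
                continue
            best = max(q_value(s, a, values, gamma) for a in ACTIONS)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = {s: max(ACTIONS, key=lambda a: q_value(s, a, values, gamma))
              for s in STATES if s != GOAL}
    return values, policy
```

Because the policy is a lookup table over hand offsets relative to the target, it can be computed once and queried at run time for any reaching task, which matches the caption's point that a single offline policy is reused across tasks.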
