Investigating Use Cases of AI-Powered Scene Description Applications for Blind and Low Vision People
Ricardo Gonzalez, Jazmin Collins, Shiri Azenkot, Cynthia Bennett
TL;DR
This study investigates how blind and low vision (BLV) people use AI-powered scene description apps, in contrast to the remote human assistance examined in prior work. Through a two-week diary study with 16 BLV participants, the authors identify use cases, goals, content types, and contexts, and quantify trust in, satisfaction with, and accuracy of AI-generated descriptions. The findings reveal both AI-specific challenges and general visual challenges faced by BLV people, showing that accuracy strongly influences trust and satisfaction, while users often rely on their prior knowledge to interpret imperfect outputs. The work highlights design opportunities to tailor AI outputs to user contexts, distinguish AI-enabled use from human assistance, and guide future improvements in AI-driven accessibility tools. The results contribute a detailed use-case taxonomy and practical guidance for building more reliable, user-aligned AI scene description systems for BLV users, especially as AI capabilities continue to evolve.
Abstract
"Scene description" applications that describe visual content in a photo are useful daily tools for blind and low vision (BLV) people. Researchers have studied their use, but they have only explored those that leverage remote sighted assistants; little is known about applications that use AI to generate their descriptions. Thus, to investigate their use cases, we conducted a two-week diary study where 16 BLV participants used an AI-powered scene description application we designed. Through their diary entries and follow-up interviews, users shared their information goals and assessments of the visual descriptions they received. We analyzed the entries and found frequent use cases, such as identifying visual features of known objects, and surprising ones, such as avoiding contact with dangerous objects. We also found users scored the descriptions relatively low on average, 2.76 out of 5 (SD=1.49) for satisfaction and 2.43 out of 4 (SD=1.16) for trust, showing that descriptions still need significant improvements to deliver satisfying and trustworthy experiences. We discuss future opportunities for AI as it becomes a more powerful accessibility tool for BLV users.
