Alt4Blind: A User Interface to Simplify Charts Alt-Text Creation

Omar Moured, Shahid Ali Farooqui, Karin Muller, Sharifeh Fadaeijouybari, Thorsten Schwarz, Mohammed Javed, Rainer Stiefelhagen

TL;DR

The paper addresses the difficulty of producing high-quality chart alt-text for blind and visually impaired readers: manually written descriptions are often oversimplified or overcomplicated, while AI-generated ones risk inaccurate or misleading content. It introduces Alt4Blind, a retrieval-based UI that shows authors semantically similar, human-authored chart alt-texts as references while they write their own descriptions. The authors build a dataset of 5,000 chart images with semantically labeled alt-texts and train a CLIP-based image-text retrieval model that ranks visually and textually similar charts and presents the top candidates in the UI. Preliminary interviews indicate that the interface supports both novices and experts; the authors see room for further accessibility gains through caption integration and more extensive evaluation.

Abstract

Alternative Texts (Alt-Text) for chart images are essential for making graphics accessible to people with blindness and visual impairments. Traditionally, Alt-Text is written manually by authors, but it often suffers from oversimplification or overcomplication. Recent trends have seen the use of AI for Alt-Text generation. However, existing models are susceptible to producing inaccurate or misleading information. We address this challenge by retrieving high-quality alt-texts from similar chart images, which serve as a reference for the user when creating alt-texts. Our three contributions are as follows: (1) We introduce a new benchmark comprising 5,000 real images with semantically labeled, high-quality Alt-Texts, collected from Human-Computer Interaction venues. (2) We develop a deep learning-based model to rank and retrieve similar chart images that share the same visual and textual semantics. (3) We design a user interface (UI) to facilitate the alt-text creation process. Our preliminary interviews and investigations highlight the usability of our UI. For the dataset and further details, please refer to our project page: https://moured.github.io/alt4blind/.

Paper Structure (18 sections, 2 figures)


Figures (2)

  • Figure 1: Alt4Blind UI: (1) Menu bar offering access to guidelines and a tutorial. (2) Space for the uploaded image, with a function bar (zoom, move, fit). (3) Text field for user input, accompanied by a button to update the retrieved images. (4) Charts retrieved based on the uploaded image; the retrieval can be further refined with a text query.
  • Figure 2: Our retrieval system leverages both the text and image encoder modules of the fine-tuned CLIP model. This ensures similarity at both visual and contextual levels.
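
The retrieval step described in Figure 2 can be illustrated with a minimal sketch. It assumes that each chart in the collection already has an image embedding and an alt-text embedding (in the real system these would come from the fine-tuned CLIP encoders; here they are hand-written toy vectors). The function name `rank_charts`, the weight `alpha`, and the weighted-sum fusion of image and text similarity are illustrative assumptions, not the authors' stated method.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def rank_charts(query_image_emb, gallery, query_text_emb=None, alpha=0.5, top_k=3):
    """Return the ids of the top_k most similar charts.

    gallery: list of (chart_id, image_emb, text_emb) tuples.
    alpha:   hypothetical weight of the text similarity when a text query
             is given (assumption: simple weighted sum of both scores).
    """
    scored = []
    for chart_id, image_emb, text_emb in gallery:
        score = cosine(query_image_emb, image_emb)
        if query_text_emb is not None:
            text_score = cosine(query_text_emb, text_emb)
            score = (1 - alpha) * score + alpha * text_score
        scored.append((score, chart_id))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chart_id for _, chart_id in scored[:top_k]]


# Toy gallery: embeddings are made up purely for illustration.
gallery = [
    ("bar_chart", [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]),
    ("line_chart", [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]),
    ("pie_chart", [0.7, 0.7, 0.0], [0.0, 0.0, 1.0]),
]

image_only = rank_charts([1.0, 0.0, 0.0], gallery, top_k=2)
with_text = rank_charts([1.0, 0.0, 0.0], gallery,
                        query_text_emb=[1.0, 0.0, 0.0], alpha=0.8, top_k=1)
```

With the image alone, the ranking follows visual similarity; adding a text query with a high `alpha` shifts the ranking toward charts whose alt-text matches the query, which mirrors the "update retrieved image" behavior described for the UI in Figure 1.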