MatplotAlt: A Python Library for Adding Alt Text to Matplotlib Figures in Computational Notebooks

Kai Nylund, Jennifer Mankoff, Venkatesh Potluri

TL;DR

MatplotAlt addresses the accessibility gap for Matplotlib figures in computational notebooks by providing heuristic and VLM-based alt-text generation, plus tools to surface descriptions in notebooks and via a CLI. It infers chart data from each figure and recognizes ten figure types, producing configurable L1-L4 captions and optional data tables to support diverse accessibility needs. Evaluation on VisText and the Matplotlib Gallery shows that heuristics can produce human-like long-form captions and that conditioning VLM generations on heuristic text and data tables reduces factual errors, though risks remain. The work offers practical guidelines for deploying accessible notebook content and highlights tradeoffs between accuracy, verbosity, and trust, inviting collaboration to broaden coverage and reliability.

Abstract

We present MatplotAlt, an open-source Python package for easily adding alternative text to Matplotlib figures. MatplotAlt equips Jupyter notebook authors to automatically generate and surface chart descriptions with a single line of code or command, and supports a range of options that allow users to customize the generation and display of captions based on their preferences and accessibility needs. Our evaluation indicates that MatplotAlt's heuristic and LLM-based methods to generate alt text can create accurate long-form descriptions of both simple univariate and complex Matplotlib figures. We find that state-of-the-art LLMs still struggle with factual errors when describing charts, and improve the accuracy of our descriptions by prompting GPT4-turbo with heuristic-based alt text or data tables parsed from the Matplotlib figure.

Paper Structure

This paper contains 30 sections, 2 figures, 2 tables.

Figures (2)

  • Figure 1: The MatplotAlt system. Heuristic-based alt text of a given semantic level is generated from Matplotlib figure attributes using the generate_alt_text function. Alternatively, VLM-based descriptions can be generated with the figure as input using generate_api_alt_text. Descriptions are then embedded in the notebook or exported using add_alt_text.
  • Figure 2: MatplotAlt is designed to generate long-form L3 descriptions for Matplotlib figures. VisText VL-T5 captions were shortest on average, followed by human, heuristic, and GPT4-turbo methods (set to a max output length of 225 tokens). Matplotlib gallery alt texts were generally longer than those for VisText figures.
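
The heuristic path in Figure 1 (building a description of a chosen semantic level from figure attributes) can be illustrated with a minimal sketch. This is not MatplotAlt's implementation or API: the function name `describe_line_chart`, its parameters, and the wording of the output are hypothetical. The mapping of L1 to chart encodings, L2 to summary statistics, and L3 to trends is an assumption about what the levels mean. The sketch takes plain values rather than a Matplotlib figure so that it runs with only the standard library.

```python
from statistics import mean


def describe_line_chart(title, xlabel, ylabel, xs, ys, level=3):
    """Hypothetical helper: build a cumulative description up to `level` (1-3)."""
    if len(xs) != len(ys) or not ys:
        raise ValueError("xs and ys must be non-empty and the same length")
    if level not in (1, 2, 3):
        raise ValueError("level must be 1, 2, or 3")

    # L1 (assumed meaning): chart type, title, and axis encodings.
    parts = [
        f"A line chart titled '{title}'. "
        f"{xlabel} is plotted on the x-axis and {ylabel} on the y-axis."
    ]

    if level >= 2:
        # L2 (assumed meaning): summary statistics such as extrema and the mean.
        i_min = ys.index(min(ys))
        i_max = ys.index(max(ys))
        parts.append(
            f"{ylabel} ranges from a minimum of {ys[i_min]:g} at {xlabel}={xs[i_min]} "
            f"to a maximum of {ys[i_max]:g} at {xlabel}={xs[i_max]}, "
            f"with a mean of {mean(ys):.2f}."
        )

    if level >= 3:
        # L3 (assumed meaning): a trend statement. This only compares the
        # first and last points; it is a deliberately crude heuristic.
        delta = ys[-1] - ys[0]
        if delta > 0:
            direction = "rises"
        elif delta < 0:
            direction = "falls"
        else:
            direction = "ends where it started"
        parts.append(f"From the first to the last point, {ylabel} {direction}.")

    return " ".join(parts)


if __name__ == "__main__":
    print(describe_line_chart("Sales", "Year", "Revenue",
                              [2020, 2021, 2022], [1.0, 3.0, 2.0]))
```

With a real figure, the inputs could be read from the axes using standard Matplotlib accessors such as `ax.get_title()`, `ax.get_xlabel()`, `ax.get_ylabel()`, and `line.get_xdata()` / `line.get_ydata()` for each line in `ax.get_lines()`.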