Towards More Accessible Scientific PDFs for People with Visual Impairments: Step-by-Step PDF Remediation to Improve Tag Accuracy
Felix M. Schmitt-Koopmann, Elaine M. Huang, Hans-Peter Hutter, Alireza Darvishy
TL;DR
This study presents PAVE 2.0, a semi-automatic eight-step PDF remediation pipeline designed to improve tag accuracy for accessible scientific PDFs without requiring deep expert knowledge. In a user study with nineteen participants, PAVE 2.0 substantially outperformed Adobe Acrobat Pro in tag accuracy for both novice and experienced remediators, while also reducing cognitive load through a guided interface and AI-assisted features for regions, reading order, headings, tables, lists, figures, and mathematical formulas. Key innovations include AI-based transcription of formulas to LaTeX paired with a MathSpeak-based alt text generator, and a thirteen-criterion manual tag-accuracy scoring scheme that evaluates structural tagging beyond what standard checkers assess. The results suggest that PAVE 2.0 could broaden access to scientific literature, and the study recommends integrating remediation into publishing workflows and refining the AI-assisted components to balance automation with user verification.
Abstract
PDF inaccessibility is an ongoing challenge that hinders individuals with visual impairments from reading and navigating PDFs using screen readers. This paper presents a step-by-step process for both novice and experienced users to create accessible PDF documents, including an approach for creating alternative text for mathematical formulas without expert knowledge. In a study involving nineteen participants, we evaluated our prototype PAVE 2.0 by comparing it against Adobe Acrobat Pro, the existing standard for remediating PDFs. Our study shows that tagging scores rose from 42.0% with Adobe Acrobat Pro to 80.1% with PAVE 2.0 for experienced users, and from 39.2% to 75.2% for novice users. Overall, fifteen of the nineteen participants stated that they would prefer to use PAVE 2.0 in the future, and all participants would recommend it for novice users. Our work demonstrates PAVE 2.0's potential for increasing PDF accessibility for people with visual impairments and highlights remaining challenges.
