A framework for improving the accessibility of research papers on arXiv.org
Shamsi Brinn, Christopher Cameron, David Fielding, Charles Frankston, Alison Fromme, Peter Huang, Mark Nazzaro, Stephanie Orphan, Steinn Sigurdsson, Ryan Tay, Miranda Yang, Qianyu Zhou
TL;DR
The paper argues that open access alone is insufficient when readers with disabilities cannot access the content. It uses a mixed-methods study (survey and interviews) to diagnose barriers, finding that PDFs are the major impediment and that well-structured HTML offers broad accessibility and machine-readability benefits. The proposed solution is to publish well-formatted HTML alongside the existing PDF and TeX sources, embedding ARIA semantics and integrating author-contributed accessibility details into the submission workflow. This work outlines a feasible, albeit challenging, path for arXiv to advance inclusive open science by delivering HTML through its publishing pipeline.
Abstract
The research content hosted by arXiv is not fully accessible to everyone due to disabilities and other barriers. This matters for three reasons: a significant proportion of people have reading and visual disabilities; it is important to our community that arXiv is as open as possible; and, if science is to advance, we need wide and diverse participation. In addition, we have mandates to become accessible, and accessible content benefits everyone. In this paper, we describe the accessibility problems with research, review current mitigations (and explain why they aren't sufficient), and share the results of our user research with scientists and accessibility experts. Finally, we present arXiv's proposed next step towards more open science: offering HTML alongside existing PDF and TeX formats. An accessible HTML version of this paper is also available at https://info.arxiv.org/about/accessibility_research_report.html
