Audio Description Customization

Rosiana Natalie, Ruei-Che Chang, Smitha Sheshadri, Anhong Guo, Kotaro Hara

TL;DR

This work addresses the limitation of fixed audio descriptions by exploring end-user customization for BLV viewers. It employs a formative interview study to identify desirable customization properties and then builds CustomAD, a high-fidelity prototype enabling content (length, emphasis) and presentation (speed, voice, tone, gender, syntax) customization. An evaluation with BLV participants shows that customization improves information-seeking accuracy, video understanding, and immersion, albeit with longer task times and manageable cognitive load; usability is rated as excellent (SUS = 84.23). The findings underscore the value of AD customization and outline practical pathways for integration into video players and platforms, as well as automated and collaborative approaches to generating multiple AD variants for diverse contexts.
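
For readers unfamiliar with the usability figure quoted above, the following is a minimal sketch of the standard System Usability Scale (SUS) scoring rule: ten items rated 1 to 5, where odd-numbered items contribute (rating - 1), even-numbered items contribute (5 - rating), and the sum is multiplied by 2.5 to give a 0 to 100 score. The function name `sus_score` and the sample responses are hypothetical illustrations; the study's per-participant responses are not reproduced here, and this is not the authors' analysis code.

```python
from statistics import mean


def sus_score(responses):
    """Return the SUS score (0-100) for one participant's ten 1-5 ratings."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten item responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: higher rating is better.
            total += rating - 1
        else:
            # Even items are negatively worded: lower rating is better.
            total += 5 - rating
    return total * 2.5


# Hypothetical responses from two participants (not data from the paper).
participants = [
    [5, 2, 4, 1, 5, 2, 4, 2, 5, 1],
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
]
scores = [sus_score(p) for p in participants]
print(scores, mean(scores))
```

A study-level figure such as the one reported is typically the mean of the per-participant scores, as in the last line of the sketch.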

Abstract

Blind and low-vision (BLV) people use audio descriptions (ADs) to access videos. However, current ADs cannot be altered by end users and thus cannot support BLV individuals' potentially diverse needs and preferences. This research investigates whether customizing ADs could improve how BLV individuals consume videos. We conducted an interview study (Study 1) with fifteen BLV participants, which revealed desires for customizing properties like length, emphasis, speed, voice, format, tone, and language. At the same time, concerns emerged about interruptions and the increased interaction load that customization could bring. To examine AD customization's effectiveness and tradeoffs, we designed CustomAD, a prototype that enables BLV users to customize AD content and presentation. An evaluation study (Study 2) with twelve BLV participants showed that using CustomAD significantly enhanced BLV people's video understanding, immersion, and information navigation efficiency. Our work illustrates the importance of AD customization and offers a design that enhances video accessibility for BLV individuals.

Paper Structure (44 sections, 7 figures, 3 tables)

Figures (7)

  • Figure 1: Videos and their ADs used in Study 1. Each row represents a video. Video types are: music, instructional, entertainment, campaign, explainer, advertisement, and documentary. Each participant watched all seven videos to increase their awareness of different AD contents and styles.
  • Figure 2: Summary of participants' Likert scale questionnaire responses on their preferences for customization properties. Participants rated how much they agreed that each customization property could help them consume ADs more effectively.
  • Figure 3: The CustomAD interface consists of a video player (left), which allows users to play, pause, and seek the video, and a customization pane (right), where users can customize the properties of ADs. The customization properties are grouped into content settings and presentation settings. Content customization adjusts the script's content as users change the ADs' length and emphasis. In presentation customization, users can adjust the speed, voice, tone, gender, and grammatical syntax of the ADs to change how the ADs are read out. Users can also toggle the ADs on and off.
  • Figure 4: Videos used in Study 2. We used six videos of three types: entertainment, explainer, and tutorial. Each row represents a video.
  • Figure 5: Average correctness scores for completing information seeking tasks for different video types (entertainment, explainer, tutorial) and interface conditions (without-customization and with-customization). The vertical line on each bar represents standard deviation.
  • ...and 2 more figures