Supporting Multimodal Data Interaction on Refreshable Tactile Displays: An Architecture to Combine Touch and Conversational AI

Samuel Reinders, Munazza Zaib, Matthew Butler, Bongshin Lee, Ingrid Zukerman, Lizhen Qu, Kim Marriott

TL;DR

The paper tackles accessibility barriers in data visualization for people who are blind or have low vision (BLV) by integrating a refreshable tactile display (RTD) with a conversational AI agent. It presents a three-component architecture (Hardware Devices, Interaction Manager, and Conversational Agent) whose components coordinate via MQTT to support multimodal data interaction, including deictic queries grounded in touch. The authors provide an open-source reference implementation that couples the Dot Pad RTD with external touch sensing, Vega-Lite-to-tactile rendering, and a GPT-4o-based agent, producing synchronized tactile, Braille, and audio feedback. This work lays a technical foundation for multimodal accessible data visualization and opens a path to future user studies, hardware integration, and broader RTD support.

Abstract

Combining conversational AI with refreshable tactile displays (RTDs) offers significant potential for creating accessible data visualization for people who are blind or have low vision (BLV). To support researchers and developers building accessible data visualizations with RTDs, we present a multimodal data interaction architecture along with an open-source reference implementation. Our system is the first to combine touch input with a conversational agent on an RTD, enabling deictic queries that fuse touch context with spoken language, such as "what is the trend between these points?" The architecture addresses key technical challenges, including touch sensing on RTDs, visual-to-tactile encoding, integrating touch context with conversational AI, and synchronizing multimodal output. Our contributions are twofold: (1) a technical architecture integrating RTD hardware, external touch sensing, and conversational AI to enable multimodal data interaction; and (2) an open-source reference implementation demonstrating its feasibility. This work provides a technical foundation to support future research in multimodal accessible data visualization.
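To make the idea of a deictic query concrete, the sketch below shows one way touch context could be bundled with a spoken utterance before being handed to a conversational agent. It is only an illustration under stated assumptions: the `TouchPoint` fields, the `build_deictic_message` helper, the `topic_for` helper, and the topic naming scheme are all hypothetical and are not taken from the paper's reference implementation.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class TouchPoint:
    """Hypothetical record of one finger resting on the tactile display."""
    finger: str   # e.g. "left_index" (assumed naming)
    x: float      # chart x value under the finger (assumed already resolved)
    y: float      # chart y value under the finger
    label: str    # human-readable description of the touched mark


def build_deictic_message(utterance: str, touches: List[TouchPoint]) -> str:
    """Fuse a spoken utterance with the current touch context into one JSON payload."""
    payload = {
        "utterance": utterance,
        "touch_context": [asdict(t) for t in touches],
    }
    return json.dumps(payload)


def topic_for(component: str, event: str) -> str:
    """Build a slash-separated topic string (the naming scheme is an assumption)."""
    return f"rtd/{component}/{event}"


if __name__ == "__main__":
    touches = [
        TouchPoint("left_index", 2019, 4.2, "data point for 2019"),
        TouchPoint("right_index", 2023, 6.8, "data point for 2023"),
    ]
    message = build_deictic_message("what is the trend between these points?", touches)
    # In a full system this string would be published to a message broker;
    # here it is only printed alongside its hypothetical topic.
    print(topic_for("interaction_manager", "query"), message)
```

Keeping the touch context as structured data alongside the utterance lets the agent resolve "these points" without re-deriving which marks were touched; how the actual implementation encodes this is not specified in the abstract.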

Paper Structure (12 sections, 2 figures)

Figures (2)

  • Figure 1: The multimodal data interaction architecture, consisting of the three interconnected components--Hardware Devices, Interaction Manager, and Conversational Agent--that coordinate via MQTT.
  • Figure 2: Custom mounting positioning the LMC 20 cm above the Dot Pad RTD at a 35$^\circ$ downward angle for stable tracking.