GuideTWSI: A Diverse Tactile Walking Surface Indicator Dataset from Synthetic and Real-World Images for Blind and Low-Vision Navigation

Hochul Hwang, Soowan Yang, Anh N. H. Nguyen, Parth Goel, Krisha Adhikari, Sunghoon I. Lee, Joydeep Biswas, Nicholas A. Giudice, Donghyun Kim

Abstract

Tactile Walking Surface Indicators (TWSIs) are safety-critical landmarks that blind and low-vision (BLV) pedestrians use to locate crossings and hazard zones. Our observation sessions with BLV guide dog handlers, trainers, and an orientation and mobility (O&M) specialist confirmed that reliable, accurate TWSI segmentation is critical for navigation assistance for BLV individuals. Achieving such reliability requires large-scale annotated data. However, TWSIs are severely underrepresented in existing urban perception datasets, and even dedicated tactile paving datasets are limited: they lack robot-relevant viewpoints (e.g., egocentric or top-down) and are geographically biased toward East Asian directional bars, the raised parallel strips used for continuous guidance along sidewalks. This narrow focus overlooks truncated domes, the rows of round bumps used primarily in North America and Europe as detectable warnings at curbs, crossings, and platform edges. As a result, models trained only on bar-centric data struggle to generalize to dome-based warnings, leading to missed detections and false stops in safety-critical environments.

Paper Structure (27 sections, 5 figures, 4 tables)

Figures (5)

  • Figure 1: Objective of the proposed research and tactile walking surface indicator (TWSI). (1) A guide dog stopping at a TWSI and curb, (2) a robot mimicking this stop behavior, and (3) different TWSI types.
  • Figure 2: Photorealistic synthetic tactile paving dataset generation. Our pipeline uses various sidewalk environments in Unreal Engine 4 with varying viewpoints, lighting, and weather conditions to simulate real-world variability. We use AirSim for automatic annotation of depth, instance masks, and bounding boxes. This enables the creation of a high-quality dataset of 15K samples that significantly enhances TWSI segmentation when fused with real-world data.
  • Figure 3: Real robot data collection and hardware experiment. (a) We collected data with a quadruped robot across multiple sites in suburban, campus, and rural environments featuring various truncated domes. Data were gathered at different times of day under diverse truncated dome appearances. (b) We evaluated the robot's reliable stopping at truncated domes in unseen areas. We present the hardware experiment setup and segmentation inference results.
  • Figure 4: Qualitative comparison of segmentation models trained on real data only vs. real + synthetic data. Each row shows a test sample alongside predictions from YOLOv11-seg-N, YOLOv11-seg-X, and DINOv3+EOMT. Highlighted regions in the images show that models trained with synthetic data produce sharper boundaries and fewer missed detections, especially under challenging textures and lighting.
  • Figure 5: Hardware configuration and segmentation visualization. (Top) Hardware configuration with a downward-facing RGB camera mounted at a $70^{\circ}$ angle. (Bottom) Segmentation masks from the fine-tuned model trigger a stop command as the robot walks forward at 0.8 m/s.
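
The Figure 5 caption says that segmentation masks trigger a stop command, but this summary does not state the rule that maps a mask to a stop. The sketch below shows one plausible rule, not the authors' method: stop when TWSI pixels cover a large enough share of the bottom rows of the image, which are the rows nearest the robot for a downward-facing camera. The function name `should_stop` and the parameters `roi_fraction` and `coverage_threshold` are hypothetical, and their default values are placeholders rather than values from the paper.

```python
from typing import Sequence


def should_stop(
    mask: Sequence[Sequence[int]],
    roi_fraction: float = 0.3,        # hypothetical: share of image rows treated as "near"
    coverage_threshold: float = 0.2,  # hypothetical: TWSI share of the ROI that triggers a stop
) -> bool:
    """Return True when TWSI pixels fill enough of the bottom image rows.

    `mask` is a binary segmentation mask given as rows of 0/1 values,
    with row 0 at the top of the image.
    """
    if not mask or not mask[0]:
        return False  # an empty mask gives no evidence of a TWSI

    height = len(mask)
    roi_rows = max(1, int(height * roi_fraction))
    roi = mask[height - roi_rows:]  # bottom rows, closest to the robot

    twsi_pixels = sum(sum(1 for value in row if value) for row in roi)
    total_pixels = sum(len(row) for row in roi)
    return twsi_pixels / total_pixels >= coverage_threshold


# Illustrative 10x10 masks: a TWSI far ahead (top rows) and one at the robot's feet.
far_mask = [[1] * 10 for _ in range(2)] + [[0] * 10 for _ in range(8)]
near_mask = [[0] * 10 for _ in range(7)] + [[1] * 10 for _ in range(3)]

print(should_stop(far_mask))   # False: no TWSI pixels in the bottom rows
print(should_stop(near_mask))  # True: the bottom rows are fully covered
```

A coverage test over a fixed region is only one option. A deployed system would more likely require the condition to hold over several consecutive frames, or project the mask to ground distance using the camera geometry, before commanding a stop.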