
Listen to Your Map: An Online Representation for Spatial Sonification

Lan Wu, Craig Jin, Monisha Mushtary Uttsha, Teresa Vidal-Calleja

TL;DR

The paper addresses the challenge of transforming 3D scene geometry into intuitive auditory cues for navigation aids, especially for visually impaired users. It presents an online, sensor-centric mapping framework that combines VDB-GPDF-based distance fields with 2D circular and 3D cylindrical rasterisations and BRIR-based binaural sonification to produce a real-time 360-degree auditory representation. Key contributions include an integrated online SLAM (VINS-RGBD) with a VDB-GPDF map, efficient circular/cylindrical projections that preserve angular and distance information, and a BRIR-driven audio rendering pipeline, along with quantitative and qualitative evaluations in static and dynamic scenarios. The results demonstrate improved accuracy, coverage, and timing for sonification-ready maps, highlighting the framework’s potential to enhance spatial awareness in real-world navigation tasks.

Abstract

Robotic perception is becoming a key technology for navigation aids, particularly for helping individuals with visual impairments through spatial sonification. This paper introduces a mapping representation that accurately captures scene geometry for sonification, turning physical spaces into auditory experiences. Using depth sensors, we encode an incrementally built 3D scene into a compact 360-degree representation with angular and distance information, which aligns with human auditory spatial perception. The proposed framework performs localisation and mapping via VDB-Gaussian Process Distance Fields for efficient online scene reconstruction. The key aspect is a sensor-centric structure that maintains either a 2D-circular or 3D-cylindrical raster-based projection. This spatial representation is then converted into binaural auditory signals using simple pre-recorded responses from a representative room. Quantitative and qualitative evaluations show improvements in accuracy, coverage, timing and suitability for sonification compared to other approaches, and the framework also handles dynamic objects effectively. An accompanying video demonstrates spatial sonification in room-like environments: https://tinyurl.com/ListenToYourMap
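
To make the sensor-centric projection concrete, here is a minimal sketch of a circular and a cylindrical raster. It is an illustration under simplifying assumptions, not the authors' implementation: it bins raw points already expressed in the sensor frame and keeps the closest horizontal range per azimuth bin, whereas the paper obtains distances from a VDB-GPDF map. The function names, bin counts, height limits and `max_range` are all hypothetical.

```python
import math


def circular_raster(points, num_bins=36, max_range=5.0):
    """Keep the closest horizontal range per azimuth bin (hypothetical helper).

    points: iterable of (x, y, z) tuples in the sensor frame, in metres.
    Bins with no point inside max_range keep the value max_range.
    """
    ranges = [max_range] * num_bins
    bin_width = 2.0 * math.pi / num_bins
    for x, y, _z in points:
        dist = math.hypot(x, y)
        if dist == 0.0 or dist >= max_range:
            continue  # skip the sensor origin and anything out of range
        azimuth = math.atan2(y, x) % (2.0 * math.pi)  # wrap to [0, 2*pi]
        idx = min(int(azimuth / bin_width), num_bins - 1)
        ranges[idx] = min(ranges[idx], dist)
    return ranges


def cylindrical_raster(points, num_bins=36, num_layers=3,
                       z_min=-1.0, z_max=1.0, max_range=5.0):
    """Stack one circular raster per height layer (hypothetical helper)."""
    layer_height = (z_max - z_min) / num_layers
    layers = [[] for _ in range(num_layers)]
    for point in points:
        z = point[2]
        if z_min <= z < z_max:
            idx = min(int((z - z_min) / layer_height), num_layers - 1)
            layers[idx].append(point)
    return [circular_raster(layer, num_bins, max_range) for layer in layers]


if __name__ == "__main__":
    cloud = [(1.0, 0.0, -0.5), (2.0, 0.0, 0.5), (0.0, 3.0, 0.0)]
    print(circular_raster(cloud, num_bins=4))
    print(cylindrical_raster(cloud, num_bins=4, num_layers=2))
```

Each bin stores one angle-indexed distance, which is the kind of compact, 360-degree quantity a binaural renderer can map to a direction and a loudness or delay. Choosing the minimum range per bin is a design choice of this sketch that favours the nearest obstacle.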

Paper Structure

This paper contains 17 sections, 9 figures.

Figures (9)

  • Figure 1: Our sensor-centric representation for spatial sonification. a) 2D circular representation and b) 3D cylindrical representation for the Cow and Lady dataset. The circle is coloured by the distance to the obstacle along each azimuthal angle. We also show the selected close points on the surface. Similarly, for the cylinder, we illustrate the distance up to a certain height.
  • Figure 2: The sound environment used for the BRIR recordings is shown. The HATS manikin is at the lower right.
  • Figure 3: The efficiency performance of our proposed representation for spatial sonification with respect to different voxel resolutions.
  • Figure 4: The efficiency performance of our proposed representation for spatial sonification with respect to different voxel resolutions.
  • Figure 5: Quantitative comparisons of the accuracy in RMSE on the cow and lady dataset with varying voxel resolutions.
  • ...and 4 more figures