Toward Spatial Temporal Consistency of Joint Visual Tactile Perception in VR Applications

Fuqiang Zhao, Kehan Zhang, Qian Liu, Zhuoyi Lyu

TL;DR

The paper tackles the lack of spatiotemporal alignment between visual textures and vibrotactile signals in VR by introducing a data acquisition system that jointly captures visual texture images and accelerometer-based vibrotactile signals with spatial coordinates. It establishes two mappings—world-to-pixel and taxel-to-world—to achieve pixel-to-taxel alignment, and constructs vibration maps with taxel-level consistency. Validation across seven textures shows that vibration characteristics correlate with surface roughness, quantified by metrics such as V_scale, V_mean, and V_std. The authors also present V-Touching, a VR application that renders aligned visual and tactile feedback via a haptic glove and a client-server pipeline, using Unity and real-world texture data to demonstrate spatiotemporal consistency in tactile rendering. A noted limitation is the single-axis data collection (Y-direction), with future work focusing on multi-direction sampling to improve tactile accuracy and realism.
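
The two mappings can be pictured as a chain through world coordinates: a taxel index is placed in the world frame, and the world point is then projected into the texture image. The sketch below is only an illustration under simplifying assumptions that are not taken from the paper: a planar surface, an axis-aligned camera with isotropic scale, and a regular taxel grid. The `Calibration` class, its field names, and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Calibration:
    # All values are hypothetical and chosen only for illustration.
    px_per_mm: float = 2.0                 # image scale (assumed isotropic, no rotation)
    pixel_origin: tuple = (100.0, 50.0)    # pixel (u, v) where the world origin appears
    taxel_pitch_mm: float = 0.5            # spacing between adjacent taxels
    taxel_origin_mm: tuple = (0.0, 0.0)    # world (x, y) of taxel (row 0, col 0)


def taxel_to_world(row, col, cal):
    """Taxel-to-world: place a taxel index on the surface, in millimetres."""
    x = cal.taxel_origin_mm[0] + col * cal.taxel_pitch_mm
    y = cal.taxel_origin_mm[1] + row * cal.taxel_pitch_mm
    return x, y


def world_to_pixel(x, y, cal):
    """World-to-pixel: project a surface point into the texture image."""
    u = cal.pixel_origin[0] + x * cal.px_per_mm
    v = cal.pixel_origin[1] + y * cal.px_per_mm
    return u, v


def pixel_to_taxel(u, v, cal):
    """Invert the chain: pixel -> world -> nearest taxel index."""
    x = (u - cal.pixel_origin[0]) / cal.px_per_mm
    y = (v - cal.pixel_origin[1]) / cal.px_per_mm
    col = round((x - cal.taxel_origin_mm[0]) / cal.taxel_pitch_mm)
    row = round((y - cal.taxel_origin_mm[1]) / cal.taxel_pitch_mm)
    return row, col


cal = Calibration()
u, v = world_to_pixel(*taxel_to_world(4, 10, cal), cal)
taxel = pixel_to_taxel(u, v, cal)
```

With these assumed values, taxel (row 4, col 10) sits at world point (5 mm, 2 mm) and round-trips back to the same taxel index. A real setup would likely replace the scale-and-offset projection with a calibrated camera model.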

Abstract

With the development of VR technology, especially the emergence of the metaverse concept, the integration of visual and tactile perception has become an expected experience in human-machine interaction. Therefore, achieving spatial-temporal consistency of visual and tactile information in VR applications has become a necessary factor for realizing this experience. The state-of-the-art vibrotactile datasets generally contain temporal-level vibrotactile information collected by randomly sliding on the surface of an object, along with the corresponding image of the material/texture. However, they lack the position/spatial information that corresponds to the signal acquisition, making it difficult to achieve spatiotemporal alignment of visual-tactile data. To address this, we develop a new data acquisition system that can collect visual and vibrotactile signals of different textures/materials with spatial and temporal consistency. In addition, we develop a VR-based application called "V-Touching" by leveraging the dataset generated by the new acquisition system. It provides pixel-to-taxel joint visual-tactile perception when sliding over the surface of objects in the virtual environment, with distinct vibrotactile feedback for different textures/materials.

Paper Structure

This paper contains 8 sections, 7 equations, 6 figures, and 1 table.

Figures (6)

  • Figure 1: The pipeline of our work. We first collect vibrotactile data with an accelerometer by sliding back and forth on the surface of physical objects. These data are then used to build a vibration map that is spatially aligned with the captured texture images. We then develop a VR application, called V-Touching, using the dataset generated by the proposed acquisition system. The user controls a haptic glove to interact with the surface of virtual objects, and the server transmits the corresponding vibrotactile signals to the client with satisfactory spatial-temporal alignment of visual-tactile perception.
  • Figure 2: System setup. The developed system consists of both visual and tactile data acquisition devices. The origin of the world coordinate system is placed at the base of the robotic arm, as shown by the red coordinate system in the bottom-left corner.
  • Figure 3: (a) The blue dashed line represents the position variation signal acquired by the robot, while the orange line represents the fitted signal of the robot's uniform motion. (b) The blue dashed line represents the raw signal collected by the accelerometer, while the orange line represents the intercepted (cropped) accelerometer signal.
  • Figure 4: Qualitative results of data collection. The top shows the captured texture images, and the bottom displays the generated vibration maps of collected vibrotactile data, where brighter areas indicate stronger intensity.
  • Figure 5: An illustration of the framework of the developed VR application, V-Touching. In this application, the client controls the poses of different parts of the hand. The server applies these poses to touch textured virtual objects, then encodes the vibrotactile signals and transmits them to the client. The client decodes the signals, and the haptic glove generates vibrations corresponding to specific positions.
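
The processing suggested by the captions of Figures 3 and 4 can be sketched roughly as follows: fit a constant-velocity line to the robot's position signal, keep only the accelerometer samples recorded during the sliding interval, convert their timestamps to positions along the Y direction, and accumulate a per-taxel intensity. This is a minimal sketch, not the authors' implementation: the function names, the use of RMS as the intensity measure, the bin layout, and the synthetic data are all assumptions made for illustration.

```python
import math


def fit_uniform_motion(t, y):
    """Ordinary least-squares line y = v*t + b (constant-velocity model)."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    cov = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    var = sum((ti - t_mean) ** 2 for ti in t)
    v = cov / var
    return v, y_mean - v * t_mean


def vibration_row(acc_t, acc, t_start, t_end, v, b, y_min, pitch, n_taxels):
    """One row of a vibration map: RMS acceleration per taxel along Y (assumed metric)."""
    energy = [0.0] * n_taxels
    count = [0] * n_taxels
    for ti, ai in zip(acc_t, acc):
        if not (t_start <= ti <= t_end):
            continue  # crop: discard samples outside the sliding interval
        idx = math.floor((v * ti + b - y_min) / pitch)  # position -> taxel index
        if 0 <= idx < n_taxels:
            energy[idx] += ai * ai
            count[idx] += 1
    # Taxels that received no samples are left at 0.0.
    return [math.sqrt(e / c) if c else 0.0 for e, c in zip(energy, count)]


# Synthetic stroke (hypothetical numbers): 50 mm/s starting at y = 10 mm.
robot_t = [i / 100 for i in range(101)]
robot_y = [10.0 + 50.0 * ti for ti in robot_t]
v, b = fit_uniform_motion(robot_t, robot_y)

# The accelerometer records before and after the stroke, so it must be cropped.
acc_t = [i / 1000 for i in range(-200, 1200)]
acc = [2.0 if i % 2 == 0 else -2.0 for i in range(len(acc_t))]
row = vibration_row(acc_t, acc, 0.0, 1.0, v, b, y_min=10.0, pitch=5.0, n_taxels=10)
```

Stacking such rows for successive strokes would give a 2D map that can be rendered as an image, with brighter cells for larger values. Because the source describes sampling along a single axis (the Y direction), the sketch bins along one coordinate only.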