
ArrayTac: A tactile display for simultaneous rendering of shape, stiffness and friction

Tianhai Liang, Shiyi Guo, Baiye Cheng, Zhengrong Xue, Han Zhang, Huazhe Xu

Abstract

Human-computer interaction in the visual and auditory domains has achieved considerable maturity, yet machine-to-human tactile feedback remains underdeveloped. Existing tactile displays struggle to simultaneously render multiple tactile dimensions, such as shape, stiffness, and friction, which limits the realism of haptic simulation. Here, we present ArrayTac, a piezoelectric-driven tactile display capable of simultaneously rendering shape, stiffness, and friction to reproduce realistic haptic signals. The system comprises a 4×4 array of 16 actuator units, each employing a three-stage micro-lever mechanism to amplify the micrometer-scale displacement of the piezoelectric element, with Hall sensor-based closed-loop control at the end effector to enhance response speed and precision. We further implement two end-to-end pipelines: 1) a vision-to-touch framework that converts visual inputs into tactile signals using multimodal foundation models, and 2) a real-time tele-palpation system operating over distances of several thousand kilometers. In user studies, first-time participants accurately identified object shapes and physical properties with high success rates. In a tele-palpation experiment over 1,000 km, untrained volunteers correctly identified both the number and type of tumors in a breast phantom with 100% accuracy and precisely localized their positions. The system pioneers a new pathway for high-fidelity haptic feedback by introducing the unprecedented capability to simultaneously render an object's shape, stiffness, and friction, delivering a holistic tactile experience that was previously unattainable.

Paper Structure

This paper contains 33 sections, 24 equations, 10 figures, 1 table.

Figures (10)

  • Figure 1: The ArrayTac system: architecture, hardware, and control pipeline. (A) System workflow, showing the data pipeline from multi-source tactile data collection to haptic rendering. (B) Photograph of the complete experimental setup with its main hardware components. (C) Schematic diagram of a single actuator unit mechanism. (D) External view of a fully assembled actuator unit. (E) Internal perspective view of an actuator unit. (F) Information flow block diagram detailing communication between components: user input from the XYZ sliding platform, high-frequency communication between the display array and drive circuit (10 kHz), and communication with the host computer. The system can be integrated with any Human-Computer Interaction (HCI) algorithm in a plug-and-play manner.
  • Figure 2: Unit control method and performance. (A) Electromagnetic field simulation process performed using Ansys EDT. (B) Simulated normal magnetic flux density ($B_n$) detected by the Hall sensor as a function of end-effector position at various sensor installation angles (0° to 40°). The solid lines are cubic polynomial fits to the data, with the coefficient of determination ($R^2$) for each fit shown in the legend. The consistently high $R^2$ values demonstrate the model's robustness against potential assembly misalignments of the sensor ($n=5$). (C) Experimentally measured relationship between Hall sensor feedback and end-effector displacement for each unit. For clarity, only diagonal units are plotted, while the heatmap reports the coefficient of determination ($R^2$) for all 16 units ($n=3$). For (B) and (C), data points are shown as mean ± s.d., where $n$ represents the number of samples. (D) Step response of a single display unit. (E) Bode plot of a single display unit under open-loop control. (F) Bode plot of a single display unit under PID closed-loop control. For (E) and (F), $\omega_{BW}$ denotes the -3 dB magnitude frequency and $\omega_{PW}$ denotes the -90$^\circ$ phase-lag frequency.
  • Figure 3: Volunteer experiments on shape, stiffness, and friction perception. (A) 3D models used in the shape discrimination experiment. (B) Zero-shot performance scores of volunteers during their first use of the device without any prior training ($n=22$). In the box plot of (B), the central mark indicates the median, and the red '+' symbol indicates the mean. The bottom and top edges of the box represent the 25th and 75th percentiles, respectively. The whiskers extend to the maximum and minimum values excluding the outliers, which are plotted individually as orange points. (C) Confusion matrix of shape identification when volunteers were provided only with the list of candidate options ($n=22$). (D) Confusion matrix of shape identification after volunteers completed training ($n=22$). (E) Pairwise preference heatmap for stiffness discrimination, illustrating the proportion of trials where $k_1$ was judged stiffer than $k_2$ ($n=22\times5$). (F) Confusion matrix of classification across different stiffness levels ($n=22\times5$). (G) Pairwise preference heatmap for friction discrimination, illustrating the proportion of trials where $f_1$ was judged rougher than $f_2$ ($n=22\times5$). (H) Confusion matrix of classification across different friction levels ($n=22\times5$).
  • Figure 4: The Tac-Anything framework: architecture, experimental validation, and performance. (A) An overview of the Tac-Anything framework, which extracts and renders multidimensional tactile semantics (shape, stiffness, and friction) from a single RGB image. (B) The eight real-world objects with diverse tactile properties used in the user study. (C) The two photographic scenes used as the ground truth. (D) Average sketching maps of the Haptic Scene Sketching task ($n=22$). Participant sketches, including their normalized annotations for shape, stiffness, and friction, are compared against the ground truth tactile maps for both scenes. (E) Quantitative analysis of user performance presented as box plots ($n=22$). The left plot shows the Intersection over Union (IoU) for the Haptic Scene Sketching task for each object and the result for all objects ('all'). The right plot shows the object placement accuracy for the Object Identification and Placement task across four conditions: Scene 1, Unrestricted Placement (S1UP); Scene 1, Constrained Placement (S1CP); Scene 2, Unrestricted Placement (S2UP); and Scene 2, Constrained Placement (S2CP). (F) Average placement maps for the Object Identification and Placement task ($n=22$), aggregating the final objects and positions chosen by all participants.
  • Figure 5: Experimental Setup and Results for the Tele-Palpation Task. (A) The architecture of the Tele-Touch system. Operators can use the ArrayTac interface locally to manipulate robotic arms located anywhere in the world and perceive remote tactile information. Data streams are transmitted through a cloud server to enable real-time interaction. (B) Internal design of the two tumor phantom tissue models. The left phantom was used for the "Tumor Localization" task, while the right phantom was used for the "Property Discrimination" task. (C) The graphical user interface (GUI) for system control. (D) Screenshot of the real-time video viewed by participants during the experiment, showing a GelSight tactile sensor mounted on the robotic arm’s end effector. (E) Visualized results for the tumor localization task ($n=22$). (F) Quantitative analysis of localization error ($n=22$). The box plot shows the distance between participant-identified centers and the ground truth for two targets and the aggregated data. (G) Pairwise preference heatmap for the tumor severity ranking task ($n=22$). Each cell value represents the percentage of participants who perceived "Compared Tumor 2" (Y-axis) as more severe/harder than "Compared Tumor 1" (X-axis). (H) Distribution of Kendall's tau ($\tau$) coefficients for the severity ranking data ($n=22$).
  • ...and 5 more figures
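
The captions for Figures 4E and 5H name two evaluation metrics: Intersection over Union (IoU) for the sketching task and Kendall's tau ($\tau$) for the severity ranking task. The sketch below illustrates how such metrics are commonly computed. It is not the authors' evaluation code: the function names, the 4×4 grids, and the four-item ranking are hypothetical, and the tau shown is the tie-free tau-a variant, which may differ from the variant used in the paper.

```python
from itertools import combinations


def iou(mask_a, mask_b):
    """Intersection over Union of two equally sized binary grids."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    # Two empty masks are treated as a perfect match (a convention chosen here).
    return inter / union if union else 1.0


def kendall_tau(rank_x, rank_y):
    """Kendall's tau-a: (concordant - discordant) / total pairs, no tie handling."""
    concordant = discordant = 0
    for i, j in combinations(range(len(rank_x)), 2):
        sign = (rank_x[i] - rank_x[j]) * (rank_y[i] - rank_y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(rank_x) * (len(rank_x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical ground-truth region and participant sketch on a 4x4 grid.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
sketch = [[0, 0, 1, 1],
          [0, 0, 1, 1],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
sketch_iou = iou(truth, sketch)  # 2 shared cells out of 6 covered cells

# Hypothetical severity ranks: ground truth vs. one participant's ordering.
true_rank = [1, 2, 3, 4]
perceived_rank = [1, 3, 2, 4]
tau = kendall_tau(true_rank, perceived_rank)  # one swapped pair out of six

print(f"IoU = {sketch_iou:.3f}, tau = {tau:.3f}")
```

An IoU of 1 means the sketch covers exactly the ground-truth region, and a tau of 1 means the participant's ordering matches the reference ordering; a tau of -1 means it is fully reversed.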