Exploring Human-AI Perception Alignment in Sensory Experiences: Do LLMs Understand Textile Hand?

Shu Zhong, Elia Gatti, Youngjun Cho, Marianna Obrist

TL;DR

This study probes perceptual alignment between humans and LLMs in tactile texture experiences using a textile-hand paradigm. It introduces an interactive Guess What Textile task in which participants describe how a target textile feels to an LLM, which then identifies the target from a fixed embedding space of 20 textiles. Across 40 participants and 80 tasks, the results show modest overall alignment (18/80 correct, 22.5%), with strong textile-specific biases (e.g., silk satin scored highly while cotton denim did not). The work combines objective success with subjective validity and similarity assessments, revealing polarization in human judgments and pointing to data-driven limitations in tactile language and embeddings. It also outlines future directions, including leveraging multimodal LLMs to improve perceptual alignment in everyday tactile interactions.

Abstract

Aligning the behaviour of large language models (LLMs) with human intent is critical for future AI. An important yet often overlooked aspect of this alignment is perceptual alignment. Perceptual modalities like touch are more multifaceted and nuanced than other sensory modalities such as vision. This work investigates how well LLMs align with human touch experiences using the "textile hand" task. We created a "Guess What Textile" interaction in which participants were given two textile samples -- a target and a reference -- to handle. Without seeing them, participants described the differences between them to the LLM. Using these descriptions, the LLM attempted to identify the target textile by assessing similarity within its high-dimensional embedding space. Our results suggest that a degree of perceptual alignment exists, but that it varies significantly among textile samples. For example, LLM predictions are well aligned for silk satin, but not for cotton denim. Moreover, participants did not perceive the LLM predictions as closely matching their textile experiences. This is only the first exploration into perceptual alignment around touch, exemplified through textile hand. We discuss possible sources of this alignment variance, and how better human-AI perceptual alignment can benefit future everyday tasks.
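
The identification step described in the abstract can be illustrated with a minimal sketch. It assumes cosine similarity as the similarity measure and uses made-up three-dimensional vectors; the abstract only states that similarity is assessed within a high-dimensional embedding space, so the measure, the vectors, the ID `textile_c`, and the function names below are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_match(query_vec, textile_vecs, exclude=()):
    """Return the textile ID whose stored vector is most similar to the query vector.

    `exclude` is an assumed convenience for skipping IDs such as the
    reference textile; the source does not say how this is handled.
    """
    candidates = {tid: vec for tid, vec in textile_vecs.items() if tid not in exclude}
    if not candidates:
        raise ValueError("no candidate textiles left to compare")
    return max(candidates, key=lambda tid: cosine_similarity(query_vec, candidates[tid]))

# Toy stand-ins for pre-built textile embeddings (illustrative values, not study data).
TOY_EMBEDDINGS = {
    "silk_satin": [0.9, 0.1, 0.0],
    "cotton_denim": [0.1, 0.9, 0.2],
    "textile_c": [0.0, 0.3, 0.9],
}

# Toy stand-in for the embedding of a participant's description.
query = [0.8, 0.2, 0.1]
prediction = best_match(query, TOY_EMBEDDINGS)
```

In a real system the query vector would come from embedding the participant's transcribed description with the same model that produced the textile embeddings; that step is omitted here.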

Paper Structure (28 sections, 4 equations, 9 figures)

Figures (9)

  • Figure 1: The overall design of the "Guess What Textile?" task. Participants touch two textiles (a target and a reference textile) placed inside a box to hide any visual influences. The AI guessing system knows only the reference textile and is required to make a prediction of the target textile based on participants' descriptions. The task is iterative, and stops only when a correct prediction is made or when the maximum number of five attempts is reached. ASR stands for Automated Speech Recognition.
  • Figure 2: Overview of the user study setup: (a) A participant sitting comfortably at a desk with their hands placed through dedicated openings in a black box that contains the textile samples. Opposite the participant, the researcher gives instructions and hands over the textiles selected by the AI system as part of the study task. Each textile sample has a unique ID. (b) A participant handling one textile sample in each hand: one is the reference textile (the starting point for the "Guess What Textile" task) and the other is the target textile described to the AI system.
  • Figure 3: Overview of the 20 textile samples selected for the user study.
  • Figure 4: An overview of the AI Guessing System, i.e. "Guess What Textile?". The vector search process compares pre-built embeddings for the 20 textile samples with a vector generated from the user's query to identify the best matching textile ID.
  • Figure 5: Screenshots of the user interface used in the user study. a) The AI system assigns two textile samples to a participant; b) the participant describes their experience of the target textile; c) the system makes a prediction for the target textile based on the participant's descriptions; d) the participant orally rates the validity of the AI's response on a scale from 1 to 10; e) the participant orally provides a similarity score on a scale from 1 to 10, comparing the AI-guessed textile to the actual target textile. Participants' responses are captured and entered by the researcher sitting opposite the participant, as shown in Figure 2.
  • ...and 4 more figures
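
The iterative procedure in the Figure 1 caption (guess, stop on a correct prediction, otherwise continue up to five attempts) can be sketched as a simple loop. This is a minimal illustration, not the authors' implementation: `run_task`, the `describe` and `predict` callables, the scripted stubs, and the textile IDs such as `t03` are hypothetical, and only the five-attempt limit and the stopping rule come from the caption.

```python
MAX_ATTEMPTS = 5  # maximum number of attempts stated in the Figure 1 caption

def run_task(target_id, reference_id, describe, predict, max_attempts=MAX_ATTEMPTS):
    """Run one guessing task and return its outcome.

    describe(attempt) -> str: the participant's description for this attempt.
    predict(description, reference_id, previous_guesses) -> str: the guessed textile ID.
    Both callables are hypothetical placeholders for the speech and search components.
    """
    guesses = []
    for attempt in range(1, max_attempts + 1):
        description = describe(attempt)
        guess = predict(description, reference_id, guesses)
        guesses.append(guess)
        if guess == target_id:
            # Stop as soon as the prediction is correct.
            return {"success": True, "attempts": attempt, "guesses": guesses}
    # Attempt limit reached without a correct prediction.
    return {"success": False, "attempts": max_attempts, "guesses": guesses}

def scripted_describe(attempt):
    """Stub participant: returns a placeholder description."""
    return f"description #{attempt}"

def scripted_predict(description, reference_id, previous_guesses):
    """Stub guesser: walks through a fixed list of made-up IDs, one per attempt."""
    order = ["t03", "t07", "t11", "t15", "t19", "t20"]
    return order[len(previous_guesses)]

solved = run_task("t11", "t01", scripted_describe, scripted_predict)
unsolved = run_task("t20", "t01", scripted_describe, scripted_predict)
```

Passing the earlier guesses to `predict` is a design assumption that would let a real guesser avoid repeating itself; the caption does not say whether the system does this.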