Measuring and identifying factors of individuals' trust in Large Language Models
Edoardo Sebastiano De Duro, Giuseppe Alessandro Veltri, Hudson Golino, Massimo Stella
TL;DR
This study develops and validates the Trust-In-LLMs Index (TILLMI), a two-factor psychometric tool for measuring trust in large language models, grounded in McAllister's affective and cognitive trust. By combining item generation with LLM-simulated validity and extensive human data, the authors demonstrate a robust two-factor structure comprising closeness with LLMs (affective) and reliance on LLMs (cognitive), supported by CFA, reliability, and convergent/divergent validity analyses. The scale correlates meaningfully with personality traits, cognitive flexibility, and emotional distress measures, and reveals demographic differences (younger males reporting higher trust) as well as higher trust among LLM users versus non-users. These findings offer a quantitative foundation to study and design AI-mediated verbal interactions, guiding responsible deployment and balanced human–AI collaboration while highlighting future cross-cultural and context-specific extensions.
Abstract
Large Language Models (LLMs) can engage in human-like conversational exchanges. Although such conversations can elicit trust between users and LLMs, little empirical research has examined trust formation in human-LLM contexts beyond LLMs' trustworthiness or human trust in AI in general. Here, we introduce the Trust-In-LLMs Index (TILLMI) as a new framework to measure individuals' trust in LLMs, extending McAllister's cognitive and affective trust dimensions to LLM-human interactions. We developed TILLMI as a psychometric scale, prototyped with a novel protocol we called LLM-simulated validity. The LLM-based scale was then validated in a sample of 1,000 US respondents. Exploratory Factor Analysis identified a two-factor structure. Two items were then removed due to redundancy, yielding a final 6-item scale with a 2-factor structure. Confirmatory Factor Analysis on a separate subsample showed strong model fit ($CFI = .995$, $TLI = .991$, $RMSEA = .046$, $p_{\chi^2} > .05$). Convergent validity analysis revealed that trust in LLMs correlated positively with openness to experience, extraversion, and cognitive flexibility, but negatively with neuroticism. Based on these findings, we interpreted TILLMI's factors as "closeness with LLMs" (affective dimension) and "reliance on LLMs" (cognitive dimension). Younger males exhibited higher closeness with and reliance on LLMs than older women. Individuals with no direct experience with LLMs exhibited lower levels of trust than LLM users. These findings offer a novel empirical foundation for measuring trust in AI-driven verbal communication, informing responsible design, and fostering balanced human-AI collaboration.
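For readers less familiar with the fit indices reported above, the following is a minimal sketch of how CFI, TLI, and RMSEA are conventionally derived from the chi-square statistics of a fitted model and its baseline (independence) model. The function name `fit_indices` and all numeric inputs are hypothetical and chosen only for illustration; they are not the study's data. The degrees of freedom (8 for the model, 15 for the baseline) assume a 6-indicator model with two correlated factors, no cross-loadings, and no correlated residuals. The RMSEA here uses the $N - 1$ denominator; some software uses $N$ instead.

```python
import math


def fit_indices(chi2_model, df_model, chi2_base, df_base, n):
    """Return CFI, TLI, and RMSEA from model and baseline chi-square values.

    Conventional textbook formulas; estimator-specific corrections
    (e.g. robust or scaled chi-square) are not handled.
    """
    # Noncentrality estimates, floored at zero
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, 0.0)

    # CFI: relative reduction in noncentrality versus the baseline model
    denom = max(d_model, d_base)
    cfi = 1.0 if denom == 0 else 1.0 - d_model / denom

    # TLI: compares chi-square/df ratios, penalising model complexity
    ratio_base = chi2_base / df_base
    ratio_model = chi2_model / df_model
    tli = (ratio_base - ratio_model) / (ratio_base - 1.0)

    # RMSEA: misfit per degree of freedom, scaled by sample size
    rmsea = math.sqrt(d_model / (df_model * (n - 1)))

    return {"cfi": cfi, "tli": tli, "rmsea": rmsea}


# Hypothetical inputs for illustration only (not values from the study)
result = fit_indices(chi2_model=16.0, df_model=8,
                     chi2_base=815.0, df_base=15, n=501)
print({name: round(value, 3) for name, value in result.items()})
```

Under common rules of thumb, CFI and TLI values near or above .95 and RMSEA values near or below .06 are read as good fit, which is the sense in which the reported values indicate strong fit.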
