Campus AI vs. Commercial AI: Comparing How Students and Employees Perceive their University's LLM Chatbot vs. ChatGPT
Leon Hannig, Annika Bush, Meltem Aksoy, Tim Trappen, Steffen Becker, Greta Ontrup
TL;DR
The paper examines how a university-provided, customized LLMaaS chatbot differs from the commercial ChatGPT in user trust, perceived privacy, perceived hallucinations, and sustainability-aware use. Grounded in the Trustworthiness Assessment Model, it theorizes that front-end cues (branding, interface) shape user perceptions and can lead to either calibrated or miscalibrated trust. In a field study with 526 participants, the 116 who used both systems reported higher trust, lower perceived privacy concerns, and fewer perceived hallucinations for the university chatbot than for ChatGPT, even though objective hallucination benchmarks suggested mixed or higher hallucination tendencies for the customized system. The study highlights the importance of careful cue design, persistent hallucination warnings, and transparency for encouraging appropriate use. It also outlines avenues for causal and mechanism-focused research to better align user perception with system capabilities. Practically, it offers guidance for deploying LLMaaS in universities to support safe, informed, and sustainable AI use, while acknowledging the gap between perception and objective model behavior.
Abstract
As the use of LLM chatbots by students and researchers becomes more prevalent, universities are pressed to develop AI strategies. One strategy that many universities pursue is to customize a pre-trained LLM-as-a-service (LLMaaS). While most studies on LLMaaS chatbots prioritize technical adaptations, we focus on the psychological effects of user-salient customizations, such as interface changes. We assume that such customizations influence users' perception of the system and are therefore important in guiding safe and appropriate use. In a field study, we examine how students and employees (N = 526) at a German university perceive and use their institution's customized LLMaaS chatbot compared to ChatGPT. Participants using both systems (n = 116) reported greater trust, higher perceived privacy, and fewer experienced hallucinations with their university's customized LLMaaS chatbot than with ChatGPT. We discuss theoretical implications for research on calibrated trust and offer guidance on the design and deployment of LLMaaS chatbots.
