Evaluating Machine Expertise: How Graduate Students Develop Frameworks for Assessing GenAI Content
Celia Chen, Alex Leitch
TL;DR
The paper addresses how graduate students assess machine-generated expertise in LLM-mediated web interactions. It uses a qualitative, multi-source study of 14 HCIM students to uncover three interrelated frameworks: protection of domain-relevant signals, verification-based trust, and cross-cultural navigation strategies. The findings highlight how professional identity shapes delegation decisions, how verification visibility drives trust, and how international students leverage prior institutional experience to manage AI tools. The work offers design implications for platforms to support users in developing robust evaluation frameworks, enhancing agency and reducing misinformation in AI-mediated environments.
Abstract
This paper examines how graduate students develop frameworks for evaluating machine-generated expertise in web-based interactions with large language models (LLMs). Through a qualitative study combining surveys, LLM interaction transcripts, and in-depth interviews with 14 graduate students, we identify patterns in how these emerging professionals assess and engage with AI-generated content. Our findings reveal that students construct evaluation frameworks shaped by three main factors: professional identity, verification capabilities, and system navigation experience. Rather than uniformly accepting or rejecting LLM outputs, students protect domains central to their professional identities while delegating others: managers preserve conceptual work, designers safeguard creative processes, and programmers maintain control over core technical expertise. These evaluation frameworks are further influenced by students' ability to verify different types of content and their experience navigating complex systems. This research contributes to web science by highlighting emerging human-GenAI interaction patterns and suggesting how platforms might better support users in developing effective frameworks for evaluating machine-generated expertise signals in AI-mediated web environments.
