Quantitative Insights into Large Language Model Usage and Trust in Academia: An Empirical Study

Minseok Jung; Aurora Zhang; May Fung; Junho Lee; Paul Pu Liang

TL;DR

A quantitative study of LLM usage and trust among 125 respondents at a private R1 university addresses the lack of data underpinning AI policies in academia. Using a Qualtrics survey, the authors quantify adoption, trust, and prioritized concerns. They find widespread usage, positive associations between trust and adoption and between engagement and trust, and fact-checking as the top concern. The work argues for data-driven policy development that acknowledges pervasive use while emphasizing verification and accountability, and it discusses strategies to build trust through exposure and responsible use. Limitations include a single-institution sample; future work should broaden the scope across disciplines and settings and consider multilingual contexts.

Abstract

Large Language Models (LLMs) are transforming writing, reading, teaching, and knowledge retrieval in many academic fields. However, concerns about their misuse and erroneous outputs have led to varying degrees of trust in LLMs within academic communities. In response, various academic organizations have proposed and adopted policies regulating their usage. These policies are not yet based on substantial quantitative evidence, because data on usage patterns and user opinions are lacking. Consequently, there is a pressing need to accurately quantify LLM usage, user trust in LLM outputs, and the key issues to prioritize in deployment. This study addresses these gaps through a quantitative user study of LLM usage and trust in academic research and education. Specifically, we surveyed 125 individuals at a private R1 research university about their usage of LLMs, their trust in LLM outputs, and the key issues to prioritize for robust usage in academia. Our findings reveal: (1) widespread adoption of LLMs, with 75% of respondents actively using them; (2) a significant positive correlation between trust and adoption, as well as between engagement and trust; and (3) that fact-checking is the most critical concern. These findings suggest a need for policies that address pervasive usage, prioritize fact-checking mechanisms, and accurately calibrate user trust levels as users engage with these models. These strategies can help balance innovation with accountability and help integrate LLMs into the academic environment effectively and reliably.

Paper Structure (55 sections, 4 equations, 8 figures, 2 tables)

Introduction
Related Works
Deployment of large language models in academic environment
Trust in LLMs
Major issues facing LLM deployment
Methodology
Research Questions
Experimental Design
Approval
Study population
Data collection method
Data pre-processing
Data Analysis
Frequency Analysis.
Point-Biserial Correlation Coefficient.
...and 40 more sections

Figures (8)

Figure 1: An overview of the primary topics examined in our user study on language model usage in academia. The figure highlights two main phases and additional feedback: usage characteristics and concerns (red), polling on five core issues (green), and additional feedback (blue). The first section (red) answers RQ1-3 and the second section (green) answers RQ4. The additional feedback was collected for further development.
Figure 2: Distribution of respondents' weekly use time of language models. The majority of respondents use LLMs, and most use them for between 1 and 5 hours per week. In addition, 25% of respondents reported that they do not use these tools.
Figure 3: Point-biserial correlation between language model usage ("Do not use" vs. "Use") and trust level on a Likert scale. The dots show responses at trust levels 1 ("Completely distrust") to 4 ("Mostly trust"); there were no responses for level 5 ("Completely trust"). The middle point of each group is its mean, and the caps mark the extent of the error bar, which is about one standard deviation above and below the mean. The line connecting the mean values of the "Do not use" and "Use" groups shows a moderate positive correlation (r = 0.4601), indicating that adoption is linked to higher trust. This correlation is statistically significant (p = 7.316e-06).
Figure 4: A moderate positive correlation between trust levels and use time is observed using Kendall’s tau (with a Lowess fit represented by the blue curve). The color indicates the trust level: red for 'Completely distrust', orange for 'Mostly distrust', yellow for 'Neutral', and light green for 'Mostly trust'. No responses indicated 'Completely trust'. The size of the dots represents the number of respondents, and color intensity increases with the amount of use time. The result indicates that users who spend more time using language models tend to have higher trust in them.
Figure 5: This figure illustrates the diverse applications of LLMs across various tasks, from research and learning to data analysis, highlighting their versatility.
...and 3 more figures
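
The captions for Figures 3 and 4 name two statistics: the point-biserial correlation (binary usage vs. Likert trust) and Kendall's tau (use time vs. trust). The following is a minimal sketch of how these could be computed, not the authors' analysis code. The function names, the tau-b tie correction, and the sample data are all assumptions made for illustration; the paper's actual data and the Kendall variant it used are not given here.

```python
import math
from itertools import combinations


def point_biserial(groups, values):
    """Point-biserial r between a binary variable (0/1) and a numeric one.

    Hypothetical helper: `groups` holds 0 ("Do not use") or 1 ("Use"),
    `values` holds the numeric trust score for the same respondent.
    """
    g0 = [v for g, v in zip(groups, values) if g == 0]
    g1 = [v for g, v in zip(groups, values) if g == 1]
    n0, n1, n = len(g0), len(g1), len(values)
    mean0 = sum(g0) / n0
    mean1 = sum(g1) / n1
    mean_all = sum(values) / n
    # Population standard deviation of all values (divides by n).
    sd = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    return (mean1 - mean0) / sd * math.sqrt(n0 * n1 / n**2)


def kendall_tau_b(x, y):
    """Kendall's tau-b, which corrects for ties (common with Likert data).

    Assumption: the tau-b variant is used here for illustration only.
    """
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from all counts
        elif dx == 0:
            ties_x += 1  # tied only on x
        elif dy == 0:
            ties_y += 1  # tied only on y
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    pairs = concordant + discordant
    denom = math.sqrt((pairs + ties_x) * (pairs + ties_y))
    return (concordant - discordant) / denom


if __name__ == "__main__":
    # Made-up responses, NOT the study's data.
    uses_llm = [0, 0, 0, 1, 1, 1, 1, 1]       # 0 = "Do not use", 1 = "Use"
    trust = [1, 2, 2, 3, 2, 4, 3, 4]          # Likert trust level (1-5)
    hours_per_week = [0, 0, 0, 2, 1, 6, 3, 8]

    print("point-biserial r:", round(point_biserial(uses_llm, trust), 4))
    print("Kendall tau-b:", round(kendall_tau_b(hours_per_week, trust), 4))
```

The point-biserial coefficient is numerically the same as Pearson's r with one variable coded 0/1, which is why it suits a "Do not use" vs. "Use" split. Kendall's tau is rank-based, so it does not treat the gaps between Likert levels as equal.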
