Knowing Your Uncertainty -- On the application of LLM in social sciences
Bolun Zhang, Linzhuo Li, Yunqi Chen, Qinlin Zhao, Zihan Zhu, Xiaoyuan Yi, Xing Xie
TL;DR
This paper addresses the critical challenge of uncertainty when applying LLMs to social science research. It introduces a two-dimensional task-validation framework (T, V) that tailors uncertainty quantification (UQ) to the specific task and to the validation data available, integrating epistemic and aleatoric sources of uncertainty and treating prompts as a latent space. Through multiple illustrative studies (sentiment analysis, topic labeling, exploratory coding, historical counterfactuals), it demonstrates the utility and limits of various UQ methods and emphasizes that metric choice should be driven by task and validation context. The authors advocate an uncertainty-first workflow, open-source tooling, and careful research design to prevent overclaiming and to ensure rigorous, replicable social-science insights.
Abstract
Large language models (LLMs) are rapidly being integrated into computational social science research, yet their black-box training and the stochastic elements deliberately built into inference pose unique challenges for scientific inquiry. This article argues that applying LLMs to social scientific tasks requires explicit assessment of uncertainty, an expectation long established both in quantitative social science methodology and in machine learning. We introduce a unified framework for evaluating LLM uncertainty along two dimensions: the task type (T), which distinguishes between classification, short-form generation, and long-form generation, and the validation type (V), which captures the availability of reference data or evaluative criteria. Drawing on both the computer science and the social science literature, we map existing uncertainty quantification (UQ) methods to this T-V typology and offer practical recommendations for researchers. Our framework provides both a methodological safeguard and a practical guide for integrating LLMs into rigorous social science research.
