TaMPERing with Large Language Models: A Field Guide for using Generative AI in Public Administration Research
Michael Overton, Barrie Robison, Lucas Sheneman
TL;DR
The paper introduces TaMPER, a structured framework for integrating Large Language Models into public administration research to address transparency, reproducibility, and replicability challenges. It details five decision points (Task, Model, Prompt, Evaluation, and Reporting) and provides concrete guidance on task definition, model selection, prompt design, evaluation protocols, and reporting standards, including an SBP (Sample-Benchmark-Population) protocol. The framework emphasizes methodological rigor, adaptability to evolving AI capabilities, and openness through detailed documentation and true structured outputs. By enabling robust, transparent, and ethically conscious use of LLMs, TaMPER aims to elevate the rigor and impact of PA scholarship in the era of Generative AI.
Abstract
The integration of Large Language Models (LLMs) into social science research presents transformative opportunities for advancing scientific inquiry, particularly in public administration (PA). However, the absence of standardized methodologies for using LLMs poses significant challenges for ensuring transparency, reproducibility, and replicability. This manuscript introduces the TaMPER framework, a structured methodology organized around five critical decision points: Task, Model, Prompt, Evaluation, and Reporting. The TaMPER framework provides scholars with a systematic approach to leveraging LLMs effectively while addressing key challenges such as model variability, prompt design, evaluation protocols, and transparent reporting practices.
