Developing Story: Case Studies of Generative AI's Use in Journalism

Natalie Grace Brigham; Chongjiu Gao; Tadayoshi Kohno; Franziska Roesner; Niloofar Mireshghallah

TL;DR

The paper investigates journalists' use of large language models by analyzing WildChat conversations to identify prompting behaviors and the degree of human intervention before machine-generated articles are published. It classifies tasks and stimuli, verifies candidate interactions by matching them to published articles, and uses ROUGE-L to quantify overlap, finding a median ROUGE-L of 0.62 between generated and published text and a prompt-to-publication span of about 1 day. It also detects broader dissemination of LLM-generated content with GPTZero and reveals privacy risks from external and private stimuli, including sensitive interviews. The work argues for responsible AI guidelines, improved AI literacy for practitioners, and cross-disciplinary collaboration to steer journalism's co-evolution with AI technology.

Abstract

Journalists are among the many users of large language models (LLMs). To better understand journalist-AI interactions, we conduct a study of LLM usage by two news agencies by browsing the WildChat dataset, identifying candidate interactions, and verifying them by matching them to articles published online. Our analysis uncovers instances where journalists provide sensitive material, such as confidential correspondence with sources or articles from other agencies, to the LLM as stimuli, prompt it to generate articles, and then publish these machine-generated articles with limited intervention (median output-publication ROUGE-L of 0.62). Based on our findings, we call for further research into what constitutes responsible use of AI, and for the establishment of clear guidelines and best practices on using LLMs in a journalistic context.
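To make the reported overlap figure concrete, the following is a minimal sketch of how a ROUGE-L score between an LLM output and a published article could be computed. It assumes lowercased whitespace tokenization and the F1 variant of ROUGE-L; the abstract does not state which tokenization or variant the authors used, and the function names `lcs_length` and `rouge_l_f1` are hypothetical, introduced only for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    # Dynamic programming over one row at a time to keep memory small.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 between two strings (assumed variant, see lead-in)."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens in the LCS
    recall = lcs / len(ref)      # share of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)


# Hypothetical strings, not taken from the dataset.
generated = "the council approved the new budget on tuesday"
published = "the council approved the revised budget on tuesday evening"
score = rouge_l_f1(generated, published)
```

Under this reading, a score near 1 means the published text preserves almost all of the generated text in order, so a median of 0.62 suggests that a large share of the generated wording survives into publication.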

Paper Structure (10 sections, 10 figures, 4 tables)

Introduction
Method
Findings
Conclusion
Social Impacts Statement
Extended Related Work
Task Type
Additional Case Study
Stimuli
ROUGE-L

Figures (10)

Figure 1: Case study of a single-turn journalist-LLM interaction for article generation in which an external article by another agency is used as the input stimulus. The journalist published the generated draft with little modification: the ROUGE-L between the generation and the published article we identified is 0.71. The '[' and ']' symbols denote portions of the text that have been replaced to minimize identifiability.
Figure 2: Case study of a multi-turn article generation using multiple stimuli, including an internal article from the same agency and an interview transcript. The '[' and ']' symbols denote portions of the text that have been replaced to minimize identifiability.
Figure 3: The distribution of different input stimulus material types provided by journalists to LLMs, over the WildChat conversations that are matched to online articles from the two identified news agencies.
Figure 4: Box plots of ROUGE-L scores between users' article-generation prompts and the LLM responses (left), and between the LLM-generated articles and the corresponding published articles (right).
Figure 5: Distribution of days from generation (based on date in WildChat) to publication (based on published article's date) for published articles matched to turns of article generation in WildChat.
...and 5 more figures
