The GPT-WritingPrompts Dataset: A Comparative Analysis of Character Portrayal in Short Stories

Xi Yu Huang; Krishnapriya Vishnubhotla; Frank Rudzicz

TL;DR

Stories generated by GPT-3.5 differ significantly from human-written stories along all six emotional and descriptive dimensions studied, and human and machine stories display similar biases when grouped by narrative point-of-view and by the gender of the protagonist.

Abstract

The improved generative capabilities of large language models have made them a powerful tool for creative writing and storytelling. It is therefore important to quantitatively understand the nature of generated stories, and how they differ from human storytelling. We augment the Reddit WritingPrompts dataset with short stories generated by GPT-3.5, given the same prompts. We quantify and compare the emotional and descriptive features of storytelling from both generative processes, human and machine, along a set of six dimensions. We find that generated stories differ significantly from human stories along all six dimensions, and that human and machine generations display similar biases when grouped according to the narrative point-of-view and gender of the main protagonist. We release our dataset and code at https://github.com/KristinHuangg/gpt-writing-prompts.

Paper Structure (20 sections, 2 equations, 3 figures, 7 tables)

Introduction
Background
Generative Models of Natural Language
Biases in Stories
The GPT-WritingPrompts Dataset
Generating Artificial Stories
Methodology
Characterizing the Point-of-View
Identifying the protagonist
Identifying the point-of-view
Extracting Protagonist Attributes
Dimensions of Entity Portrayal
Converting Attributes to Scores
Evaluation
Analysis
...and 5 more sections

Figures (3)

Figure 1: The proportion of stories that fall under the five inferred point-of-view (PoV) categories for stories written by humans and generated by GPT-3.5, grouped by the inferred PoV of the prompt.
Figure 2: Distribution of z-scored COMET (shaded dark) and spaCy (lightly shaded) attribute scores along each dimension for the different PoV groups, and human and GPT-3.5 generated stories.
Figure 3: Distribution of prompt-wise differences in mean scores (a) between human and gpt-3.5-turbo generations, and (b) compared to a human control group.
