Investigating Societal Biases in a Poetry Composition System
Emily Sheng, David Uthus
TL;DR
This work addresses societal biases in a creative NLP task: retrieving next-verse suggestions in a poetry composition system. It introduces a bias-mitigation pipeline based on sentiment style transfer, in which a labeled poetry sentiment dataset and a BERT sentiment classifier drive data augmentation via the Delete, Retrieve, Generate (DRG) approach. Augmenting the next-verse training data with sentiment-shifted verses encourages the dual-encoder retrieval model to retrieve less negative, more positive verses while preserving content. Results show modest yet consistent improvements in the sentiment of retrieved verses at comparable relevance and usability, suggesting that style-transfer augmentation can reduce bias in retrieval for creative language tasks.
Abstract
There is a growing collection of work analyzing and mitigating societal biases in language understanding, generation, and retrieval tasks, though biases in creative tasks remain underexplored. Creative language applications are meant for direct interaction with users, so it is important to quantify and mitigate societal biases in these applications. We introduce a novel study on a pipeline to mitigate societal biases when retrieving next-verse suggestions in a poetry composition system. Our results suggest that data augmentation through sentiment style transfer has potential for mitigating societal biases.
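To make the augmentation idea concrete, the following is a minimal sketch of how sentiment-shifted verses could be added to next-verse training pairs. It is an illustration under stated assumptions, not the system described here: the lexicon `NEGATIVE_TO_POSITIVE` and the functions `toy_sentiment_score`, `toy_style_transfer`, and `augment` are hypothetical stand-ins for the BERT sentiment classifier and the DRG style-transfer model, and the example verses are invented.

```python
from typing import List, Tuple

# Hypothetical toy lexicon (an assumption for illustration only).
# The work itself relies on a trained classifier and DRG, not a word list.
NEGATIVE_TO_POSITIVE = {"bleak": "bright", "sorrow": "gladness", "weeps": "sings"}


def toy_sentiment_score(verse: str) -> int:
    """Return 0 for a neutral verse and a negative count of negative markers otherwise."""
    return -sum(word in NEGATIVE_TO_POSITIVE for word in verse.lower().split())


def toy_style_transfer(verse: str) -> str:
    """Swap negative marker words for positive ones, leaving other words unchanged.

    Note: this toy version also lowercases the verse.
    """
    return " ".join(NEGATIVE_TO_POSITIVE.get(w, w) for w in verse.lower().split())


def augment(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep every original (context, next verse) pair and append a
    sentiment-shifted copy for each pair whose next verse scores negative."""
    augmented = list(pairs)
    for context, next_verse in pairs:
        if toy_sentiment_score(next_verse) < 0:
            augmented.append((context, toy_style_transfer(next_verse)))
    return augmented


pairs = [
    ("the night is long", "my heart weeps in sorrow"),
    ("the river runs", "over quiet stones"),
]
augmented_pairs = augment(pairs)
```

In this sketch the original pairs are retained and only negative next verses gain a shifted counterpart, so a retrieval model trained on the augmented set would see a more positive alternative for the same context.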
