Uncovering Differences in Persuasive Language in Russian versus English Wikipedia

Bryan Li; Aleksey Panasyuk; Chris Callison-Burch

TL;DR

We develop a large language model (LLM) powered system that identifies persuasive language in multilingual texts by asking high-level questions (HLQs), and we show that HLQs perform similarly whether posed in English or Russian.

Abstract

We study how differences in persuasive language across Wikipedia articles, written in English or Russian, can uncover each culture's distinct perspective on different subjects. We develop a large language model (LLM) powered system to identify instances of persuasive language in multilingual texts. Directly prompting LLMs to detect persuasion is subjective and difficult, so we reframe the task as asking high-level questions (HLQs) that capture different persuasive aspects. Importantly, these HLQs are authored by the LLMs themselves: LLMs over-generate a large set of HLQs, which are then filtered to a small set aligned with human labels for the original task. We apply our approach to a large-scale, bilingual dataset of Wikipedia articles (88K total), using a two-stage identify-then-extract prompting strategy to find instances of persuasion. We quantify the amount of persuasion per article and explore the differences in persuasion through several experiments on the paired articles. Notably, we generate rankings of articles by persuasion in both languages. These rankings match our intuitions about culturally salient subjects: Russian Wikipedia highlights subjects related to Ukraine, while English Wikipedia highlights the Middle East. Grouping subjects into larger topics, we find that politically related events contain more persuasion than others. We further demonstrate that HLQs obtain similar performance when posed in either English or Russian. Our methodology enables cross-lingual, cross-cultural understanding at scale, and we release our code, prompts, and data.
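The over-generate-then-filter step described in the abstract can be illustrated with a minimal sketch. It assumes each candidate HLQ has already been answered yes/no by an LLM on every labeled article, and it ranks questions by F1 agreement with the human labels. The F1 criterion, the function names (`f1_agreement`, `select_hlqs`), and the toy questions and answers are illustrative assumptions, not the selection procedure used in the paper.

```python
from typing import Dict, List


def f1_agreement(answers: List[bool], labels: List[bool]) -> float:
    """F1 score of one question's yes/no answers against the human labels."""
    tp = sum(a and l for a, l in zip(answers, labels))
    fp = sum(a and not l for a, l in zip(answers, labels))
    fn = sum((not a) and l for a, l in zip(answers, labels))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_hlqs(
    candidate_answers: Dict[str, List[bool]], labels: List[bool], k: int
) -> List[str]:
    """Keep the k candidate questions whose answers best match the labels."""
    ranked = sorted(
        candidate_answers,
        key=lambda q: f1_agreement(candidate_answers[q], labels),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy data: one human label per article, and one LLM answer
# per article for each over-generated candidate question.
labels = [True, False, True, True, False]
candidates = {
    "Does the text use emotionally charged wording?": [True, False, True, True, False],
    "Does the text mention a date?": [True, True, False, True, True],
    "Does the text attack a person rather than an argument?": [True, False, True, False, False],
}

selected = select_hlqs(candidates, labels, k=2)
print(selected)
```

In this toy run the question about dates agrees least with the labels and is filtered out; any alignment measure could be swapped in for F1.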

Paper Structure (41 sections, 1 equation, 12 figures, 8 tables)

This paper contains 41 sections, 1 equation, 12 figures, 8 tables.

Introduction
Related Work
Biases in Wikipedia
Multilingual biases of LLMs
Russian vs Western perspectives
AI-assisted report generation
Task Formulation
Definitions Used
Datasets Used
Selecting a Dataset in Russian and English
Baseline for Persuasion Detection
Approach: Direct Prompting with Definitions
Baseline makes LLMs over-confident
High-Level Questioning (HLQ)
Generating candidate questions for each persuasion technique
...and 26 more sections

Figures (12)

Figure 1: Overview of our approach for persuasion detection. Top: an LLM generates many high-level questions (HLQs), based on its own understanding of persuasion techniques. We then pose these HLQs to articles from a labeled persuasion dataset (Piskorski et al., 2023), and select a subset of 12 questions that are most aligned with the human labels. Bottom: on another dataset, we use HLQs to prompt an LLM to identify-then-extract persuasive spans. This is done over paired Wikipedia articles in Russian and English, facilitating cross-lingual comparison.
Figure 2: A comparison between two prompting approaches to persuasion technique detection. The baseline (left) directly uses the human-authored definitions. However, as these definitions were written for trained human annotators, the LLM misunderstands them and is over-sensitive and over-confident. Our proposed approach (right) instead leverages the LLM to decompose the task itself. Specifically, we elicit HLQs with a separate prompt (see the HLQ generation figure). Then, we prompt with HLQs instead of definitions.
Figure 3: Depiction of the method to compare persuasive language usage across languages. For each language, we use HLQ prompts monolingually on all articles to extract persuasive text spans (left: en, right: ru). We compare both persuasive count (PC) and persuasive frequency (PF) between the paired articles. In this case study, the Russian article (and its translation, ru2en) is more persuasive on '2021 Russian protests'.
Figure 4: The effectiveness of the classifiers after each feature reduction using ANOVA. F1 performance is relatively stable across metrics from 30 to 8 features, and declines afterwards.
Figure 5: A scatterplot where the x and y positions represent the NPF values of Russian and English articles, respectively. The dashed line indicates equal NPF, i.e. the subjects where English and Russian have similar levels of emotional content. The further a point is from this line, the more the paired articles differ in their use of persuasive content.
...and 7 more figures
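
The paired-article comparison in Figure 3 can be sketched as follows. Persuasive count (PC) is taken here as the number of extracted spans. Persuasive frequency (PF) is assumed, for illustration only, to be PC divided by the number of sentences in the article, since the exact normalization is not stated in this summary. The `ArticleResult` type, the `more_persuasive` helper, and the toy values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ArticleResult:
    """Extraction output for one article in one language (hypothetical type)."""
    lang: str
    num_sentences: int
    spans: List[str]  # persuasive spans extracted by the HLQ prompts

    @property
    def pc(self) -> int:
        """Persuasive count: number of extracted spans."""
        return len(self.spans)

    @property
    def pf(self) -> float:
        """Persuasive frequency: PC per sentence (assumed normalization)."""
        return self.pc / self.num_sentences if self.num_sentences else 0.0


def more_persuasive(a: ArticleResult, b: ArticleResult) -> str:
    """Return the language code of the article with the higher PF, or 'tie'."""
    if a.pf == b.pf:
        return "tie"
    return a.lang if a.pf > b.pf else b.lang


# Toy paired articles on the same subject (made-up values).
en = ArticleResult(lang="en", num_sentences=40, spans=["span a", "span b"])
ru = ArticleResult(lang="ru", num_sentences=30, spans=["span a", "span b", "span c"])

winner = more_persuasive(en, ru)
print(en.pc, ru.pc, winner)
```

Normalizing by length keeps a long article from looking more persuasive simply because it contains more text.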
