NLP Case Study on Predicting the Before and After of the Ukraine-Russia and Hamas-Israel Conflicts

Jordan Miner, John E. Ortega

TL;DR

Social media discussion differs noticeably between the periods leading up to and following a conflict, and discourse on platforms like Twitter and Reddit is useful in identifying future conflicts before they arise.

Abstract

We propose a method to predict toxicity and other textual attributes using natural language processing (NLP) techniques for two recent events: the Ukraine-Russia and Hamas-Israel conflicts. This article provides a basis for exploring future conflicts, with the hope of mitigating risk by analyzing social media before and after a conflict begins. Our work compiles several datasets from Twitter and Reddit for both conflicts, separated into before and after periods, with the aim of predicting a future state of social media to support conflict avoidance. More specifically, we show that: (1) there is a noticeable difference in social media discussion leading up to and following a conflict and (2) social media discourse on platforms like Twitter and Reddit is useful in identifying future conflicts before they arise. Our results show that, through the use of advanced NLP techniques (both supervised and unsupervised), toxicity and other attributes of language before and after a conflict are predictable with a low error of nearly 1.2 percent for both conflicts.

Paper Structure (16 sections, 4 figures, 3 tables)

Figures (4)

  • Figure 1: Ukraine-Russia minimum, maximum, average, and total toxicity of topics created with Latent Dirichlet Allocation.
  • Figure 3: Prediction capability with Linear Regression on both conflicts using actual (x-axis) vs. predicted (y-axis) toxicity.
  • Figure 4: Prediction capability with BERT on both conflicts using actual (x-axis) vs. predicted (y-axis) toxicity.
  • Figure 5: Accuracy thresholds for the Ukraine-Russia conflict.
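
The per-topic statistics named in the Figure 1 caption (minimum, maximum, average, and total toxicity) could be computed roughly as in the sketch below. This is only an illustration under stated assumptions, not the authors' implementation: it assumes each post has already been assigned a topic id (for example by Latent Dirichlet Allocation) and a toxicity score, and the function name `summarize_toxicity_by_topic` and the sample values are hypothetical.

```python
from collections import defaultdict


def summarize_toxicity_by_topic(records):
    """Aggregate toxicity scores per topic.

    `records` is an iterable of (topic_id, toxicity) pairs. Both the input
    layout and this function name are assumptions made for illustration.
    """
    # Group the toxicity scores by the topic each post was assigned to.
    grouped = defaultdict(list)
    for topic_id, toxicity in records:
        grouped[topic_id].append(toxicity)

    # Compute the four statistics named in the Figure 1 caption.
    summary = {}
    for topic_id, scores in grouped.items():
        total = sum(scores)
        summary[topic_id] = {
            "min": min(scores),
            "max": max(scores),
            "average": total / len(scores),
            "total": total,
        }
    return summary


# Hypothetical (topic_id, toxicity) pairs, not data from the paper.
sample_records = [(0, 0.25), (0, 0.75), (1, 0.5)]
topic_summary = summarize_toxicity_by_topic(sample_records)
```

Running the same aggregation separately on the before and after portions of a dataset would allow the two periods to be compared topic by topic.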