Planned Event Forecasting using Future Mentions and Related Entity Extraction in News Articles

Neelesh Kumar Shukla, Pranay Sanghvi

TL;DR

The paper tackles the forecasting of planned civil unrest events by analyzing news articles for mentions of future events. It presents a pipeline that combines seed-word-based filtering with Word2Vec, LDA topic modeling, and Named Entity Recognition to extract event attributes, augmented by a novel Related Entity Extraction step that links involved entities via relation patterns and window-based proximity. The approach identifies relevant documents with a precision of 85% and extracts related entities with an accuracy of 87%, demonstrating the feasibility of predicting planned protests from open-source news in a geographically independent manner. The work enables proactive administrative planning and suggests future enhancements such as event purpose summarization and sentiment-based intensity assessment to gauge disruption levels.
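To make the window-based proximity idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the article is already tokenized and that an NER step has produced entity spans. The seed-word list, the window size, and the function name `related_entities` are all hypothetical choices made for illustration.

```python
from typing import List, Set, Tuple

# Hypothetical seed words; the paper's actual seed list is not reproduced here.
SEED_WORDS: Set[str] = {"protest", "rally", "march", "strike"}

# An entity span: (start_token, end_token_exclusive, text, label).
Entity = Tuple[int, int, str, str]


def related_entities(
    tokens: List[str],
    entities: List[Entity],
    window: int = 5,
    seeds: Set[str] = SEED_WORDS,
) -> List[Tuple[str, str]]:
    """Keep entities lying within `window` tokens of any seed word."""
    # Positions of tokens that match a seed word (case-insensitive,
    # ignoring trailing punctuation).
    triggers = [
        i for i, tok in enumerate(tokens)
        if tok.lower().strip(".,;:!?") in seeds
    ]
    related = []
    for start, end, text, label in entities:
        for pos in triggers:
            if pos < start:
                dist = start - pos        # trigger is left of the entity
            elif pos >= end:
                dist = pos - (end - 1)    # trigger is right of the entity
            else:
                dist = 0                  # trigger is inside the span
            if dist <= window:
                related.append((text, label))
                break                     # one nearby trigger is enough
    return related


tokens = ("The Farmers Union will hold a protest in Delhi on Monday , "
          "said a spokesperson in Mumbai").split()
entities = [
    (1, 3, "Farmers Union", "ORG"),
    (8, 9, "Delhi", "LOC"),
    (10, 11, "Monday", "DATE"),
    (16, 17, "Mumbai", "LOC"),
]
print(related_entities(tokens, entities))
```

In this toy sentence, "Mumbai" is mentioned but sits ten tokens from the seed word "protest", so it is dropped, while the organization, location, and date near the seed word are kept. A fuller system would combine this distance test with the relation patterns the summary mentions.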

Abstract

In democracies like India, people are free to express their views and demands. Sometimes this leads to civil unrest such as protests, rallies, and marches. These events may be disruptive and are often held without prior permission from the competent authority. Forecasting them helps administrative officials take necessary action. Protests are usually announced well in advance to encourage large participation, so planned events can be forecast by analyzing such announcements in news articles. In this paper, we develop such a system: it forecasts social unrest events using topic modeling and word2vec to filter relevant news articles, and Named Entity Recognition (NER) methods to identify entities such as people, organizations, locations, and dates. Time normalization is applied to convert future date mentions into a standard format. The model is geographically independent and generalized, and it identifies key features for filtering civil unrest events. An article may mention many entities, but only a few may actually be involved in the event. This paper calls such entities Related Entities and proposes a method to extract them, referred to as Related Entity Extraction.
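The time normalization step can be illustrated with a small sketch. This is an assumption-laden example rather than the paper's method: it resolves a few relative date mentions against the article's publication date using only the Python standard library. The function name `normalize_future_mention` and the rule that a bare or "next" weekday means its next occurrence after the publication date are hypothetical simplifications.

```python
from datetime import date, timedelta
from typing import Optional

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def normalize_future_mention(mention: str, published: date) -> Optional[str]:
    """Map a relative date mention to an ISO 8601 date string, or None."""
    text = mention.lower().strip()
    if text == "today":
        return published.isoformat()
    if text == "tomorrow":
        return (published + timedelta(days=1)).isoformat()
    if text == "day after tomorrow":
        return (published + timedelta(days=2)).isoformat()

    words = text.split()
    if words and words[-1] in WEEKDAYS:
        target = WEEKDAYS.index(words[-1])
        days_ahead = (target - published.weekday()) % 7
        if days_ahead == 0:
            # Assumption: the same weekday refers to next week, not today.
            days_ahead = 7
        return (published + timedelta(days=days_ahead)).isoformat()

    return None  # mention not covered by this sketch


published = date(2024, 1, 3)  # a Wednesday
for mention in ["tomorrow", "on Friday", "next Monday", "sometime soon"]:
    print(mention, "->", normalize_future_mention(mention, published))
```

Anchoring on the publication date matters because news text rarely states the year or full date of an announced event. A production system would also need to handle explicit dates, date ranges, and mentions that refer to the past.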

Paper Structure

This paper contains 11 sections, 2 figures.

Figures (2)

  • Figure 1: Overall flow of our system
  • Figure 2: (caption unavailable)