Automatic ESG Assessment of Companies by Mining and Evaluating Media Coverage Data: NLP Approach and Tool

Jannik Fischbach, Max Adam, Victor Dzhagatspanyan, Daniel Mendez, Julian Frattini, Oleksandr Kosenkov, Parisa Elahidoost

TL;DR

The paper tackles automatic ESG assessment from non-corporate media coverage, addressing biases in company-authored sustainability reports by leveraging real-time news headlines. It introduces ESG-Miner, an end-to-end NLP pipeline that detects company mentions in headlines, filters ESG relevance, assigns headlines to environmental, social, or governance domains, analyzes sentiment, and computes an ESG score. A 432,411-headline corpus is published, and the authors demonstrate the pipeline on 3,000 unseen headlines, achieving a macro-F1 of 0.97 for company detection and strong environmental classification performance, while noting FP/FN propagation as a current bottleneck. The work provides a valuable, open-resource framework for researchers and practitioners to monitor public ESG perception, with potential to support real-time investor and consumer decision-making, albeit with improvements needed in FP/FN control and scoring nuance.

Abstract

Context: Sustainable corporate behavior is increasingly valued by society and impacts corporate reputation and customer trust. Hence, companies regularly publish sustainability reports to shed light on their impact on environmental, social, and governance (ESG) factors. Problem: Sustainability reports are written by companies themselves and are therefore considered a company-controlled source. In contrast, studies reveal that non-corporate channels (e.g., media coverage) represent the main driver for ESG transparency. However, analysing media coverage regarding ESG factors is challenging since (1) the number of published news articles grows daily, (2) media coverage data does not necessarily deal with an ESG-relevant topic, meaning that it must be carefully filtered, and (3) the majority of media coverage data is unstructured. Research Goal: We aim to automatically extract ESG-relevant information from textual media reactions to calculate an ESG score for a given company. Our goal is to reduce the cost of ESG data collection and make ESG information available to the general public. Contribution: Our contributions are three-fold: First, we publish a corpus of 432,411 news headlines annotated as environmental-, social-, or governance-related, or as ESG-irrelevant. Second, we present our tool-supported approach, called ESG-Miner, which automatically analyzes and evaluates headlines on corporate ESG performance. Third, we demonstrate the feasibility of our approach in an experiment in which we apply ESG-Miner to 3,000 manually labeled headlines. Our approach processes 96.7% of the headlines correctly and performs particularly well in detecting environmental-related headlines along with their correct sentiment. We encourage fellow researchers and practitioners to use the ESG-Miner at https://www.esg-miner.com.
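
To make the pipeline stages concrete (company detection, ESG-relevance filtering, domain assignment, sentiment analysis, score computation), the following is a minimal sketch in Python. It is not the ESG-Miner implementation: the keyword lists, the function names, and the scoring rule (mean sentiment per domain) are all hypothetical placeholders standing in for the trained NLP components and the scoring scheme of the actual tool, which are not specified in this summary.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Optional

# Hypothetical keyword lists: toy stand-ins for trained classifiers.
DOMAIN_KEYWORDS = {
    "environmental": {"emissions", "pollution", "climate", "recycling"},
    "social": {"workers", "strike", "diversity", "safety"},
    "governance": {"bribery", "board", "fraud", "audit"},
}
POSITIVE = {"cuts", "improves", "wins", "praised"}
NEGATIVE = {"fined", "scandal", "accused", "spill"}


def tokenize(headline: str) -> set:
    """Lowercase the headline and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", headline.lower()))


def detect_company(headline: str, companies: Iterable[str]) -> Optional[str]:
    """Stage 1: return the first known company named in the headline."""
    lowered = headline.lower()
    for name in companies:
        if name.lower() in lowered:
            return name
    return None


def classify_domain(headline: str) -> Optional[str]:
    """Stages 2 and 3: return the ESG domain, or None if ESG-irrelevant."""
    words = tokenize(headline)
    for domain, keywords in DOMAIN_KEYWORDS.items():
        if words & keywords:
            return domain
    return None


def sentiment(headline: str) -> int:
    """Stage 4: return +1 (positive), -1 (negative), or 0 (neutral)."""
    words = tokenize(headline)
    balance = len(words & POSITIVE) - len(words & NEGATIVE)
    return (balance > 0) - (balance < 0)


def esg_scores(headlines: Iterable[str],
               companies: Iterable[str]) -> Dict[str, Dict[str, float]]:
    """Stage 5 (assumed rule): mean sentiment per company and domain."""
    companies = list(companies)
    collected: Dict[str, Dict[str, List[int]]] = defaultdict(
        lambda: defaultdict(list))
    for headline in headlines:
        company = detect_company(headline, companies)
        if company is None:
            continue  # no known company mentioned
        domain = classify_domain(headline)
        if domain is None:
            continue  # ESG-irrelevant headline is filtered out
        collected[company][domain].append(sentiment(headline))
    return {
        company: {d: sum(v) / len(v) for d, v in domains.items()}
        for company, domains in collected.items()
    }
```

Because each stage consumes the output of the previous one, a false positive or false negative early on (for example, a missed company mention) carries through to the final score, which is the error-propagation bottleneck noted in the TL;DR.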

Paper Structure (11 sections, 2 figures, 1 table)