ClimRetrieve: A Benchmarking Dataset for Information Retrieval from Corporate Climate Disclosures

Tobias Schimanski; Jingwei Ni; Roberto Spacey; Nicola Ranger; Markus Leippold

TL;DR

The paper shows that incorporating expert knowledge into embedding-based retrieval works, but it also outlines the critical limitations of embeddings in knowledge-intensive downstream domains like climate change communication.

Abstract

To handle the vast amounts of qualitative data produced in corporate climate communication, stakeholders increasingly rely on Retrieval-Augmented Generation (RAG) systems. However, a significant gap remains in evaluating domain-specific information retrieval, the basis for answer generation. To address this challenge, this work simulates the typical tasks of a sustainability analyst by examining 30 sustainability reports with 16 detailed climate-related questions. The result is a dataset of over 8.5K unique question-source-answer pairs labeled with different levels of relevance. Furthermore, we develop a use case with the dataset to investigate the integration of expert knowledge into information retrieval with embeddings. Although we show that incorporating expert knowledge works, we also outline the critical limitations of embeddings in knowledge-intensive downstream domains like climate change communication.
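As a rough illustration of what evaluating embedding-based retrieval against relevance-labeled pairs can look like, the following minimal sketch ranks paragraphs by cosine similarity to a question and computes recall@k. It is not the authors' evaluation code: the toy vectors stand in for real embeddings, and the function names, the integer label scale, and the `min_label` threshold are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_paragraphs(query_vec, paragraph_vecs):
    """Return paragraph indices sorted by descending similarity to the query."""
    scores = [(i, cosine(query_vec, v)) for i, v in enumerate(paragraph_vecs)]
    return [i for i, _ in sorted(scores, key=lambda s: s[1], reverse=True)]


def recall_at_k(ranking, relevance, k, min_label=1):
    """Share of relevant paragraphs (label >= min_label) found in the top k."""
    relevant = {i for i, label in enumerate(relevance) if label >= min_label}
    if not relevant:
        return 0.0
    hits = sum(1 for i in ranking[:k] if i in relevant)
    return hits / len(relevant)


# Toy stand-ins for embeddings (assumption: real vectors would come from an
# embedding model and have far more dimensions).
query_vec = [1.0, 0.0, 0.0]
paragraph_vecs = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
    [0.0, 0.0, 1.0],
]
# Hypothetical relevance labels per paragraph (0 = not relevant).
relevance = [3, 0, 1, 0]

ranking = rank_paragraphs(query_vec, paragraph_vecs)
print(ranking)
print(recall_at_k(ranking, relevance, k=1))
print(recall_at_k(ranking, relevance, k=2))
```

Raising `min_label` restricts the evaluation to more strongly relevant paragraphs, which is one simple way graded relevance labels could be used.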

Paper Structure (17 sections, 2 equations, 17 figures, 8 tables)


Introduction
Background
Data
Investigating Embedding Search
Conclusion
Complexity of Knowledge-Intensive Questions
Definitions and Concepts
Questions
Expert Annotators and Expert Group
Relevance Labels of the Dataset
Relevant Question-Source-Answer Pairs
Report-Level Dataset
Information Retrieval Explanation
Details on the Experimental Setup
Comparing Retrieval with Questions, Definitions and Concepts vs. Explanations
...and 2 more sections

Figures (17)

Figure 1: Overview of the core columns of ClimRetrieve.
Figure 2: Labeling process to obtain the ClimRetrieve dataset.
Figure 3: Results for the different experimental setups (Embeddings = "text-embedding-3-large").
Figure F.1: Distribution of relevance labels over the relevant question-source-answer dataset.
Figure G.2: Similarities of the most similar relevant text part from the question-source-answer pairs with the paragraphs from the report-level dataset.
...and 12 more figures
