AI for Climate Finance: Agentic Retrieval and Multi-Step Reasoning for Early Warning System Investments

Saeid Ario Vaghefi, Aymane Hachcham, Veronica Grasso, Jiska Manicus, Nakiete Msemo, Chiara Colesanti Senni, Markus Leippold

TL;DR

This work tackles the challenge of tracking climate-finance flows for Early Warning Systems (EWS) amid heterogeneous MDB reporting. It introduces the EW4All Financial Tracking AI-Assistant, an agent-based retrieval-augmented generation pipeline that fuses multi-modal extraction, grounding, and hierarchical reasoning to classify investments across CREWS Fund pillars and allocate budgets with grounded evidence. Across 25 CREWS Fund documents, the agent-based approach achieves 0.87 accuracy, 0.89 precision, and 0.83 recall, outperforming zero-shot, few-shot, and fine-tuned baselines, and enabling transparent, end-to-end budget reporting. The study also contrasts glass-box (transparent) versus black-box systems and provides a public benchmark dataset and prompts to foster future AI-driven climate-finance transparency, with implications for policy-making and resource allocation in climate resilience investments.

Abstract

Tracking financial investments in climate adaptation is a complex and expertise-intensive task, particularly for Early Warning Systems (EWS), which lack standardized financial reporting across multilateral development banks (MDBs) and funds. To address this challenge, we introduce an LLM-based agentic AI system that integrates contextual retrieval, fine-tuning, and multi-step reasoning to extract relevant financial data, classify investments, and ensure compliance with funding guidelines. Our study focuses on a real-world application: tracking EWS investments in the Climate Risk and Early Warning Systems (CREWS) Fund. We analyze 25 MDB project documents and evaluate multiple AI-driven classification methods, including zero-shot and few-shot learning, fine-tuned transformer-based classifiers, chain-of-thought (CoT) prompting, and an agent-based retrieval-augmented generation (RAG) approach. Our results show that the agent-based RAG approach significantly outperforms other methods, achieving 87% accuracy, 89% precision, and 83% recall. Additionally, we contribute a benchmark dataset and expert-annotated corpus, providing a valuable resource for future research in AI-driven financial tracking and climate finance transparency.
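For readers less familiar with the reported metrics, the following minimal sketch shows how accuracy, precision, and recall are conventionally computed from binary decisions. It is an illustration only: the averaging scheme used in the paper is not stated in this summary, so the binary, per-label setup is an assumption, and the function name `binary_metrics` and the toy labels are hypothetical rather than taken from the paper or its dataset.

```python
from typing import Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict:
    """Return accuracy, precision, and recall for binary labels (1 = positive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = len(pairs)
    return {
        # Share of all decisions that are correct.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of predicted positives that are truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of true positives that were found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels invented for illustration only (not data from the paper).
toy_true = [1, 1, 1, 0, 0, 1, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 1]
toy_scores = binary_metrics(toy_true, toy_pred)
```

In a multi-class setting such as pillar classification, these quantities would typically be computed per class and then averaged; which averaging the authors used would need to be checked against the full paper.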

Paper Structure

This paper contains 53 sections, 25 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: AI-driven financial tracking pipeline for EWS investments. The steps are: (1) PDF conversion, (2) context retrieval, (3) information storage and collection, (4) iterative sub-query and instruction creation, (5) downstream task execution (pillar classification and budget allocation).
  • Figure 2: Left: distribution of total-amount accuracy for the 500-document MDB set. Right: share of the macro-averaged F1 obtained by each system on the amount-per-pillar task.
  • Figure 3: Per-document F1 for evidence extraction. Grey bands highlight projects in which budget figures are dispersed across narrative sections rather than formatted tables.
  • Figure 4: Schematic overview of the final analysis report produced by the agent-based pipeline. The workflow comprises three main stages: (i) ingestion of the project document as a PDF; (ii) a modular agent-based processing pipeline that parses text, identifies total funding figures, and classifies expenditures into four predefined EWS “pillars”; and (iii) compilation of an analysis report summarizing the total allocated budget, per-pillar allocation amounts and percentages, and a graphical distribution of funds across pillars.
  • Figure 5: AI-driven financial tracking pipeline for EWS investments. The steps are: (1) PDF conversion, (2) context retrieval, (3) information storage and collection, (4) iterative sub-query and instruction creation, (5) downstream task execution (pillar classification and budget allocation).
  • ...and 1 more figures
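
The report-compilation stage described in the Figure 4 caption (total allocated budget, per-pillar amounts, and percentages) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `compile_report`, the placeholder pillar labels "Pillar 1" to "Pillar 4", and the example amounts are all hypothetical, and the sketch assumes per-pillar amounts have already been extracted by earlier pipeline stages.

```python
from typing import Dict


def compile_report(allocations: Dict[str, float]) -> dict:
    """Summarize per-pillar amounts as a total plus percentage shares."""
    if any(amount < 0 for amount in allocations.values()):
        raise ValueError("allocation amounts must be non-negative")

    total = sum(allocations.values())
    # Percentage share of each pillar; all zeros if nothing was allocated.
    percentages = {
        pillar: (100.0 * amount / total if total else 0.0)
        for pillar, amount in allocations.items()
    }
    return {
        "total": total,
        "amounts": dict(allocations),
        "percentages": percentages,
    }


# Hypothetical amounts (arbitrary currency units), invented for illustration.
example_allocations = {
    "Pillar 1": 2.0,
    "Pillar 2": 5.0,
    "Pillar 3": 1.0,
    "Pillar 4": 2.0,
}
example_report = compile_report(example_allocations)
```

Keeping amounts and percentages side by side lets a reader check that the shares sum to 100 and trace each share back to its extracted amount.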