Agentic AI for Improving Precision in Identifying Contributions to Sustainable Development Goals
William A. Ingram, Bipasha Banerjee, Edward A. Fox
TL;DR
The paper tackles imprecise SDG attribution in scholarly retrieval, which arises when retrieval relies on incidental keyword matches. It introduces an evaluation agent built from small autoregressive LLMs that assesses abstracts for substantive contributions to SDG targets, applied to a large Scopus-derived dataset (20,000 abstracts per SDG). The study shows that Phi-3.5-mini, Mistral-7B, and Llama-3.2-3B produce divergent yet informative classifications, and it proposes a multi-agent ensemble to reconcile those differences and improve precision. The approach offers a scalable, context-aware method for refining SDG-related research metrics and institutional reporting, with potentially broad impact on benchmarking across universities.
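One simple way to picture how an ensemble might reconcile divergent model verdicts is a majority vote. The sketch below is illustrative only: the summary does not state which reconciliation rule the authors propose, and the label strings, the `reconcile` function, and the "Uncertain" tie outcome are assumptions introduced here.

```python
from collections import Counter


def reconcile(votes: dict[str, str]) -> str:
    """Return the majority label across models, or "Uncertain" on a tie.

    `votes` maps a model name to its label for one abstract.
    Majority voting is an assumed rule, not the paper's stated method.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    ranked = Counter(votes.values()).most_common()
    # A tie for first place means the models did not reach a majority.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "Uncertain"
    return ranked[0][0]


# Hypothetical verdicts for a single abstract.
votes = {
    "Phi-3.5-mini": "Relevant",
    "Mistral-7B": "Relevant",
    "Llama-3.2-3B": "Non-Relevant",
}
decision = reconcile(votes)
```

With three models and two labels a tie cannot occur, but the tie branch keeps the function safe if a model is dropped or a third label is added.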
Abstract
As research institutions increasingly commit to supporting the United Nations' Sustainable Development Goals (SDGs), there is a pressing need to accurately assess their research output against these goals. Current approaches, which rely primarily on keyword-based Boolean search queries, conflate incidental keyword matches with genuine contributions, reducing retrieval precision and complicating benchmarking efforts. This study investigates the use of autoregressive Large Language Models (LLMs) as evaluation agents that identify relevant contributions to SDG targets in scholarly publications. Using a dataset of academic abstracts retrieved via SDG-specific keyword queries, we demonstrate that small, locally hosted LLMs can distinguish semantically relevant contributions to SDG targets from documents retrieved because of incidental keyword matches, addressing the limitations of traditional methods. By leveraging the contextual understanding of LLMs, this approach provides a scalable framework for improving SDG-related research metrics and informing institutional reporting.
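The two-stage idea in the abstract, keyword retrieval followed by an LLM relevance check, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the prompt wording, the function names, the label strings, and the `llm` callable (a stand-in for any locally hosted model) are all hypothetical.

```python
import re
from typing import Callable


def keyword_match(abstract: str, terms: list[str]) -> bool:
    """Boolean OR query: True if any term occurs as a whole word or phrase."""
    text = abstract.lower()
    return any(re.search(rf"\b{re.escape(t.lower())}\b", text) for t in terms)


def build_prompt(abstract: str, sdg: str) -> str:
    """Assemble an illustrative evaluation prompt (wording is an assumption)."""
    return (
        f"Does the following abstract make a substantive contribution to {sdg}?\n"
        "Answer with 'Relevant' or 'Non-Relevant' first, then one sentence of reasoning.\n\n"
        f"Abstract: {abstract}"
    )


def evaluate(abstract: str, sdg: str, terms: list[str],
             llm: Callable[[str], str]) -> str:
    """Stage 1: keyword retrieval. Stage 2: LLM judgment on retrieved abstracts."""
    if not keyword_match(abstract, terms):
        return "Not retrieved"
    reply = llm(build_prompt(abstract, sdg)).strip().lower()
    return "Relevant" if reply.startswith("relevant") else "Non-Relevant"


# An incidental match: "poverty" appears, but the work is about video codecs.
abstract = ("We propose a video codec for low-bitrate links. "
            "Bandwidth poverty in congested networks motivates the design.")


def stub_llm(prompt: str) -> str:
    # Stand-in for a real model call; always returns a fixed verdict.
    return "Non-Relevant. The abstract concerns compression, not poverty reduction."


retrieved = keyword_match(abstract, ["poverty"])
verdict = evaluate(abstract, "SDG 1: No Poverty", ["poverty"], stub_llm)
```

Here the Boolean query retrieves the abstract because the word "poverty" appears, while the second stage is where a context-aware model can reject the incidental match. Passing the model as a callable keeps the sketch independent of any particular inference library.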
