Entity Linking using LLMs for Automated Product Carbon Footprint Estimation
Steffen Castle, Julian Moreno Schneider, Leonhard Hennig, Georg Rehm
TL;DR
The paper automates product carbon footprint estimation by linking component-level Bill of Materials (BOM) data to Life Cycle Assessment (LCA) database entries. It introduces a three-module pipeline that uses a pretrained LLM to generate a detailed description of each component in context, augments that description with relevant datasheet content, and runs a semantic similarity search over a FAISS-based vector store to map the component to LCA entries (primarily ecoinvent). Results on a limited dataset show that combining LLMs with datasheet context can match or exceed non-expert human performance for top recommendations, while semantic similarity alone underperforms, which highlights the value of LLM-generated context in fine-grained entity linking. The approach promises scalable carbon footprint estimation with less manual effort, with practical relevance for industry adoption and regulatory reporting, and points to future work on broader evaluation and web-context integration.
Abstract
Growing concerns about climate change and sustainability are driving manufacturers to take significant steps toward reducing their carbon footprints. For these manufacturers, a first step towards this goal is to identify the environmental impact of the individual components of their products. We propose a system leveraging large language models (LLMs) to automatically map components from manufacturer Bills of Materials (BOMs) to Life Cycle Assessment (LCA) database entries by using LLMs to expand on available component information. Our approach reduces the need for manual data processing, paving the way for more accessible sustainability practices.
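As a rough illustration of the linking step summarized above, the following Python sketch ranks candidate LCA entries against an expanded component description by cosine similarity. It is not the authors' implementation: the bag-of-words `embed` function stands in for a learned embedding model, the linear scan stands in for a FAISS index, `expand_component` merely concatenates text where the paper uses an LLM, and the entry names in `LCA_ENTRIES` are invented for illustration rather than taken from ecoinvent.

```python
import math
from collections import Counter

# Hypothetical LCA entry names, invented for illustration (not real ecoinvent records).
LCA_ENTRIES = [
    "capacitor, electrolyte type, production",
    "resistor, surface-mounted, production",
    "printed wiring board, surface mounted, production",
]


def embed(text):
    """Toy bag-of-words vector; an assumption standing in for a learned embedding."""
    tokens = [t.strip(",.()").lower() for t in text.split()]
    return Counter(t for t in tokens if t)


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand_component(bom_line, datasheet_snippet=""):
    """Stand-in for the LLM expansion step: here it only concatenates the inputs."""
    return f"{bom_line} {datasheet_snippet}".strip()


def link(bom_line, datasheet_snippet="", k=2):
    """Return the k best (score, entry) pairs for one BOM line."""
    query = embed(expand_component(bom_line, datasheet_snippet))
    scored = [(cosine(query, embed(entry)), entry) for entry in LCA_ENTRIES]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # A terse BOM line alone shares no vocabulary with the LCA entries.
    print(link("CAP 10uF 25V"))
    # Adding descriptive context gives the similarity search something to match.
    print(link("CAP 10uF 25V", "aluminium electrolyte capacitor"))
```

In this toy setting the bare BOM line scores zero against every entry, while the version with added context ranks the capacitor entry first. That mirrors, in a much simplified form, why the pipeline enriches component information before running the similarity search.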
