SpiderGen: Towards Procedure Generation For Carbon Life Cycle Assessments with Generative AI
Anupama Sitaraman, Bharathan Balaji, Yuvraj Agarwal
TL;DR
SpiderGen tackles the automation of Life Cycle Assessment (LCA) procedure generation by generating Product Category Rule Process Flow Graphs (PCR PFGs) for product categories using zero-shot LLM reasoning, SBERT embeddings, and clustering to create a DAG of upstream, core, and downstream processes. It formalizes $G_{pc}$ as a DAG constrained by lifecycle phases and employs a four-step pipeline to produce generalizable, phase-ordered process graphs. Evaluated against 65 ground-truth PCRs from EPD International, SpiderGen achieves an average F1-score of 65%, outperforming one-shot baselines, and demonstrates favorable cost and time benefits (under \$1 per PFG in under 10 minutes, versus more than \$25,000 and 21 person-days). The work further analyzes model choices, sample-product effects, and complexity-driven performance, highlighting significant practical potential for rapid, scalable, and transparent LCA support, while outlining open challenges in boundary definition and real-world deployment.
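The phase-constrained DAG mentioned above can be illustrated with a small validity check. This is a minimal sketch under stated assumptions, not SpiderGen's implementation: the function name `is_valid_pfg`, the dictionary-based graph encoding, and the rule that an edge may never point from a later phase to an earlier one are all hypothetical choices made for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Assumed ordering of lifecycle phases (hypothetical encoding).
PHASE_ORDER = {"upstream": 0, "core": 1, "downstream": 2}


def is_valid_pfg(phases, edges):
    """Check a candidate process flow graph.

    phases: dict mapping process name -> lifecycle phase
    edges:  iterable of (source_process, target_process) pairs
    """
    edges = list(edges)

    # Every edge must join known processes and must not go back to an earlier phase.
    for src, dst in edges:
        if src not in phases or dst not in phases:
            return False
        if PHASE_ORDER[phases[src]] > PHASE_ORDER[phases[dst]]:
            return False

    # The graph must be acyclic: a topological order must exist.
    sorter = TopologicalSorter({node: set() for node in phases})
    for src, dst in edges:
        sorter.add(dst, src)  # src is a predecessor of dst
    try:
        sorter.prepare()
    except CycleError:
        return False
    return True
```

For example, a chain from an upstream process through a core process to a downstream process passes the check, while an edge from a downstream process back to a core process does not.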
Abstract
Investigating the effects of climate change and global warming caused by greenhouse gas (GHG) emissions has been a key concern worldwide. These emissions stem largely from the production, use, and disposal of consumer products. Thus, it is important to build tools to estimate the environmental impact of consumer goods, an essential part of which is conducting Life Cycle Assessments (LCAs). LCAs specify and account for the appropriate processes involved in the production, use, and disposal of products. We present SpiderGen, an LLM-based workflow that integrates the taxonomy and methodology of traditional LCA with the reasoning capabilities and world knowledge of LLMs to generate graphical representations of the key procedural information used for LCA, known as Product Category Rules Process Flow Graphs (PCR PFGs). We additionally evaluate the output of SpiderGen by comparing it with 65 real-world LCA documents. We find that SpiderGen provides accurate LCA process information that is either fully correct or has minor errors, achieving an F1-score of 65% across 10 sample data points, as compared to 53% using a one-shot prompting method. We observe that the remaining errors occur primarily due to differences in the level of detail between LCA documents, as well as differences in the "scope" of which auxiliary processes must also be included. We also demonstrate that SpiderGen performs better than several baseline techniques, such as chain-of-thought prompting and one-shot prompting. Finally, we highlight SpiderGen's potential to reduce the human effort and cost of estimating carbon impact, as it is able to produce LCA process information for less than \$1 USD in under 10 minutes, as compared to the status quo LCA, which can cost over \$25,000 USD and take up to 21 person-days.
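To make the reported metric concrete, the following sketch computes an F1-score between a generated set of processes and a reference set. It assumes processes are matched by exact name after lowercasing and trimming whitespace; the abstract does not state the matching criterion used in the evaluation, so this rule and the function name `f1_score` are illustrative assumptions only.

```python
def f1_score(predicted, reference):
    """F1 between two collections of process names (exact match, assumed)."""
    # Normalize names so trivial casing/spacing differences do not count as errors.
    pred = {p.strip().lower() for p in predicted}
    ref = {r.strip().lower() for r in reference}

    true_positives = len(pred & ref)
    if true_positives == 0:
        return 0.0

    precision = true_positives / len(pred)  # share of generated processes that are correct
    recall = true_positives / len(ref)      # share of reference processes that were recovered
    return 2 * precision * recall / (precision + recall)
```

With three generated processes, four reference processes, and two in common, precision is 2/3, recall is 1/2, and the F1-score is 4/7.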
