X-MethaneWet: A Cross-scale Global Wetland Methane Emission Benchmark Dataset for Advancing Science Discovery with AI
Yiming Sun, Shuo Chen, Shengyu Chen, Chonghao Qiu, Licheng Liu, Youmi Oh, Sparkle L. Malone, Gavin McNicol, Qianlai Zhuang, Chris Smith, Yiqun Xie, Xiaowei Jia
TL;DR
The paper addresses the need for a standardized, high-temporal-resolution benchmark for global wetland methane emissions by constructing X-MethaneWet, which combines TEM-MDM physics-based simulations with FLUXNET-CH$_4$ observations at daily scale. It establishes an evaluation framework and benchmarks several sequential models (LSTM, EA-LSTM, TCN, Transformer variants, Pyraformer) on temporal and spatial extrapolation tasks, and it explores transfer learning from simulations to real observations. The results show that pretraining on simulated data followed by fine-tuning on observations generally improves generalization, particularly in data-sparse scenarios, highlighting the potential of integrating physics-based models with AI for methane flux prediction. Overall, the work provides a dataset resource and a methodological blueprint for AI-driven climate modeling and methane mitigation planning through knowledge-guided learning.
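The two extrapolation settings named above can be sketched with a toy example. This is only an illustration under stated assumptions: the record layout `(site_id, year, flux)`, the function names `temporal_split` and `spatial_split`, and the numeric values are hypothetical and are not taken from the dataset or the paper's code.

```python
from typing import List, Set, Tuple

# Hypothetical record layout: (site_id, year, daily CH4 flux value).
Record = Tuple[str, int, float]


def temporal_split(records: List[Record], last_train_year: int):
    """Train on years up to last_train_year, test on later years at the same sites."""
    train = [r for r in records if r[1] <= last_train_year]
    test = [r for r in records if r[1] > last_train_year]
    return train, test


def spatial_split(records: List[Record], held_out_sites: Set[str]):
    """Train on some sites, test on sites that were never seen during training."""
    train = [r for r in records if r[0] not in held_out_sites]
    test = [r for r in records if r[0] in held_out_sites]
    return train, test


# Toy records with made-up values, for illustration only.
records = [
    ("site_A", 2015, 0.12), ("site_A", 2016, 0.15),
    ("site_B", 2015, 0.30), ("site_B", 2016, 0.28),
]
t_train, t_test = temporal_split(records, last_train_year=2015)
s_train, s_test = spatial_split(records, held_out_sites={"site_B"})
```

In the temporal setting every site appears in both partitions but the years do not overlap. In the spatial setting the held-out sites contribute no training data at all, which is the harder test of generalization.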
Abstract
Methane (CH$_4$) is the second most important anthropogenic greenhouse gas after carbon dioxide and plays a crucial role in climate change due to its high global warming potential. Accurately modeling CH$_4$ fluxes across the globe and at fine temporal scales is essential for understanding their spatial and temporal variability and for developing effective mitigation strategies. In this work, we introduce the first-of-its-kind cross-scale global wetland methane benchmark dataset (X-MethaneWet), which synthesizes physics-based model simulation data from TEM-MDM and real-world observation data from FLUXNET-CH$_4$. The dataset offers opportunities for improving global wetland CH$_4$ modeling and scientific discovery with new AI algorithms. To establish AI model baselines for methane flux prediction, we evaluate the performance of various sequential deep learning models on X-MethaneWet. Furthermore, we explore four transfer learning techniques that leverage simulated data from TEM-MDM to improve the generalization of deep learning models on real-world FLUXNET-CH$_4$ observations. Our extensive experiments demonstrate the effectiveness of these approaches, highlighting their potential for advancing methane emission modeling and contributing to the development of more accurate and scalable AI-driven climate models.
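As a rough illustration of the simulation-to-observation transfer idea, the sketch below pretrains a one-variable linear model on plentiful synthetic "simulated" data and then fine-tunes it on a handful of synthetic "observations". It is a generic pretrain-then-fine-tune example, not one of the four techniques evaluated in the paper. The linear model, the helper names `fit_linear` and `mse`, and all data values are assumptions made for illustration.

```python
import random


def fit_linear(xs, ys, w=0.0, b=0.0, lr=0.5, epochs=200):
    """Gradient descent on mean squared error for y ~ w*x + b, starting from (w, b)."""
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def mse(xs, ys, w, b):
    """Mean squared error of the linear model on the given data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


rng = random.Random(0)

# Synthetic stand-in for abundant simulation output (assumed relation y = 2.0x + 0.5).
sim_x = [rng.uniform(0, 1) for _ in range(500)]
sim_y = [2.0 * x + 0.5 + rng.gauss(0, 0.05) for x in sim_x]

# Synthetic stand-in for sparse observations that follow a slightly shifted relation.
obs_x = [rng.uniform(0, 1) for _ in range(5)]
obs_y = [2.2 * x + 0.4 + rng.gauss(0, 0.05) for x in obs_x]

# Step 1: pretrain on the simulated data from a zero initialization.
w_pre, b_pre = fit_linear(sim_x, sim_y)

# Step 2: fine-tune on the observations, starting from the pretrained parameters,
# with a smaller learning rate and only a few epochs.
w_ft, b_ft = fit_linear(obs_x, obs_y, w=w_pre, b=b_pre, lr=0.1, epochs=20)

mse_before = mse(obs_x, obs_y, w_pre, b_pre)
mse_after = mse(obs_x, obs_y, w_ft, b_ft)
```

Starting the second fit from the pretrained parameters is what makes this transfer learning rather than two independent fits. The small learning rate and short schedule keep the fine-tuned model close to what was learned from simulation while still adapting it to the observed data.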
