Machine Learning Techniques for Multifactor Analysis of National Carbon Dioxide Emissions
Wenjia Xie, Jinhui Li, Kai Zong, Luis Seco
TL;DR
This study analyzes national CO$_2$ emissions with a two-model machine-learning framework that combines Support Vector Regression (SVR) and Principal Component Regression (PCR) on a global panel of 62 countries covering 1992–2019. Preprocessing includes standardization and stationarity checks via the Augmented Dickey-Fuller test, and Permutation Importance identifies fossil-fuel consumption, GDP, and population as the top drivers. SVR achieves high predictive accuracy ($R^2 \approx 0.9895$, MSE $\approx 0.015$), while PCR provides robust, interpretable estimates ($\overline{R^2} \approx 0.9013$) by addressing multicollinearity through PCA. The results offer policymakers and market participants a practical framework for forecasting emissions, comparing national performance against global trends, and targeting interventions toward energy transition and sustainable development.
Abstract
This paper applies Support Vector Machine (SVM) regression and Principal Component Regression (PCR) to carbon dioxide emissions in a global dataset of 62 countries, examining how emissions depend on country-specific parameters. The objective is to understand the factors that contribute to carbon dioxide emissions and to identify the most predictive ones. The analysis provides country-specific emission estimates, highlighting diverse national trajectories and pinpointing areas for targeted interventions in climate change mitigation, sustainable development, and the growing carbon credit markets and green finance sector. The study aims to support policymaking with accurate representations of carbon dioxide emissions, offering detailed information for formulating effective climate strategies and for informing initiatives related to carbon trading and environmentally sustainable investments.
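
As a rough illustration of the Principal Component Regression step named above, the following minimal NumPy sketch standardizes a set of strongly correlated predictors, projects them onto their leading principal components, and fits ordinary least squares on the component scores. It is not the authors' code: the function names (`fit_pcr`, `predict_pcr`, `r_squared`), the synthetic data, and the choice of a single component are assumptions made purely for illustration.

```python
import numpy as np


def fit_pcr(X, y, n_components):
    """Sketch of PCR: standardize, project onto the top principal components, then OLS."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    Z = (X - mean) / std  # standardized predictors
    # PCA via SVD of the standardized matrix; rows of Vt are principal directions.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_components].T  # shape (n_features, n_components)
    scores = Z @ V  # component scores, mutually uncorrelated
    # Ordinary least squares on the scores, with an intercept column.
    A = np.column_stack([np.ones(len(y)), scores])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    intercept, gamma = coef[0], coef[1:]
    # Map component coefficients back to the standardized predictors.
    beta_std = V @ gamma
    return {"mean": mean, "std": std, "intercept": intercept, "beta_std": beta_std}


def predict_pcr(model, X):
    """Apply the stored standardization and coefficients to new rows."""
    Z = (X - model["mean"]) / model["std"]
    return model["intercept"] + Z @ model["beta_std"]


def r_squared(y, y_hat):
    """Coefficient of determination."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic, hypothetical data: three predictors driven by one shared latent factor,
# so they are highly collinear (a stand-in for correlated national indicators).
rng = np.random.default_rng(0)
n = 200
latent = rng.normal(size=n)
X = np.column_stack([latent + 0.1 * rng.normal(size=n) for _ in range(3)])
y = 2.0 * latent + 0.1 * rng.normal(size=n)

model = fit_pcr(X, y, n_components=1)
r2 = r_squared(y, predict_pcr(model, X))
print(f"PCR with 1 component: R^2 = {r2:.3f}")
```

Because the regression runs on uncorrelated component scores rather than on the raw predictors, the fit stays stable even when the original variables are nearly collinear; keeping all components would reproduce the ordinary least-squares fit.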
