Sustainable Visions: Unsupervised Machine Learning Insights on Global Development Goals
Alberto García-Rodríguez, Matias Núñez, Miguel Robles Pérez, Tzipe Govezensky, Rafael A. Barrio, Carlos Gershenson, Kimmo K. Kaski, Julia Tagüeña
TL;DR
This study tackles the slow global progress toward the UN Sustainable Development Goals (SDGs) by applying a three-stage unsupervised learning pipeline (PCA for global structure, t-SNE for local structure, and DBSCAN for clustering) to SDG indicators from 2000–2022 across 107 countries. The analysis reveals strong inter-goal correlations (notably between Goals 12 and 13, alongside negative associations with many other goals) and region-specific SDG patterns that cluster countries geographically, which shows that a single uniform global path toward 2030 is insufficient. The results show a persistent gap to the ideal SDG state, accelerated distortions due to COVID-19, and Gaussian-like intra-cluster distance distributions, suggesting that regionally tailored, data-informed policy is necessary. Overall, the work provides a robust, data-driven framework for diagnosing interdependencies, mapping trajectories, and guiding cooperative, targeted strategies for sustainable progress.
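As a rough illustration of the three-stage pipeline named above, the following minimal sketch chains PCA, t-SNE, and DBSCAN using scikit-learn. It is not the authors' implementation: the use of scikit-learn, the function name cluster_countries, all hyperparameter values (n_components, perplexity, eps, min_samples), and the standardization step are assumptions made for illustration, and the input is synthetic random data standing in for a table of 107 countries by 17 goal scores.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler


def cluster_countries(scores, n_components=5, perplexity=10.0,
                      eps=3.0, min_samples=3, seed=0):
    """Hypothetical helper: PCA -> t-SNE -> DBSCAN on a score matrix.

    scores: array of shape (n_countries, n_indicators).
    All default hyperparameters are illustrative assumptions,
    not values taken from the paper.
    """
    # Standardize each indicator so no single scale dominates (assumed step).
    X = StandardScaler().fit_transform(scores)
    # Stage 1: PCA captures the dominant global variance directions.
    X_pca = PCA(n_components=n_components, random_state=seed).fit_transform(X)
    # Stage 2: t-SNE embeds the PCA output in 2D, preserving local neighborhoods.
    X_2d = TSNE(n_components=2, perplexity=perplexity, init="pca",
                random_state=seed).fit_transform(X_pca)
    # Stage 3: DBSCAN groups dense regions of the embedding; label -1 marks noise.
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X_2d)
    return X_2d, labels


# Synthetic stand-in data: three artificial groups of countries, 17 scores each.
rng = np.random.default_rng(0)
centers = rng.uniform(20, 80, size=(3, 17))
scores = np.vstack([
    center + rng.normal(0, 5, size=(n, 17))
    for center, n in zip(centers, (36, 36, 35))  # 107 rows in total
])

embedding, labels = cluster_countries(scores)
n_clusters = len(set(labels.tolist()) - {-1})
print("embedding shape:", embedding.shape)
print("clusters found (excluding noise):", n_clusters)
```

The number of clusters DBSCAN reports depends strongly on eps relative to the scale of the t-SNE embedding, so in practice these values would need tuning against the actual indicator data.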
Abstract
The 2030 Agenda for Sustainable Development of the United Nations outlines 17 goals for the countries of the world to address global challenges in their development. However, progress towards these goals has been slower than expected, and there is consequently a need to investigate the reasons for this delay. In this study, we use a novel data-driven methodology to analyze time-series data spanning more than 20 years (2000–2022) from 107 countries using unsupervised machine learning (ML) techniques. Our analysis reveals strong positive and negative correlations between certain Sustainable Development Goals (SDGs). Our findings show that progress toward the SDGs is heavily influenced by geographical, cultural, and socioeconomic factors, with no country on track to achieve all the goals by 2030. This highlights the need for a region-specific, systemic approach to sustainable development that acknowledges the complex interdependencies between the goals and the varying capacities of countries to reach them. To this end, our ML-based approach provides a robust framework for developing efficient, data-informed strategies that promote cooperative and targeted initiatives for sustainable progress.
