Spatio-temporal Multivariate Cluster Evolution Analysis for Detecting and Tracking Climate Impacts

Warren L. Davis, Max Carlson, Irina Tezaur, Diana Bull, Kara Peterson, Laura Swiler

Abstract

Recent years have seen growing concern about climate change and its impacts. While Earth System Models (ESMs) can be invaluable tools for studying the impacts of climate change, the complex coupling processes encoded in ESMs and the large amounts of data produced by these models, together with the high internal variability of the Earth system, can obscure important source-to-impact relationships. This paper presents a novel and efficient unsupervised data-driven approach for detecting statistically significant impacts and tracing spatio-temporal source-impact pathways in the climate through a unique combination of ideas from anomaly detection, clustering, and Natural Language Processing (NLP). Using the 1991 eruption of Mount Pinatubo in the Philippines as an exemplar, we demonstrate that the proposed approach is capable of detecting known post-eruption impacts/events. We additionally describe a methodology for extracting meaningful sequences of post-eruption impacts/events by using NLP to efficiently mine frequent multivariate cluster evolutions, which can be used to confirm or discover the chain of physical processes between a climate source and its impact(s).

Paper Structure

This paper contains 20 sections, 2 equations, 13 figures, 5 tables.

Figures (13)

  • Figure 1: Variables in each analysis partition $p_i$ (left panel) are converted to a reduced rank/dimension signature $s_i$ (right panel).
  • Figure 2: Comparison of cluster stability estimates for T050 and AEROD_v when using mean or percentile(5) signatures. For both variables, percentile(5) signatures result in better stability in the range of $k$ we are interested in. The cluster stability estimate is the averaged near-optimal ARI, discussed in Section \ref{sec:clustering_methods}.
  • Figure 3: Comparison of how different partition sizes impact cluster stability for T050. Since partition size does not seem to have much of an impact on stability, it can be chosen to maximize data reduction while minimizing loss of spatial precision, which for this work is a $3\times3$ analysis partition. The cluster stability estimate is the averaged near-optimal ARI, discussed in Section \ref{sec:clustering_methods}.
  • Figure 4: Cluster stability estimates for stratospheric heating pathway variables T050, FLNT, and AEROD_v. Candidate optimal $k$ values are chosen as the peaks in stability when $k>3$. Multiple choices of $k$ meet our selection criteria and the results can be seen in Table \ref{tab:optimal_param_vals_single}. The cluster stability estimate is the averaged near-optimal ARI, discussed in Section \ref{sec:clustering_methods}.
  • Figure 5: Examples of instantaneous cluster counts for a clustering of the AEROD_v variable in a single E3SM simulation with a total of four clusters.
  • ...and 8 more figures
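
The captions for Figures 1 through 3 describe reducing each analysis partition $p_i$ to a signature $s_i$, using either the mean or the 5th percentile over a $3\times3$ partition. A minimal sketch of that reduction for a single 2-D field follows. It is an illustration under stated assumptions, not the authors' implementation: the function name `partition_signatures`, the trimming of incomplete edge partitions, and the use of NumPy's default linear-interpolation percentile are all hypothetical choices made here.

```python
import numpy as np


def partition_signatures(field, size=3, reducer="percentile", q=5.0):
    """Reduce a 2-D field to one value per (size x size) partition.

    Hypothetical helper: the name, arguments, and edge handling are
    assumptions made for illustration only.
    """
    n_lat, n_lon = field.shape
    # Assumption: drop trailing rows/columns that do not fill a whole partition.
    n_lat -= n_lat % size
    n_lon -= n_lon % size
    trimmed = field[:n_lat, :n_lon]

    # Split each axis into (number of partitions, cells per partition),
    # then gather the cells of each partition along the last axis.
    blocks = trimmed.reshape(n_lat // size, size, n_lon // size, size)
    blocks = blocks.transpose(0, 2, 1, 3)
    blocks = blocks.reshape(n_lat // size, n_lon // size, size * size)

    if reducer == "mean":
        return blocks.mean(axis=-1)
    if reducer == "percentile":
        # NumPy's default percentile method (linear interpolation).
        return np.percentile(blocks, q, axis=-1)
    raise ValueError(f"unknown reducer: {reducer!r}")


# Toy 6x6 field standing in for one time slice of a gridded variable.
field = np.arange(36.0).reshape(6, 6)
mean_sig = partition_signatures(field, size=3, reducer="mean")
pct_sig = partition_signatures(field, size=3, reducer="percentile", q=5.0)
```

With a $3\times3$ partition, nine grid cells collapse into one signature value, so the toy $6\times6$ field becomes a $2\times2$ array of signatures that could then be passed to a clustering step.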