A Spatio-temporal CP decomposition analysis of New England region in the US

Fatoumata Sanogo

TL;DR

This work tackles spatio-temporal climate data analysis by leveraging a spatio-temporal PCA (STPCA) to initialize CANDECOMP/PARAFAC (CP) tensor decomposition, enabling more identifiable and accurate latent factors. CP decomposition is estimated via Alternating Least Squares (ALS), initialized with STPCA components, and its performance is contrasted with HOSVD and random initializations. A subsequent K-means clustering step on CP factors assesses the coherence of the extracted modes, with silhouette scores showing STPCA-based initialization yields superior cluster separation. Applied to NCAR precipitation and temperature data over New England, the approach improves reconstruction accuracy and reveals more distinct spatio-temporal regimes, offering a practical framework for downscaling and forecasting in regional climate analysis.

Abstract

Spatio-temporal data consist of measurements of one or more raster fields, such as weather, traffic volume, crime rate, or disease incidence. Advances in modern technology have increased the amount of information available for this type of data, hence the rise of multidimensional data. In this paper we take advantage of both the multidimensional structure of the data and its temporal and spatial structure. We use data from the NCAR Climate Data Gateway website, which provides data discovery and access services for global and regional climate model data. The daily values of total precipitation (prec), maximum temperature (tmax), and minimum temperature (tmin) are combined into a tensor (a multidimensional array). We propose a spatio-temporal principal component analysis to initialize the components of the CP decomposition, taking full advantage of the spatial and temporal structure of the data in the initialization step. The performance of our method is tested by comparison with the most popular initialization methods. We also run a clustering analysis to further demonstrate the performance of our approach.
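To make the role of the initialization step concrete, the following is a minimal NumPy sketch of CP decomposition by Alternating Least Squares for a third-order tensor, written so that any set of starting factors can be passed in. It is not the paper's implementation: the function names (`cp_als`, `svd_init`, `random_init`), the stopping rule, and the toy tensors are assumptions made for illustration. `svd_init` is an HOSVD-style initializer (leading left singular vectors of each unfolding); it only stands in for the slot where STPCA components would be supplied, since the STPCA construction itself is not reproduced here.

```python
import numpy as np

def unfold(X, mode):
    """Mode-`mode` unfolding; the remaining axes are flattened in C order."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def khatri_rao(U, V):
    """Column-wise Kronecker product; row (i, j) maps to index i * V.shape[0] + j."""
    return (U[:, None, :] * V[None, :, :]).reshape(-1, U.shape[1])

def random_init(X, rank, seed=0):
    """Random Gaussian starting factors, one matrix per tensor mode."""
    rng = np.random.default_rng(seed)
    return [rng.standard_normal((size, rank)) for size in X.shape]

def svd_init(X, rank):
    """HOSVD-style start: leading left singular vectors of each unfolding."""
    return [np.linalg.svd(unfold(X, m), full_matrices=False)[0][:, :rank]
            for m in range(3)]

def cp_als(X, init_factors, n_iter=200, tol=1e-10):
    """Rank-R CP fit of a 3-way tensor by ALS, starting from `init_factors`.

    Returns the factor matrices and the relative reconstruction error.
    """
    A, B, C = (np.array(F, dtype=float) for F in init_factors)
    norm_x = np.linalg.norm(X)
    prev_err = np.inf
    for _ in range(n_iter):
        # Each update is the least-squares solution with the other two factors fixed.
        A = unfold(X, 0) @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = unfold(X, 1) @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = unfold(X, 2) @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
        X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
        err = np.linalg.norm(X - X_hat) / norm_x
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return (A, B, C), err

# Toy data (synthetic, not the NCAR data): an exact rank-1 and an exact rank-2 tensor.
rng = np.random.default_rng(0)
a, b, c = rng.normal(size=6), rng.normal(size=5), rng.normal(size=4)
X1 = np.einsum('i,j,k->ijk', a, b, c)
A2, B2, C2 = (rng.normal(size=(s, 2)) for s in (6, 5, 4))
X2 = np.einsum('ir,jr,kr->ijk', A2, B2, C2)

_, err_svd = cp_als(X1, svd_init(X1, rank=1))
factors_rand, err_rand = cp_als(X2, random_init(X2, rank=2))
```

Because the initializer is just a list of three matrices, comparing starting points amounts to calling `cp_als` with different `init_factors` on the same tensor and comparing the returned relative errors.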

Paper Structure

This paper contains 14 sections, 4 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Spearman correlation coefficients in fall (Sept-Nov), winter (Dec-Jan), spring (Mar-May), summer (Jun-Aug) (in rows) with respect to the location indicated by the black square tile for prec, tmax and tmin (in columns).
  • Figure 2: Winter (Dec-Jan), spring (Mar-May), summer (Jun-Aug) and fall (Sept-Nov) auto-correlation functions with lags expressed in days (in rows) for prec, tmax and tmin (in columns) at the grid cell whose linear index lies halfway through the flattened list of cells.
  • Figure 3: CP decomposition of a third order tensor $\mathcal{X} \in \mathbb{R}^{n\times p \times d}$, from \cite{CPDecom}.
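
For reference, the CP model pictured in Figure 3 can be written in its standard form as follows. The rank $R$ and the factor symbols $\mathbf{a}_r$, $\mathbf{b}_r$, $\mathbf{c}_r$ are notation assumed here for illustration and may differ from the paper's own equations.

$$
\mathcal{X} \approx \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r,
\qquad
x_{ijk} \approx \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr},
$$

where $\mathbf{a}_r \in \mathbb{R}^{n}$, $\mathbf{b}_r \in \mathbb{R}^{p}$, $\mathbf{c}_r \in \mathbb{R}^{d}$, and $\circ$ denotes the vector outer product. Each of the $R$ terms is a rank-one tensor, so the decomposition approximates $\mathcal{X}$ by a sum of $R$ rank-one components.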