Two approaches to multiple canonical correlation analysis for repeated measures data

Tomasz Górecki; Mirosław Krzyśko; Felix Gnettner; Piotr Kokoszka

TL;DR

This work extends canonical correlation analysis (CCA) to more than two data blocks and to functional data by proposing two general frameworks: multiple kernel CCA (MKCCA) for repeated measures data, based on RKHS embeddings, and multiple functional CCA (MFCCA) for multivariate functional data. It develops both population formulations and regularized sample procedures, and derives consistency rates under weakened assumptions that allow non-compact cross-covariance operators and dependent observations. In two real-data studies (the Polish agricultural dataset and the Global Competitiveness Index), the authors show that MFCCA often yields higher generalized canonical correlations and clearer clustering than MKCCA. The contributions provide a unified operator-theoretic view of CCA extensions, informing robust inference for high-dimensional, time-dependent, or functional data and pointing to future work on regularized and sparse variants.

Abstract

In classical canonical correlation analysis (CCA), the goal is to determine the linear transformations of two random vectors into two new random variables that are most strongly correlated. Canonical variables are pairs of these new random variables, while canonical correlations are correlations between these pairs. In this paper, we propose and study two generalizations of this classical method: (1) Instead of two random vectors we study more complex data structures that appear in important applications. In these structures, there are $L$ features, each described by $p_l$ scalars, $1 \le l \le L$. We observe $n$ such objects over $T$ time points. We derive a suitable analog of the CCA for such data. Our approach relies on embeddings into Reproducing Kernel Hilbert Spaces, and covers several related data structures as well. (2) We develop an analogous approach for multidimensional random processes. In this case, the experimental units are multivariate continuous, square-integrable functions over a given interval. These functions are modeled as elements of a Hilbert space, so in this case, we define the multiple functional canonical correlation analysis, MFCCA. We justify our approaches by their application to two data sets and suitable large sample theory. We derive consistency rates for the related transformation and correlation estimators, and show that it is possible to relax two common assumptions on the compactness of the underlying cross-covariance operators and the independence of the data.
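As a point of reference for the classical two-vector case described at the start of the abstract, the following is a minimal NumPy sketch of sample CCA via whitening and a singular value decomposition. It is not the MKCCA or MFCCA procedure of the paper. The function name `classical_cca`, the small ridge term `reg`, and the simulated data are assumptions made purely for illustration.

```python
import numpy as np


def classical_cca(X, Y, reg=1e-8):
    """Sample CCA for two data matrices with one row per observation.

    Returns the canonical correlations (in descending order) and weight
    matrices A, B whose columns map X and Y to the canonical variables.
    `reg` is a small ridge term for numerical stability (an assumption of
    this sketch, not part of the textbook definition).
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Sample covariance and cross-covariance matrices.
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    # Cholesky factors: Sxx = Lx Lx^T and Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)

    # Whitened cross-covariance M = Lx^{-1} Sxy Ly^{-T}.
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T

    # Singular values of M are the canonical correlations.
    U, rho, Vt = np.linalg.svd(M, full_matrices=False)

    # Map the singular vectors back to the original coordinates.
    A = np.linalg.solve(Lx.T, U)
    B = np.linalg.solve(Ly.T, Vt.T)
    return rho, A, B


# Simulated example: X and Y share one latent signal z (hypothetical data).
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X = np.hstack([z + 0.5 * rng.normal(size=(500, 1)), rng.normal(size=(500, 2))])
Y = np.hstack([rng.normal(size=(500, 1)), z + 0.5 * rng.normal(size=(500, 1))])

rho, A, B = classical_cca(X, Y)
first_pair_corr = np.corrcoef(X @ A[:, 0], Y @ B[:, 0])[0, 1]
```

Here `rho[0]` is the first canonical correlation, and `first_pair_corr` recomputes it directly as the sample correlation of the first pair of canonical variables. The generalizations in the paper replace the two vectors with $L$ blocks observed over $T$ time points, or with multivariate functions, which this sketch does not attempt to cover.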
