Weakly supervised covariance matrices alignment through Stiefel matrices estimation for MEG applications

Antoine Collas; Rémi Flamary; Alexandre Gramfort

TL;DR

This work addresses domain shift in multivariate time series with limited labeled data by modeling signals as linear mixtures with domain-specific mixing matrices. It introduces Mixing model Stiefel Adaptation (MSA), a semi-supervised approach that jointly learns two Stiefel matrices ${\bm{U}}^{\mathcal{S}}, {\bm{U}}^{\mathcal{T}} \in \mathrm{St}(d,q)$ and a predictor, under an optimal transport domain adaptation (OTDA) assumption that pairs source and target variances. By mapping covariance matrices into a Riemannian tangent space on the SPD manifold ${\mathbb{S}_p^{++}}$ and applying log-linear regression/classification on the latent variances, MSA aligns source-target representations while controlling subspace distance via a Grassmann-regularized loss. The method is optimized on the product manifold ${\left({\textup{St}(d,q)}\right)}^2$ with an alternating scheme that updates ${\bm{U}}^{\mathcal{S}}, {\bm{U}}^{\mathcal{T}}, \bm{\beta}, \bm{\pi}$, and demonstrates superior brain-age regression performance on the Cam-CAN MEG dataset under task variations. The results highlight MSA’s robustness to hyperparameters and the importance of jointly leveraging OT alignment, metric learning, and Grassmann regularization for effective covariance-based domain adaptation in neuroscience time-series analysis.
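To make the geometric objects in this summary concrete, here is a minimal NumPy sketch of a point on the Stiefel manifold and of a subspace distance on the Grassmann manifold. The names `random_stiefel` and `grassmann_distance` and the sizes `d`, `q` are hypothetical. The geodesic distance built from principal angles is one common choice and is only assumed here; the summary does not state the exact form of the regularizer used in MSA.

```python
import numpy as np


def random_stiefel(d, q, rng):
    """Draw a d x q matrix with orthonormal columns, i.e. a point on St(d, q)."""
    # The reduced QR factorization of a Gaussian matrix has orthonormal columns.
    Q, _ = np.linalg.qr(rng.standard_normal((d, q)))
    return Q


def grassmann_distance(U, V):
    """Geodesic distance between span(U) and span(V) via principal angles."""
    # Singular values of U^T V are the cosines of the principal angles.
    cosines = np.linalg.svd(U.T @ V, compute_uv=False)
    angles = np.arccos(np.clip(cosines, -1.0, 1.0))
    return np.linalg.norm(angles)


rng = np.random.default_rng(0)
d, q = 16, 3  # hypothetical dimensions, chosen only for illustration
U_S = random_stiefel(d, q, rng)
U_T = random_stiefel(d, q, rng)

# Rotating a basis inside its own subspace leaves the subspace unchanged,
# so the Grassmann distance should be numerically zero.
R, _ = np.linalg.qr(rng.standard_normal((q, q)))
same_subspace = grassmann_distance(U_S, U_S @ R)
different = grassmann_distance(U_S, U_T)
print(same_subspace, different)
```

A distance of this kind depends only on the subspaces spanned by the two bases, which is why it suits a penalty that keeps the source and target subspaces close without constraining the bases themselves.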

Abstract

This paper introduces a novel domain adaptation technique for time series data, called Mixing model Stiefel Adaptation (MSA), specifically addressing the challenge of limited labeled signals in the target dataset. Leveraging a domain-dependent mixing model and the optimal transport domain adaptation assumption, we exploit abundant unlabeled data in the target domain to ensure effective prediction by establishing pairwise correspondence with equivalent signal variances between domains. Theoretical foundations are laid for identifying crucial Stiefel matrices, essential for recovering underlying signal variances from a Riemannian representation of observed signal covariances. We propose an integrated cost function that simultaneously learns these matrices, pairwise domain relationships, and a predictor (a classifier or a regressor, depending on the task). Applied to neuroscience problems, MSA outperforms recent methods in brain-age regression with task variations using magnetoencephalography (MEG) signals from the Cam-CAN dataset.

Paper Structure (18 sections, 2 theorems, 22 equations, 7 figures, 1 algorithm)

Introduction
Domain shift and covariance matrices under mixing models
Mixing models in a domain adaptation context
Covariance matrices in the Riemannian geometry framework
Regression and classification with mixed and domain shifted signals
Joint learning of TEXT, TEXT and TEXT
Learning problem
Reduction to a problem on the Stiefel manifold ${\textup{St}(d,q)}^2$ and optimization
Related work
Numerical experiments
Regression on MEG data
Conclusions
Acknowledgments
Technical details
Proof of Proposition 2.1
...and 3 more sections

Key Result

Proposition 2.1

Given source and target embeddings ${\bm{x}}_i^{\mathcal{S}}$ and ${\bm{x}}_j^{\mathcal{T}}$ (eq:embedding), following mixing models (eq:mixing_model_compact), with $\pi_{ij} > 0$ (eq:pi), there exist ${\bm{U}}^{\mathcal{S}}, {\bm{U}}^{\mathcal{T}} \in \textup{St}(p^2, q)$ such that [...] where $p_{i,l} \triangleq p_{i,l}^{\mathcal{S}} = p_{j,l}^{\mathcal{T}}$, $\forall l \in \llbracket 1, q \rrbracket$.
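The following sketch shows how a matrix in $\textup{St}(p^2, q)$ can recover source variances from a log-covariance embedding, in a simplified special case. It rests on three assumptions that are not taken from the paper: the mixing matrix `A` is orthogonal, there are as many sources as sensors ($q = p$), and the embedding is the vectorized matrix logarithm of the covariance. The paper's mixing models are more general than this.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 4  # number of sensors (hypothetical); this toy case uses q = p sources

# Assumption: orthogonal mixing matrix A with orthonormal columns a_1..a_p.
A, _ = np.linalg.qr(rng.standard_normal((p, p)))
variances = rng.uniform(0.5, 2.0, size=p)  # latent source variances p_l

# Covariance under the toy mixing model: Sigma = A diag(p) A^T.
Sigma = A @ np.diag(variances) @ A.T

# Matrix logarithm of an SPD matrix through its eigendecomposition.
eigvals, eigvecs = np.linalg.eigh(Sigma)
log_Sigma = (eigvecs * np.log(eigvals)) @ eigvecs.T
x = log_Sigma.reshape(-1)  # embedding in R^{p^2}

# Candidate Stiefel matrix: column l is the vectorization of a_l a_l^T.
U = np.stack(
    [np.outer(A[:, l], A[:, l]).reshape(-1) for l in range(p)], axis=1
)

# Since log(Sigma) = sum_l log(p_l) a_l a_l^T and the a_l are orthonormal,
# projecting x onto the columns of U returns the log-variances.
recovered = U.T @ x
print(np.allclose(U.T @ U, np.eye(p)), np.allclose(recovered, np.log(variances)))
```

In this special case the columns of `U` are orthonormal because $\langle a_l a_l^\top, a_m a_m^\top \rangle = (a_l^\top a_m)^2$, which is 1 when $l = m$ and 0 otherwise.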

Figures (7)

Figure 1: Illustration of MSA for a regression task. Source and target covariance matrices, ${\bm{\Sigma}}_i^{\mathcal{S}}$ and ${\bm{\Sigma}}_i^{\mathcal{T}}$, exhibit different patterns in their original spaces. Embedding them into ${\bm{x}}_i^{\mathcal{S}}$ and ${\bm{x}}_i^{\mathcal{T}}$ and then jointly learning optimal transport plan $\bm{\pi}$ and orthogonal bases ${\bm{U}}^{\mathcal{S}}$ and ${\bm{U}}^{\mathcal{T}}$ alleviate this problem by finding components that matter for prediction.
Figure 2: Mean absolute errors (MAE) of different regressors on the brain age prediction problem of the Cam-CAN dataset (the lower, the better). The $646$ subjects ($p=65$) are split into source and target domains ($323$ subjects each) and are associated with two different tasks. The latter are reported over each subfigure, e.g., the subfigure (A) presents results with the rest for the source task and the passive task for the target. $10\%$ of the target labels are kept during training, and $100$ different data splits are performed.
Figure 3: Mean absolute errors (MAE) of MSA for different values of hyperparameters on the brain age prediction problem of the Cam-CAN dataset (the lower, the better). The source task is rest, and the target task is somatosensory. $10\%$ of the target labels are kept during training, and $100$ different data splits are performed.
Figure 4: Ablation study of the proposed loss (eq:loss) on the brain age prediction problem of the Cam-CAN dataset. Mean absolute errors (MAE) of the full loss versus when one of the terms is removed are reported (the lower, the better). The source task is rest, and the target is somatosensory. $10\%$ of the target labels are kept during training, and $100$ different data splits are performed.
Figure 5: Scatter plots of different regressors on the brain age prediction problem of the Cam-CAN dataset. $R^2$ scores are reported for each method on each pair of tasks (the higher, the better).
...and 2 more figures

Theorems & Definitions (2)

Proposition 2.1: Stiefel projections
Corollary 2.2: Realigned predictive models
