Weakly supervised covariance matrices alignment through Stiefel matrices estimation for MEG applications
Antoine Collas, Rémi Flamary, Alexandre Gramfort
TL;DR
This work addresses domain shift in multivariate time series with limited labeled data by modeling signals as linear mixtures with domain-specific mixing matrices. It introduces Mixing model Stiefel Adaptation (MSA), a semi-supervised approach that jointly learns two Stiefel matrices ${\bm{U}}^{\mathcal{S}}, {\bm{U}}^{\mathcal{T}} \in \mathrm{St}(d,q)$ and a predictor, under an optimal transport domain adaptation (OTDA) assumption that pairs source and target signals with equivalent variances. Covariance matrices are mapped into a Riemannian tangent space of the SPD manifold ${\mathbb{S}_p^{++}}$, and a log-linear regression or classification model is applied to the latent variances. MSA thereby aligns source and target representations while controlling the distance between the two subspaces through a Grassmann-regularized loss. The method is optimized on the product manifold $\mathrm{St}(d,q)^2$ with an alternating scheme that updates ${\bm{U}}^{\mathcal{S}}, {\bm{U}}^{\mathcal{T}}, \bm{\beta}, \bm{\pi}$, and it outperforms recent methods in brain-age regression on the Cam-CAN MEG dataset under task variations. The results highlight MSA’s robustness to hyperparameters and the importance of jointly leveraging OT alignment, metric learning, and Grassmann regularization for effective covariance-based domain adaptation in neuroscience time-series analysis.
Abstract
This paper introduces a novel domain adaptation technique for time series data, called Mixing model Stiefel Adaptation (MSA), which addresses the challenge of having only a few labeled signals in the target dataset. Leveraging a domain-dependent mixing model and the optimal transport domain adaptation assumption, we exploit abundant unlabeled data in the target domain to ensure effective prediction by establishing pairwise correspondences between source and target signals with equivalent variances. We lay theoretical foundations for identifying the Stiefel matrices needed to recover the underlying signal variances from a Riemannian representation of the observed signal covariances. We propose an integrated cost function that simultaneously learns these matrices, the pairwise domain relationships, and a predictor (a classifier or a regressor, depending on the task). Applied to neuroscience problems, MSA outperforms recent methods in brain-age regression with task variations using magnetoencephalography (MEG) signals from the Cam-CAN dataset.
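As a rough illustration of why a Stiefel matrix lets one recover latent variances from a Riemannian (log-mapped) representation of covariances, consider the following minimal sketch. It is not the authors' estimator: the matrix `U` is assumed known rather than learned, and the covariance model `U diag(p) U^T + eps * I`, the isotropic noise level `eps`, the dimensions, and all variable names are assumptions made for this example only. It uses a single domain and omits the optimal transport alignment and the Grassmann regularization.

```python
import numpy as np

rng = np.random.default_rng(0)
d, q, n = 6, 3, 20   # sensors, latent sources, signals (illustrative values)
eps = 1e-3           # assumed isotropic noise level, keeps covariances SPD

# Random point on the Stiefel manifold St(d, q): orthonormal columns via QR.
U, _ = np.linalg.qr(rng.standard_normal((d, q)))

# Latent source variances, one row per signal.
variances = rng.uniform(0.5, 2.0, size=(n, q))

# Assumed mixing model for the covariances: Sigma_i = U diag(p_i) U^T + eps * I.
covs = np.stack([U @ np.diag(p) @ U.T + eps * np.eye(d) for p in variances])


def spd_logm(sigma):
    """Matrix logarithm of an SPD matrix via its eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(sigma)
    return (eigvecs * np.log(eigvals)) @ eigvecs.T


# Tangent-space representation at the identity, then projection with U.
# Under the model above, diag(U^T log(Sigma_i) U) equals log(p_i + eps).
log_features = np.stack([np.diag(U.T @ spd_logm(S) @ U) for S in covs])

# Log-linear regression on the recovered log-variances (noise-free targets).
beta_true = rng.standard_normal(q)
y = log_features @ beta_true
beta_hat, *_ = np.linalg.lstsq(log_features, y, rcond=None)
```

In this idealized setting the projected log-covariances coincide with the log of the (noise-shifted) latent variances, so a linear model on them is exactly a log-linear model on the variances. In MSA the Stiefel matrices are unknown and are learned jointly with the predictor and the source-target pairing.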
