Graph-Dictionary Signal Model for Sparse Representations of Multivariate Data

William Cappelletti, Pascal Frossard

TL;DR

GraphDict introduces a Graph-Dictionary signal model that represents multivariate data as a sparse combination of graph atoms, where instantaneous graphs are $\mathbf{L}_t = \sum_k \delta_{tk} \mathbf{L}_{\mathbf{w}_k}$ and signals follow $\mathbf{x}_t = k(\mathbf{L}_t) \boldsymbol{\eta}_t$. The learning problem is cast as maximum a posteriori (MAP) estimation over atom weights and coefficients and is solved with a novel Bilinear Primal-Dual Splitting (BiPDS) algorithm that handles the bilinear graph operator $\mathbf{L}(\boldsymbol{\Delta}, \mathbf{W})$ and its adjoints. The framework yields two practical variants, GraphDictLog and GraphDictSpectral, which are evaluated on synthetic tasks and a motor-imagery EEG task, where sparse graph-atom coefficients provide superior edge recovery and competitive brain-state classification with far fewer features. The results demonstrate that GraphDict delivers interpretable, sparse representations of evolving relational structure in multivariate data and offers a pathway to integrate domain priors and time-varying graphs in graph signal processing applications.

Abstract

Representing and exploiting multivariate signals requires capturing relations between variables, which we can represent by graphs. Graph dictionaries make it possible to describe complex relational information as a sparse sum of simpler structures, but no prior model exists to infer such underlying structure elements from data. We define a novel Graph-Dictionary signal model, where a finite set of graphs characterizes relationships in the data distribution as filters on the weighted sum of their Laplacians. We propose a framework to infer the graph dictionary representation from observed node signals, which allows the inclusion of a priori knowledge about signal properties, underlying graphs, and their coefficients. We introduce a bilinear generalization of the primal-dual splitting algorithm to solve the learning problem. We show the capability of our method to reconstruct graphs from signals in multiple synthetic settings, where our model outperforms popular baselines. Then, we exploit graph-dictionary representations in an illustrative motor imagery decoding task on brain activity data, where we classify imagined motion better than standard methods relying on many more features. Our graph-dictionary model bridges a gap between sparse representations of multivariate data and a structured decomposition of sample-varying relationships into a sparse combination of elementary graph atoms.
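
The generative side of the model, as summarized in the TL;DR, can be illustrated with a small NumPy sketch: each sample's Laplacian is a sparse combination of atom Laplacians, and the signal is white noise passed through a filter of that Laplacian. This is only a sketch under stated assumptions, not the authors' implementation. The filter choice $k(\mathbf{L}) = (\mathbf{L}^{\dagger})^{1/2}$ (a common smooth-signal kernel), the random atom and coefficient sampling, and all function and parameter names are hypothetical choices made here for illustration.

```python
import numpy as np


def laplacian_from_weights(w, n_nodes):
    """Combinatorial Laplacian from a vector of upper-triangular edge weights."""
    W = np.zeros((n_nodes, n_nodes))
    W[np.triu_indices(n_nodes, k=1)] = w
    W = W + W.T  # symmetric adjacency with zero diagonal
    return np.diag(W.sum(axis=1)) - W


def sample_graph_dictionary_signals(n_nodes=6, n_atoms=3, n_samples=50, seed=0):
    """Sample signals x_t = k(L_t) eta_t with L_t = sum_k delta_tk L_{w_k}."""
    rng = np.random.default_rng(seed)
    n_edges = n_nodes * (n_nodes - 1) // 2

    # Assumption: atoms are sparse, nonnegative edge-weight vectors.
    mask = rng.random((n_atoms, n_edges)) < 0.4
    atom_weights = rng.uniform(0.5, 1.0, size=(n_atoms, n_edges)) * mask
    atom_laplacians = np.stack(
        [laplacian_from_weights(w, n_nodes) for w in atom_weights]
    )

    # Assumption: binary sparse coefficients, at least one active atom per sample.
    delta = (rng.random((n_samples, n_atoms)) < 0.3).astype(float)
    delta[np.arange(n_samples), rng.integers(n_atoms, size=n_samples)] = 1.0

    signals = np.empty((n_samples, n_nodes))
    for t in range(n_samples):
        # Instantaneous Laplacian: weighted sum of atom Laplacians.
        L_t = np.tensordot(delta[t], atom_laplacians, axes=1)
        # Assumed filter k(L) = (L^+)^{1/2}, applied in the eigenbasis of L_t.
        evals, evecs = np.linalg.eigh(L_t)
        gain = np.where(evals > 1e-10, 1.0 / np.sqrt(np.maximum(evals, 1e-10)), 0.0)
        eta = rng.standard_normal(n_nodes)  # white Gaussian excitation
        signals[t] = evecs @ (gain * (evecs.T @ eta))
    return atom_weights, delta, atom_laplacians, signals


atom_weights, delta, atom_laplacians, signals = sample_graph_dictionary_signals()
```

Because the coefficients are nonnegative, every mixed Laplacian stays positive semidefinite, which is what makes the spectral filter well defined here. The inverse problem, recovering `atom_weights` and `delta` from `signals` alone, is what the paper's BiPDS algorithm addresses and is not attempted in this sketch.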

Paper Structure

This paper contains 26 sections, 34 equations, 5 figures, 1 table, 1 algorithm.

Figures (5)

  • Figure 1: Representation of our novel graph-dictionary signal model. Solid black arrows illustrate the generative process, where coefficients $\bm \delta_t$ mix atoms $\mathcal{G}_1, \dots, \mathcal{G}_K$ from the dictionary to give instantaneous graphs $G_t$, whose Laplacians $L_t$ characterize the signal model from which signals $\bm x_t$ are sampled. In particular, the thick arrows follow the generation of $\bm x_1$. Along blue lines we see the inverse problem, which jointly learns atoms and coefficients from observed signals using our Bilinear Primal-Dual Splitting algorithm.
  • Figure 2: Samples of coefficient matrices for different superposition values. Each block represents in black the positive coefficients of five atoms over 50 samples.
  • Figure 3: Test performance on edge recovery of instantaneous graphs from signals, measured by Matthews correlation coefficient (MCC), precision and recall. Results are averaged over five random seeds.
  • Figure 4: Distribution of test scores for motor imagery classification from brain state features, computed by leave-one-subject-out cross-validation. On the y-axis we find the name of the model used for learning brain states, together with the number of clusters, or atoms, defining such states.
  • Figure 5: Atoms and 50 corresponding coefficients learned by GraphDictLog on EEG signals from motor imagery data. Atoms are sorted by frequency of appearance and show edges between electrodes. The coefficient matrix is color coded from zero, in white, to one, in black, and has atoms as rows and sample indices as columns.
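
The edge-recovery metrics named in the Figure 3 caption (Matthews correlation coefficient, precision, and recall) can be computed from a true and an estimated adjacency matrix as in the sketch below. This is a minimal illustration under assumptions: the function name, the `threshold` used to binarize estimated weights, and the convention of returning zero when a metric is undefined are hypothetical and are not taken from the paper.

```python
import numpy as np


def edge_recovery_scores(true_adj, est_adj, threshold=1e-3):
    """Precision, recall, and MCC for undirected edge recovery.

    Edges are compared on the upper triangle only. `threshold` is an
    assumed cutoff for deciding that an estimated weight counts as an edge.
    """
    iu = np.triu_indices(true_adj.shape[0], k=1)
    y_true = true_adj[iu] > 0
    y_pred = est_adj[iu] > threshold

    tp = int(np.sum(y_true & y_pred))
    tn = int(np.sum(~y_true & ~y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))

    # Assumed convention: undefined metrics (zero denominator) are reported as 0.
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    denom = np.sqrt(float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    return precision, recall, mcc


# Toy example: a 4-node path graph and an estimate with one misplaced edge.
true_adj = np.zeros((4, 4))
est_adj = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    true_adj[i, j] = true_adj[j, i] = 1.0
for i, j in [(0, 1), (1, 2), (0, 3)]:
    est_adj[i, j] = est_adj[j, i] = 0.8

precision, recall, mcc = edge_recovery_scores(true_adj, est_adj)
```

MCC accounts for true negatives as well as true positives, which makes it a useful single score on sparse graphs where most node pairs are not connected.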