SiBBlInGS: Similarity-driven Building-Block Inference using Graphs across States

Noga Mudrik, Gal Mishne, Adam S. Charles

TL;DR

SiBBlInGS tackles the challenge of learning interpretable, sparse Building Blocks (BBs) from multi-state time-series data by jointly inferring BB compositions and per-trial temporal activity while explicitly modeling cross-state variability. It introduces channel graphs $\bm{H}^d$ and a state graph $\bm{P}$ to enforce intra- and inter-state regularities, and supports varying trial lengths and missing data through a flexible dictionary-learning objective with cross-state similarity controlled by $\bm{\nu}$. The method demonstrates accurate BB recovery and meaningful temporal traces on synthetic data and real-world datasets (Google Trends, neural recordings, epilepsy EEG), outperforming a suite of baselines and showing robustness to noise and incomplete data. Together, these advances enable probing state-specific versus background ensembles and their evolution across states, with broad applicability to complex scientific time-series. The work provides open-source code for reproducibility and highlights avenues for extending the framework to Poisson data, nonlinear dynamics, missing channels, and directional BB interactions.

Abstract

Time series data across scientific domains are often collected under distinct states (e.g., tasks), wherein latent processes (e.g., biological factors) create complex inter- and intra-state variability. A key approach to capture this complexity is to uncover fundamental interpretable units within the data, Building Blocks (BBs), which modulate their activity and adjust their structure across observations. Existing methods for identifying BBs in multi-way data often overlook inter- vs. intra-state variability, produce uninterpretable components, or do not align with properties of real-world data, such as missing samples and sessions of different duration. Here, we present a framework for Similarity-driven Building Block Inference using Graphs across States (SiBBlInGS). SiBBlInGS offers a graph-based dictionary learning approach for discovering sparse BBs along with their temporal traces, based on co-activity patterns and inter- vs. intra-state relationships. Moreover, SiBBlInGS captures per-trial temporal variability and controlled cross-state structural BB adaptations, identifies state-specific vs. state-invariant components, and accommodates variability in the number and duration of observed sessions across states. We demonstrate SiBBlInGS's ability to reveal insights into complex phenomena as well as its robustness to noise and missing samples through several synthetic and real-world examples, including web search and neural data.
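To make the ingredients named above concrete, the following is a minimal sketch of the data layout and of two quantities such a model would evaluate: a per-trial reconstruction error and a cross-state similarity penalty. It is not the authors' implementation. The function names, the toy sizes, and the exact penalty form $\sum_{d,d'} P_{dd'} \lVert (\bm{A}^d - \bm{A}^{d'})\,\mathrm{diag}(\bm{\nu}) \rVert_F^2$ are assumptions chosen for illustration; only the symbols $\bm{A}^d$, $\bm{P}$, and $\bm{\nu}$ and the per-state-and-trial factorization come from the text.

```python
import numpy as np

def reconstruction_error(Y_trials, A, Phi_trials):
    """Sum of squared residuals over the trials of one state.

    Each trial Y has shape (N, T_m); T_m may differ between trials.
    A has shape (N, K) and is shared by all trials of the state.
    """
    return sum(np.sum((Y - A @ Phi) ** 2) for Y, Phi in zip(Y_trials, Phi_trials))

def cross_state_penalty(A_states, P, nu):
    """Assumed penalty: sum over state pairs of P[d, e] * ||(A^d - A^e) diag(nu)||_F^2."""
    total = 0.0
    for d in range(len(A_states)):
        for e in range(len(A_states)):
            diff = (A_states[d] - A_states[e]) * nu  # broadcasting scales column j by nu[j]
            total += P[d, e] * np.sum(diff ** 2)
    return total

rng = np.random.default_rng(0)
N, K, D = 6, 3, 2                      # channels, BBs, states (toy sizes)
A_states = [rng.random((N, K)) for _ in range(D)]
trial_lengths = [[20, 35], [50]]       # states differ in number and duration of trials
Phi = [[rng.standard_normal((K, T)) for T in lens] for lens in trial_lengths]
Y = [[A_states[d] @ phi for phi in Phi[d]] for d in range(D)]  # noiseless toy data
P = np.ones((D, D)) - np.eye(D)        # hypothetical state graph: all states equally similar
nu = np.array([1.0, 1.0, 0.0])         # third BB is free to differ across states
```

In this sketch, setting an entry of `nu` to zero removes the corresponding BB from the penalty, which mirrors the idea that lower $\bm{\nu}$ values allow state-specific BBs while higher values favor background BBs shared across states.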

Paper Structure (37 sections, 12 equations, 13 figures, 2 tables, 1 algorithm)

Figures (13)

  • Figure 1: SiBBlInGS schematic. (A) SiBBlInGS adapts to real-world datasets with varying session durations, sampling rates, and state-specific data by learning interpretable graph-driven hidden patterns and their temporal activity. (B) SiBBlInGS is based on a per-state-and-trial matrix factorization in which the BBs ($\bm{A}^d$) are identical across trials and similar across states. (C) SiBBlInGS controls the BB similarity via data-driven channel graphs ($\bm{H}^d \in \mathbb{R}^{N \times N}$) and a state-similarity graph ($\bm{P} \in \mathbb{R}^{D \times D}$), which can be either predefined (supervised) or data-driven. (D) The learning schematic, with one example trial for each of three example states. The BBs of each state $d$ (columns of $\bm{A}^d$) are constrained by two regularization terms: 1) the state-specific $\bm{\lambda}^d$, which captures similar activity between channels by leveraging the channel-similarity graph $\bm{H}^d$, and 2) $\bm{P}$, which captures BB consistency across states via the state-similarity graph. $\bm{\nu}$ controls the relative level of cross-state similarity between BBs, allowing the discovery of both background and state-specific BBs. Higher (lower) $\bm{\nu}$ values promote greater (lesser) consistency of specific BBs across states (e.g., $\bm{\nu}_1$ vs. $\bm{\nu}_5$).
  • Figure 2: Synthetic data results. (A) Three example time traces identified by SiBBlInGS vs. ground-truth traces, projected into the three synthetic states. SiBBlInGS recovers both traces that are highly correlated with specific states (e.g., $\bm{\Phi}_{10}$; green) and traces that exhibit similar activation across states (e.g., $\bm{\Phi}_{2}$; blue). (B) Comparison between the identified example BBs and the ground-truth BBs. (C) Correlation between the example identified time traces and the ground truth (left), and Jaccard index of the identified BBs compared to the ground truth (right). (D) Comparison between the ground-truth data (top), the SiBBlInGS reconstruction (middle), and the residual (bottom). (E) Comparison to baseline methods (Sec. \ref{sec:exp}, App. \ref{sec:comp_parafac}). (F) Performance under noise and random initializations (300 repetitions). Each dot is a model instance. The curve shows the median values, and the shading corresponds to the 25th to 75th percentiles. SiBBlInGS remains robust under varying noise ($\sigma_{\textrm{signal}} / \sigma_{\textrm{noise}} > 3$) but experiences a phase transition at a specific noise level, in line with the dictionary-learning literature (e.g., studer2012dictionary). (G) Performance with increasing levels of missing samples (200 repetitions). The scattered dots represent model repetitions, the curves depict the median values calculated by rounding to the nearest 5%, and the background shading corresponds to the 25th to 75th percentiles.
  • Figure 3: Demonstration on Google Trends data. (A) The BBs' temporal traces found by SiBBlInGS show seasonal trends consistent with the terms associated with each BB. (B) The standard deviation of the temporal traces over time for the different states aligns with variability in the states' demographics (Sec. \ref{sec:exp}). (C) The BBs identified by SiBBlInGS, along with their per-state dominance, produce more meaningful clusters than baselines (Fig. \ref{fig:trends_comparison}). States are marked by colors; dot sizes represent the contribution of a term to the BB.
  • Figure 4: Identification of temporal patterns in monkey somatosensory cortex. (A) The reaching-out task (andrea_colins_rodriguez_2023). (B) Sparse clusters of neurons representing the identified BBs. (C) Confusion matrix of a multi-class logistic regression model that uses the inferred temporal traces to predict the state label. (D) The BBs' temporal traces as they vary across states and time. (E) Ratios of within-state to between-state temporal correlations for each BB, with $\frac{\rho_{\textrm{within}}}{\rho_{\textrm{between}}} > 1$ indicating that the states are distinguishable.
  • Figure 5: Emerging local BBs in epilepsy. The recovered BBs under 1) normal activity, 2) activity during the 8 seconds preceding CPS seizures located around the F8 area, and 3) activity during the seizures. Colors represent different BBs, and the size of each dot corresponds to the contribution of the respective electrode to each BB.
  • ...and 8 more figures
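
The two recovery scores named in the Figure 2 caption (correlation of time traces and Jaccard index of BBs) can be sketched as follows. This is an illustrative sketch, not the authors' evaluation code: the helper names, the thresholding of BB entries into a binary support via `tol`, and the convention for two empty supports are assumptions, and the captions do not say how estimated BBs are matched to ground-truth BBs before scoring.

```python
import numpy as np

def jaccard_index(bb_est, bb_true, tol=0.0):
    """Jaccard index between the supports (channels with |weight| > tol) of two BB vectors."""
    s_est = np.abs(bb_est) > tol
    s_true = np.abs(bb_true) > tol
    union = np.logical_or(s_est, s_true).sum()
    if union == 0:
        return 1.0  # assumed convention: two empty supports count as a perfect match
    return np.logical_and(s_est, s_true).sum() / union

def trace_correlation(phi_est, phi_true):
    """Pearson correlation between an estimated and a ground-truth time trace."""
    return np.corrcoef(phi_est, phi_true)[0, 1]

# Toy BBs over five channels: supports {0, 1, 4} and {0, 3, 4}.
bb_true = np.array([1.0, 0.5, 0.0, 0.0, 0.8])
bb_est = np.array([0.9, 0.0, 0.0, 0.2, 0.7])
print(jaccard_index(bb_est, bb_true))  # 0.5 (2 shared channels out of 4 in the union)

# A trace recovered up to scale and offset still has correlation 1.
t = np.linspace(0.0, 2.0 * np.pi, 100)
phi_true = np.sin(t)
phi_est = 2.0 * phi_true + 1.0
print(trace_correlation(phi_est, phi_true))
```

Correlation is insensitive to the scale and offset of a trace, which suits dictionary-learning outputs where the scale can trade off between the BBs and their temporal traces.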