SpinSVAR: Estimating Structural Vector Autoregression Assuming Sparse Input

Panagiotis Misiakos; Markus Püschel

TL;DR

SpinSVAR learns causal structure in time series by estimating a structural VAR (SVAR) under the assumption that the input shocks are sparse. It models the inputs as independent Laplacian variables, which yields a maximum likelihood estimator (MLE) based on least absolute error regression, and it adds a soft acyclicity regularizer for scalable, GPU-friendly optimization. Theoretical results establish identifiability of the window graph and consistency of the MLE. In experiments, SpinSVAR outperforms state-of-the-art methods in accuracy and runtime on synthetic data and yields meaningful market insights on S&P 500 stocks. The results indicate that assuming sparse structural input can lead to interpretable, data-driven clustering and identification of significant shocks in real-world financial time series.
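
To make the link between Laplacian inputs and least absolute error regression concrete, the following sketch writes out the negative log-likelihood for a single time series. The model form, the scale parametrization of $\beta$, and the shifted-data notation $X_{(\tau)}$ are assumptions made for illustration, since this summary does not reproduce (eq:SVAR) or (eq:laplace_model).

```latex
% Assumed model (illustrative; not reproduced in this summary):
%   x_t = x_t B_0 + \sum_{\tau=1}^{k} x_{t-\tau} B_\tau + s_t ,
%   s_{t,j} \ \text{i.i.d. with density}\ p(s) = \tfrac{1}{2\beta} e^{-|s|/\beta}.
% X_{(\tau)} denotes the T x d data matrix X shifted back by \tau time steps.
\begin{align*}
-\log p(S)
  &= \sum_{t=1}^{T} \sum_{j=1}^{d}
     \left( \log(2\beta) + \frac{|s_{t,j}|}{\beta} \right) \\
  &= Td \,\log(2\beta)
     + \frac{1}{\beta}
       \Bigl\| X - \sum_{\tau=0}^{k} X_{(\tau)} B_\tau \Bigr\|_1 .
\end{align*}
% If B_0 is acyclic, I - B_0 is a permutation of a unitriangular matrix,
% so \det(I - B_0) = 1 and the change of variables from S to X adds no
% extra term to the likelihood.
```

For a fixed $\beta$, minimizing this expression over ${\bm{B}}_0,\dots,{\bm{B}}_k$ amounts to minimizing the entrywise $\ell_1$ norm of the residual, which is a least absolute error regression.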

Abstract

We introduce SpinSVAR, a novel method for estimating a structural vector autoregression (SVAR) from time-series data under a sparse input assumption. Unlike prior approaches using Gaussian noise, we model the input as independent Laplacian variables, enforcing sparsity and yielding a maximum likelihood estimator (MLE) based on least absolute error regression. We provide theoretical consistency guarantees for the MLE under mild assumptions. SpinSVAR is efficient: it can leverage GPU acceleration to scale to thousands of nodes. On synthetic data with Laplacian or Bernoulli-uniform inputs, SpinSVAR outperforms state-of-the-art methods in accuracy and runtime. When applied to S&P 500 data, it clusters stocks by sectors and identifies significant structural shocks linked to major price movements, demonstrating the viability of our sparse input assumption.
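
The ingredients named in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the row-vector model form, the function names (`simulate_svar`, `residual_input`, `l1_loss`, `soft_acyclicity`), the zero initial conditions, and the polynomial acyclicity penalty (a commonly used variant, swapped in here because the summary does not specify the regularizer) are all assumptions for illustration.

```python
import numpy as np

def simulate_svar(B, S):
    """Generate X from the assumed model
    x_t = x_t B[0] + sum_{tau>=1} x_{t-tau} B[tau] + s_t,
    with B of shape (k+1, d, d), S of shape (T, d), and zero initial conditions."""
    k = B.shape[0] - 1
    T, d = S.shape
    inv = np.linalg.inv(np.eye(d) - B[0])  # solves x_t (I - B_0) = rhs
    X = np.zeros((T, d))
    for t in range(T):
        rhs = S[t].copy()
        for tau in range(1, min(k, t) + 1):
            rhs += X[t - tau] @ B[tau]
        X[t] = rhs @ inv
    return X

def residual_input(B, X):
    """Input implied by candidate matrices B: X minus its lagged predictions."""
    S_hat = X - X @ B[0]
    for tau in range(1, B.shape[0]):
        S_hat[tau:] -= X[:-tau] @ B[tau]
    return S_hat

def l1_loss(B, X):
    """Least absolute error objective (mean absolute residual)."""
    return np.abs(residual_input(B, X)).mean()

def soft_acyclicity(B0):
    """Assumed penalty: tr((I + B0*B0/d)^d) - d, which is zero when B0 is acyclic."""
    d = B0.shape[0]
    M = np.eye(d) + (B0 * B0) / d
    return np.trace(np.linalg.matrix_power(M, d)) - d

# Hypothetical toy setup: d = 4 nodes, lag order k = 2, T = 200 time steps.
rng = np.random.default_rng(0)
B = np.zeros((3, 4, 4))
B[0] = np.triu(rng.uniform(-0.5, 0.5, (4, 4)), k=1)  # acyclic instantaneous part
B[1:] = 0.02 * rng.uniform(-1.0, 1.0, (2, 4, 4))     # small lagged weights
S = rng.laplace(scale=0.3, size=(200, 4))            # Laplacian input
X = simulate_svar(B, S)
```

With the true matrices, `residual_input(B, X)` recovers `S` up to floating-point error, and `soft_acyclicity(B[0])` is zero because `B[0]` is strictly upper triangular. An estimator in this spirit would minimize `l1_loss` plus a weighted `soft_acyclicity` term over `B` with a gradient-based optimizer.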

Paper Structure (135 sections, 12 theorems, 67 equations, 11 figures, 5 tables, 1 algorithm)

Introduction
Structural vector autoregression
Challenges and Limitations
SpinSVAR: Sparse Input SVAR
Contributions
SVAR with Sparse Input
Time-series data
Example: stock market
Model Demonstration
Example
Structural vector autoregression
Example
Sparse input
Example
Laplace Distribution
...and 120 more sections

Key Result

Theorem 3.1

Consider the time-series model (eq:SVAR) with ${\bm{S}}$ following a multivariate Laplace distribution (eq:laplace_model) with $\beta^* > \frac{1}{NTd}$. Then the adjacency matrices ${\bm{B}}_{0},{\bm{B}}_{1},...,{\bm{B}}_{k}\in{\mathbb{R}}^{d\times d}$ and $\beta$ are identifiable from the time-ser

Figures (11)

Figure 1: Visualizing an SVAR (eq:SVAR) with sparse input ${\bm{\mathsfit{S}}}$. Out of $28$ structural shocks in ${\bm{\mathsfit{S}}}$, only seven are significant (positive or negative) and the rest are approximately zero. The window graph ${\bm{W}}$, composed of ${\bm{B}}_0, {\bm{B}}_1, {\bm{B}}_2$, generates the observed dense time series ${\bm{\mathsfit{X}}}$ (bottom) via (eq:SVAR).
Figure 2: Synthetic experiments. First row: SHD (lower is better); second row: runtime. (a), (b) consider $N = 10$ samples of time series with $T = 1000$ and a varying number $d$ of nodes for both input distributions. (c), (d) consider $d = 500$ nodes and a varying number $N$ of samples of time series of length $T = 1000$. A missing point indicates a time-out (execution time $> 10{,}000\,\text{s} \approx 2\text{:}45\,\text{h}$).
Figure 3: Real experiment on the S&P 500 stock market index. (a) Instantaneous relations $\widehat{{\bm{B}}}_0$ between the $45$ highest weighted stocks within S&P 500, grouped by sectors (squares), and (b) the discovered structural shocks $\widehat{{\bm{S}}}$ for $60$ days. In (a) the direction of influence is from row to column.
Figure 4: Performance on synthetic data (Laplacian distributed input): AUROC ($\uparrow$), F1-score ($\uparrow$), NMSE ($\downarrow$), and structural shocks NMSE ($\downarrow$). (a), (b) correspond to $N = 1$ and $N = 10$ samples of time series with $T = 1000$ and a varying number of nodes. (c) corresponds to $d = 500$ nodes and a varying number $N$ of samples of time series of length $T = 1000$.
Figure 5: Performance on synthetic data (Bernoulli distributed input).
...and 6 more figures
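
Figure 2 reports the structural Hamming distance (SHD). As a rough illustration of what this metric measures, the sketch below computes one common variant for a single adjacency matrix. The function name `shd`, the thresholding step, and the choice to count a reversed edge once are assumptions; the paper's exact definition, including how lagged matrices are handled, is not given in this summary.

```python
import numpy as np

def shd(B_true, B_est, threshold=0.0):
    """Structural Hamming distance between two weighted adjacency matrices.

    Entries with absolute value above `threshold` count as edges.
    Missing and extra edges count 1 each; a reversed edge counts 1 in total
    (one common convention, assumed here).
    """
    A = np.abs(np.asarray(B_true)) > threshold
    E = np.abs(np.asarray(B_est)) > threshold
    differing = A != E
    # True edge i->j that the estimate reports only as j->i.
    reversed_edges = A & ~E & E.T & ~A.T
    return int(differing.sum() - reversed_edges.sum())

# Hypothetical example: true edges 0->1 and 1->2.
B_true = np.array([[0.0, 0.8, 0.0],
                   [0.0, 0.0, -0.5],
                   [0.0, 0.0, 0.0]])
# Estimate reverses 0->1, keeps 1->2, and adds a spurious 0->2.
B_est = np.array([[0.0, 0.0, 0.3],
                  [0.7, 0.0, -0.4],
                  [0.0, 0.0, 0.0]])
distance = shd(B_true, B_est)
```

Here the reversed edge and the spurious edge each contribute one unit to the distance.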

Theorems & Definitions (23)

Theorem 3.1
proof : Proof sketch
Lemma 3.2
Theorem 3.3
proof : Proof sketch
Theorem A.1
proof
Lemma A.2
proof
Lemma A.3
...and 13 more
