Graph fission and cross-validation

James Leiner; Aaditya Ramdas

TL;DR

Graph fission is a technique that takes a graph, potentially with only one observation per node, and produces two (or more) independent graphs with the same node/edge set, splitting the original graph's information among them in any desired proportion.

Abstract

We introduce graph fission, a technique that takes a graph with potentially only one observation per node (whose distribution lies in a known class) and produces two (or more) independent graphs with the same node/edge set, splitting the original graph's information among them in any desired proportion. Our proposal builds on data fission/thinning, a method that uses external randomization to create independent copies of an unstructured dataset. We extend this idea to the graph setting, where there may be latent structure between observations. We demonstrate the utility of this framework via two applications: inference after structural trend estimation on graphs and a model selection procedure we term "graph cross-validation".
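As a rough illustration of the splitting idea, the sketch below applies the standard Gaussian decomposition from the data fission/thinning literature to node observations on a toy path graph. It assumes each node value is an independent draw from $N(\mu_i, \sigma^2)$ with known $\sigma$; the function name `gaussian_fission` and the tuning parameter `tau` are illustrative choices, not names taken from the paper. Only the node values are randomized, so both outputs share the original node/edge set.

```python
import numpy as np

def gaussian_fission(y, sigma, tau=1.0, rng=None):
    """Split Gaussian node observations into two independent copies.

    Assumption (not from the paper's code): y[i] ~ N(mu[i], sigma^2)
    independently, with sigma known. `tau` controls how information is
    divided: small tau keeps f close to y, large tau keeps g close to y.
    """
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(y, dtype=float)
    z = rng.normal(0.0, sigma, size=y.shape)  # external randomization
    f = y + tau * z   # distributed N(mu, (1 + tau^2) * sigma^2)
    g = y - z / tau   # distributed N(mu, (1 + 1/tau^2) * sigma^2)
    # Cov(f, g) = sigma^2 - sigma^2 = 0, and (f, g) is jointly Gaussian,
    # so f and g are independent.
    return f, g

# Toy example: a path graph whose mean is piecewise constant over the nodes.
rng = np.random.default_rng(0)
n = 20000
mu = np.where(np.arange(n) < n // 2, 0.0, 3.0)
sigma = 1.0
y = mu + rng.normal(0.0, sigma, size=n)

f, g = gaussian_fission(y, sigma, tau=1.0, rng=rng)

# Empirical check: the noise in the two copies should be uncorrelated.
noise_corr = np.corrcoef(f - mu, g - mu)[0, 1]
```

In a workflow like the one the abstract describes, one copy (say `f`) could be used to select a model or tune a penalty, and the other (`g`) to evaluate or perform inference, since the two are independent under the stated assumption.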

Paper Structure (23 sections, 5 theorems, 28 equations, 9 figures, 1 algorithm)

Introduction
Contributions.
Paper outline.
Methodology
Decomposition Rules
Structural Trend Estimation on Graphs
Graph Cross-Validation
Gaussian Data with Unknown Variance.
Simulation.
Inference After Trend Estimation
Inference Under \ref{assumption:conv_close}
Inference with Nuisance Parameters
Simulations
Application to NYC Taxi Data
Conclusion
...and 8 more sections

Key Result

Lemma 1

For a graph $\mathcal{G}$ and corresponding Laplacian matrix $L$, let $\hat{\beta}$ be the solution of \ref{eqn:opt} with $k \in \mathbb{N}$ and $D(\beta) := \lambda \left\lVert\Delta^{(k)} \beta\right\rVert_{1}$. Let $B$ be the output of \ref{alg:basis_construction} using $\hat{\beta}$, $k$, and $L$ as inputs.

Figures (9)

Figure 1: Graphical illustration of \ref{fact:data_thinning}.
Figure 2: Example of synthetic data points and corresponding structural trend solution $\hat{\beta}$ when fit using square loss and a variety of penalties. From left to right: piecewise constant $\left( \left\lVert\Delta^{(1)}\beta\right\rVert_{1}\right)$, linear $\left( \left\lVert\Delta^{(2)}\beta\right\rVert_{1}\right)$, quadratic $\left( \left\lVert\Delta^{(3)}\beta\right\rVert_{1}\right)$, ridge $\left( \left\lVert\Delta^{(1)}\beta\right\rVert_{2}^{2}\right)$, and elastic net $\left( \alpha \left\lVert\Delta^{(1)}\beta\right\rVert_{1} + (1- \alpha) \left\lVert\Delta^{(1)}\beta\right\rVert_{2}^{2}\right)$.
Figure 3: We vary the size of jumps at breakpoints (colors) along with the percentage of active nodes in the graph, and compare graph cross-validation against ordinary cross-validation in each case (with $L_{1}$ penalty, $\sigma^{2} = 1$, $10$ folds). The relative performance of graph cross-validation (dotted) compared to ordinary cross-validation (solid) increases with both the size of jumps and number of breakpoints, indicating that less smooth trends benefit the most from using graph fission to tune $\lambda$.
Figure 4: Examples of confidence intervals (red) constructed from \ref{thm:robust-ci} for two example runs. Ground truth (blue) generated from piecewise constant (left) or quadratic (right) bases.
Figure 5: Confidence interval length and coverage for linear trend filtering ($k=1$). Intervals designed using \ref{thm:robust-ci} ensure adequate coverage, while naively constructed intervals undercover for target $1 - \alpha = 0.9$. The conservatism of the confidence intervals from \ref{thm:robust-ci} is driven by the difference between conservative and anti-conservative estimates of $\sigma$ (rightmost graph).
...and 4 more figures

Theorems & Definitions (11)

Remark 1
Definition 1: joe_conv
Example 1: Gaussian graph
Example 2: Gaussian graph with correlated errors
Example 3: Poisson graph
Remark 2
Lemma 1
Proposition 1
Theorem 1
Corollary 1
...and 1 more
