Unfolding Tensors to Identify the Graph in Discrete Latent Bipartite Graphical Models

Yuqi Gu

TL;DR

This work addresses identifiability of the loading graph in discrete latent bipartite graphical models, including Noisy-OR networks and Restricted Boltzmann Machines. It introduces a constructive identifiability framework based on unfolding the population tensor into matrices and leveraging rank properties to recover the graph structure and the number of latent variables $K$, under a sparsity condition that each latent is linked to at least two pure observed variables. The approach is agnostic to edge direction, latent dependence, and nonlinearities, and yields a practical certificate for identifiability that extends beyond tree-structured models. It also analyzes identifiability of other model parameters and outlines future work toward finite-sample algorithms and broader model classes, with implications for interpretable science and trustworthy ML.

Abstract

We use a tensor unfolding technique to prove a new identifiability result for discrete bipartite graphical models, which have a bipartite graph between an observed and a latent layer. This model family includes popular models such as Noisy-OR Bayesian networks for medical diagnosis and Restricted Boltzmann Machines in machine learning. These models are also building blocks for deep generative models. Our result on identifying the graph structure has three notable properties. First, our identifiability proof is constructive: we unfold the population tensor under the model into matrices and inspect the rank properties of the resulting matrices to uncover the graph. The proof itself gives a population-level structure learning algorithm that outputs both the number of latent variables and the bipartite graph. Second, we allow various forms of nonlinear dependence among the variables, unlike many continuous latent variable graphical models that rely on linearity to show identifiability. Third, our identifiability condition is interpretable, requiring only that each latent variable connect to at least two "pure" observed variables in the bipartite graph. The new result not only advances algebraic statistics, but also has useful implications for these models' trustworthy applications in scientific disciplines and interpretable machine learning.
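To make the unfolding idea concrete, here is a minimal Python sketch on a toy model. It is an illustration under stated assumptions, not the paper's algorithm: the graph `G`, the Noisy-OR parameters `theta` and `leak`, the latent distribution `p_latent`, and the helper names `joint_tensor` and `unfold` are all hypothetical choices made for this example. It builds the population tensor of five binary observed variables with two dependent binary latents, unfolds it in two ways, and compares the ranks.

```python
import itertools
import numpy as np

# Hypothetical toy model: K = 2 binary latents, J = 5 binary observed variables.
# Rows 0 and 2 are pure children of latent 0, rows 1 and 3 of latent 1.
G = np.array([[1, 0],
              [0, 1],
              [1, 0],
              [0, 1],
              [1, 1]])
# Assumed Noisy-OR activation probabilities (zero wherever G has no edge).
theta = np.array([[0.8, 0.0],
                  [0.0, 0.7],
                  [0.6, 0.0],
                  [0.0, 0.9],
                  [0.5, 0.4]])
leak = 0.1
# Assumed joint distribution of the two latents (they are dependent).
p_latent = np.array([[0.4, 0.1],
                     [0.2, 0.3]])


def joint_tensor(G, theta, leak, p_latent):
    """Population tensor T[y_1, ..., y_J] = P(Y_1 = y_1, ..., Y_J = y_J)."""
    J, K = G.shape
    T = np.zeros((2,) * J)
    for a in itertools.product([0, 1], repeat=K):
        a_vec = np.array(a)
        # Noisy-OR: P(Y_j = 0 | a) = (1 - leak) * prod_k (1 - theta_jk)^(g_jk a_k)
        p0 = (1 - leak) * np.prod((1 - theta) ** (G * a_vec), axis=1)
        cond = np.stack([p0, 1 - p0], axis=1)  # shape (J, 2)
        # Observed variables are conditionally independent given the latents,
        # so the conditional joint is an outer product over j.
        outer = cond[0]
        for j in range(1, J):
            outer = np.multiply.outer(outer, cond[j])
        T += p_latent[a] * outer
    return T


def unfold(T, row_axes):
    """Unfold T into a matrix whose rows index the variables in row_axes."""
    col_axes = [j for j in range(T.ndim) if j not in row_axes]
    n_rows = int(np.prod([T.shape[j] for j in row_axes]))
    return np.transpose(T, list(row_axes) + col_axes).reshape(n_rows, -1)


T = joint_tensor(G, theta, leak, p_latent)
# Rows indexed by two pure children of the same latent.
rank_same = np.linalg.matrix_rank(unfold(T, [0, 2]), tol=1e-10)
# Rows indexed by pure children of two different latents.
rank_diff = np.linalg.matrix_rank(unfold(T, [0, 1]), tol=1e-10)
print(rank_same, rank_diff)
```

In this toy setting the first unfolding has rank 2 (the number of states of one latent) while the second has rank 4, so the rank of an unfolding carries information about which observed variables share a latent parent.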

Paper Structure (23 sections, 9 theorems, 82 equations, 2 figures)

Key Result

Theorem 1

Suppose Assumptions assume-fullrank and assume-graph hold and the $J\times K$ bipartite graph matrix $\mathbf G$ takes the following form after some row permutation:

$$\mathbf G = \begin{pmatrix} \mathbf I_K \\ \mathbf I_K \\ \mathbf G^\star \end{pmatrix},$$

where $\mathbf I_K$ is the $K\times K$ identity matrix and $\mathbf G^\star$ is an arbitrary binary matrix. Then the bipartite graph $\mathbf G$ is identifiable. Moreover, the number of latent variables $K$ and the form of $\mathbf G$ can both be uniquely recovered using a constructive …
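The graph condition can be checked mechanically for a candidate $\mathbf G$. The sparsity condition stated in the TL;DR (each latent variable is linked to at least two pure observed variables) says that, after a row permutation, $\mathbf G$ contains two $K\times K$ identity blocks. Below is a minimal sketch of such a check; the helper names `pure_children` and `has_two_pure_children` and the example matrices are hypothetical, and the check covers only the graph pattern, not the other assumptions of the theorem.

```python
import numpy as np


def pure_children(G):
    """Map each latent k to the observed rows whose only neighbor is k."""
    G = np.asarray(G)
    is_pure = G.sum(axis=1) == 1  # a pure row has exactly one nonzero entry
    return {
        k: [int(j) for j in np.flatnonzero(is_pure & (G[:, k] == 1))]
        for k in range(G.shape[1])
    }


def has_two_pure_children(G):
    """True if every latent variable has at least two pure observed variables."""
    return all(len(rows) >= 2 for rows in pure_children(G).values())


# Hypothetical example: rows can be permuted to [I_2; I_2; (1, 1)].
G_ok = [[1, 0], [0, 1], [1, 0], [0, 1], [1, 1]]
# Hypothetical counterexample: latent 0 has only one pure observed variable.
G_bad = [[1, 0], [0, 1], [1, 1], [0, 1], [1, 1]]

print(pure_children(G_ok))
print(has_two_pure_children(G_ok), has_two_pure_children(G_bad))
```

Checking row sums avoids searching over row permutations: a row is pure exactly when it has a single 1, so counting pure rows per column is enough.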

Figures (2)

  • Figure 1: Graphical model representations. Grey nodes denote observed variables, and white nodes denote latent variables. (a): QMR-DT network for medical diagnosis illustrated in jaakkola1999variational. (b): Restricted Boltzmann Machine (RBM) in hinton2006reducing for dimension reduction of image data, and in salakhutdinov2007rbm for collaborative filtering.
  • Figure 2: Directed and undirected bipartite graphical models with a latent layer $\mathbf A=(A_1,A_2) \in [H]^2$ and an observed layer $\mathbf Y = (Y_1,\ldots,Y_5) \in [V]^5$. $A_1$ and $A_2$ are marginally independent in (a) and marginally dependent in (b). Theorem 1 applies to both cases.

Theorems & Definitions (10)

  • Definition 1
  • Theorem 1
  • Proposition 1: Unfold $\mathcal{T}$ to Reveal All Single-parent/neighbor Structures
  • Proposition 2: Unfold $\mathcal{T}$ to Reveal All Multi-parent/neighbor Structures
  • Corollary 1
  • Proposition 3
  • Lemma 1
  • Lemma 2
  • Lemma 3
  • Lemma 4: Lemma 3.3 in stegeman2007kruskal
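
As a hedged illustration of why unfolding $\mathcal{T}$ can reveal single-parent structures (the theme of Proposition 1), consider a set $S$ of observed variables that are all pure children of one latent $A_k \in [H]$. This is a standard conditional-independence calculation under that assumption, not the paper's statement of the proposition; the symbols $S$, $S^c$, $\boldsymbol\Phi_S$, and $\boldsymbol\Psi_{S^c}$ are introduced here for illustration.

```latex
% Given A_k, the block Y_S is independent of the remaining observed variables Y_{S^c}:
\mathbb{P}(\mathbf Y_S = \mathbf y_S,\ \mathbf Y_{S^c} = \mathbf y_{S^c})
  = \sum_{h=1}^{H}
    \mathbb{P}(\mathbf Y_S = \mathbf y_S \mid A_k = h)\,
    \mathbb{P}(A_k = h,\ \mathbf Y_{S^c} = \mathbf y_{S^c}).

% Hence the unfolding of \mathcal{T} with rows indexed by y_S and columns by y_{S^c} factors as
[\mathcal{T}]_{S,\,S^c} = \boldsymbol\Phi_S\, \boldsymbol\Psi_{S^c}^{\top},
\qquad
\boldsymbol\Phi_S \in \mathbb{R}^{V^{|S|} \times H},\quad
\boldsymbol\Psi_{S^c} \in \mathbb{R}^{V^{J-|S|} \times H},

% so its rank is bounded by the number of latent states:
\operatorname{rank}\bigl([\mathcal{T}]_{S,\,S^c}\bigr) \le H .
```

When the variables in $S$ instead depend on several latents, the same argument only bounds the rank by the number of joint configurations of those latents, which is why a rank comparison across unfoldings can distinguish the two situations.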