Node Classification in Random Trees

Wouter W. L. Nuijten, Vlado Menkovski

TL;DR

The aim is to model a distribution over node label assignments in settings where the tree data structure is associated with node attributes (typically high-dimensional embeddings) and none of the label assignments are observed during inference.

Abstract

We propose a method for the classification of objects that are structured as random trees. Our aim is to model a distribution over the node label assignments in settings where the tree data structure is associated with node attributes (typically high-dimensional embeddings). The tree topology is not predetermined and none of the label assignments are observed during inference. Other methods that produce a distribution over node label assignments in trees (or more generally in graphs) either assume conditional independence of the label assignments, operate on a fixed graph topology, or require part of the node labels to be observed. Our method defines a Markov Network with the topology of the random tree and an associated Gibbs distribution. We parameterize the Gibbs distribution with a Graph Neural Network that operates on the random tree and the node embeddings. This allows us to estimate the likelihood of node assignments for a given random tree and to use MCMC to sample from the distribution of node assignments. We evaluate our method on the task of node classification in trees on the Stanford Sentiment Treebank dataset. Our method outperforms the baselines on this dataset, demonstrating its effectiveness for modeling joint distributions of node labels in random trees.
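
To make the sampling step concrete, the following is a minimal sketch of one MCMC scheme, Gibbs sampling, on a tree-structured pairwise Markov Network. The abstract only states that MCMC is used, so the choice of Gibbs sampling is an assumption, as are the function name `gibbs_sweep`, the log-space factor tables, and the toy numbers. In the paper's setting the factor tables would come from the Graph Neural Network; here they are hard-coded.

```python
import math
import random


def gibbs_sweep(labels, node_logf, edge_logf, neighbors, rng):
    """Resample each node label once from its conditional given its neighbors.

    node_logf[v][k]       : log node factor for node v taking label k
    edge_logf[(u, v)][a][b]: log edge factor for labels (a, b) on edge (u, v)
    neighbors[v]          : list of nodes adjacent to v in the tree
    All of these containers are hypothetical, chosen for illustration.
    """
    d = len(node_logf[0])
    for v in range(len(labels)):
        logits = list(node_logf[v])
        for u in neighbors[v]:
            for k in range(d):
                # Edge tables are stored once per edge; handle both orientations.
                if (v, u) in edge_logf:
                    logits[k] += edge_logf[(v, u)][k][labels[u]]
                else:
                    logits[k] += edge_logf[(u, v)][labels[u]][k]
        # Subtract the maximum before exponentiating for numerical stability.
        m = max(logits)
        weights = [math.exp(x - m) for x in logits]
        labels[v] = rng.choices(range(d), weights=weights)[0]
    return labels


# Toy tree with root 0 and children 1 and 2, and two possible labels.
rng = random.Random(0)
edges = [(0, 1), (0, 2)]
neighbors = {0: [1, 2], 1: [0], 2: [0]}
node_logf = [[0.0, 0.5], [0.2, 0.0], [0.0, 0.0]]
agree = [[1.0, 0.0], [0.0, 1.0]]  # favors equal labels on both ends of an edge
edge_logf = {edge: agree for edge in edges}

labels = [0, 0, 0]
samples = []
for _ in range(200):
    gibbs_sweep(labels, node_logf, edge_logf, neighbors, rng)
    samples.append(tuple(labels))
```

Each sweep only needs the factors that touch the node being resampled, which is why the normalizing constant of the Gibbs distribution never has to be computed during sampling.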

Paper Structure (14 sections, 5 equations, 3 figures, 1 table)

Figures (3)

  • Figure 1: Example of the node classification task for sentiment analysis. The node labels and their relation encode the structural composition of sentiment.
  • Figure 2: Different graphs used in the construction of our method. The input-tree panel shows an instance of the node classification problem, and the Markov-network panel shows the Markov Network associated with this instance. A Graph Neural Network takes an instance of the problem and produces high-dimensional node embeddings (node-embeddings panel). These node embeddings are combined with the Markov Network to form a Conditional Random Field (CRF panel).
  • Figure 3: Factor graph construction. The node embeddings $e$ determine the node and edge factors $\phi$ through linear Neural Networks. This fully specifies a Gibbs Distribution over the resulting Markov Network. With $d$ being the number of classes a node can have and $|e|$ the size of the node embedding, $f_\theta : \mathbb{R}^{|e| + |e|} \rightarrow \mathbb{R}^{d \times d}$ is a function with trainable parameters $\theta$ that determines the edge factors from the embeddings. $g_\chi : \mathbb{R}^{|e|} \rightarrow \mathbb{R}^{d}$ is a function with trainable parameters $\chi$ that determines the node factors from the embeddings.
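
The factor construction described in the Figure 3 caption can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: it assumes that $g_\chi$ and $f_\theta$ are single affine layers, that their outputs are log-factors (so the Gibbs distribution is the exponential of their sum), and that the edge input is the concatenation of the two endpoint embeddings. All function names, sizes, and the random initialization are hypothetical. The brute-force normalizer is only feasible for tiny trees and is included to show that the factors define a proper distribution.

```python
import itertools
import math
import random


def init_linear(rng, n_out, n_in, scale=0.1):
    """Random weights and zero bias for an affine map R^{n_in} -> R^{n_out}."""
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(weights, bias, x):
    """Apply the affine map: weights @ x + bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def node_log_factors(g_params, emb):
    """g_chi: R^{|e|} -> R^d, one log-factor per class."""
    return linear(*g_params, emb)


def edge_log_factors(f_params, emb_u, emb_v, d):
    """f_theta: R^{|e|+|e|} -> R^{d x d}, returned as a d-by-d table."""
    flat = linear(*f_params, emb_u + emb_v)  # list concatenation of embeddings
    return [flat[i * d:(i + 1) * d] for i in range(d)]


def log_score(labels, embeddings, edges, g_params, f_params, d):
    """Unnormalized log-probability of a full label assignment."""
    total = 0.0
    for v, emb in enumerate(embeddings):
        total += node_log_factors(g_params, emb)[labels[v]]
    for u, v in edges:
        table = edge_log_factors(f_params, embeddings[u], embeddings[v], d)
        total += table[labels[u]][labels[v]]
    return total


def log_partition(embeddings, edges, g_params, f_params, d):
    """Brute-force log normalizer over all d^n assignments (tiny trees only)."""
    scores = [
        log_score(list(a), embeddings, edges, g_params, f_params, d)
        for a in itertools.product(range(d), repeat=len(embeddings))
    ]
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))


# Toy instance: 3 nodes, d = 3 classes, embedding size |e| = 4.
rng = random.Random(0)
d, emb_size = 3, 4
embeddings = [[rng.gauss(0.0, 1.0) for _ in range(emb_size)] for _ in range(3)]
edges = [(0, 1), (0, 2)]
g_params = init_linear(rng, d, emb_size)
f_params = init_linear(rng, d * d, 2 * emb_size)

score = log_score([0, 1, 2], embeddings, edges, g_params, f_params, d)
log_z = log_partition(embeddings, edges, g_params, f_params, d)
log_prob = score - log_z  # normalized log-probability of the assignment
```

Because the same `g_params` and `f_params` are reused for every node and edge, the parameter count does not depend on the tree, which is what lets the construction apply to trees whose topology is not fixed in advance.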