Graph Neural Networks on Discriminative Graphs of Words

Yassine Abbahaddou, Johannes F. Lutzeyer, Michalis Vazirgiannis

TL;DR

This work explores a Discriminative Graph of Words Graph Neural Network (DGoW-GNN) approach to text classification, pairing a novel discriminative graph construction with a new model that combines a GNN and a sequence model.

Abstract

In light of the recent success of Graph Neural Networks (GNNs) and their ability to perform inference on complex data structures, many studies apply GNNs to the task of text classification. In most previous methods, a heterogeneous graph, containing both word and document nodes, is constructed using the entire corpus and a GNN is used to classify document nodes. In this work, we explore a new Discriminative Graph of Words Graph Neural Network (DGoW-GNN) approach encapsulating both a novel discriminative graph construction and model to classify text. In our graph construction, containing only word nodes and no document nodes, we split the training corpus into disconnected subgraphs according to their labels and weight edges by the pointwise mutual information of the represented words. Our graph construction, for which we provide theoretical motivation, allows us to reformulate the task of text classification as the task of walk classification. We also propose a new model for the graph-based classification of text, which combines a GNN and a sequence model. We evaluate our approach on seven benchmark datasets and find that it is outperformed by several state-of-the-art baseline models. We analyse reasons for this performance difference and hypothesise under which conditions it is likely to change.
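
The graph construction described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sliding-window co-occurrence counts, the window size, the whitespace tokenisation, the choice to keep only positive-PMI edges, and the function names `pmi_edges` and `build_dgow` are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def pmi_edges(sentences, window=3):
    """Return {(w1, w2): pmi} for word pairs co-occurring in a sliding window.

    Assumption: probabilities are estimated as fractions of sliding windows,
    and only edges with strictly positive PMI are kept.
    """
    word_windows = Counter()  # number of windows containing each word
    pair_windows = Counter()  # number of windows containing each word pair
    n_windows = 0
    for sent in sentences:
        tokens = sent.lower().split()
        n_spans = max(1, len(tokens) - window + 1)
        for i in range(n_spans):
            unique = sorted(set(tokens[i:i + window]))
            n_windows += 1
            word_windows.update(unique)
            pair_windows.update(combinations(unique, 2))
    edges = {}
    for (a, b), n_ab in pair_windows.items():
        # PMI(a, b) = log( p(a, b) / (p(a) * p(b)) )
        pmi = math.log(n_ab * n_windows / (word_windows[a] * word_windows[b]))
        if pmi > 0:
            edges[(a, b)] = pmi
    return edges


def build_dgow(labelled_sentences, window=3):
    """Build one word graph per label; the per-label graphs share no edges,
    so a word seen in two classes yields two separate nodes."""
    by_label = defaultdict(list)
    for sent, label in labelled_sentences:
        by_label[label].append(sent)
    return {label: pmi_edges(sents, window) for label, sents in by_label.items()}


# Toy corpus (made up for illustration)
corpus = [
    ("the cat sat on the mat", "animals"),
    ("the dog chased the cat", "animals"),
    ("stocks fell on monday", "finance"),
    ("the market fell sharply", "finance"),
]
graphs = build_dgow(corpus)
```

Because each label has its own edge dictionary, the union of these graphs is disconnected across classes by construction, which is the property the approach relies on.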

Paper Structure

This paper contains 22 sections, 1 theorem, 11 equations, 3 figures, 11 tables.

Key Result

Theorem 3.3

(Chung, 1997) For a graph with $Q$ connected components, the spectral node embeddings produced by the normalised Laplacian eigenvectors corresponding to the smallest normalised Laplacian eigenvalue are indicator vectors establishing the connected component membership of vertices.
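
A small numerical check may make the statement concrete. The sketch below assumes the symmetric normalised Laplacian $L = I - D^{-1/2} A D^{-1/2}$; under that convention the eigenvectors for eigenvalue 0 are the component indicator vectors scaled entrywise by the square root of the degrees (for the random-walk normalisation they are the plain indicators). The toy graph and all variable names are made up for illustration.

```python
import numpy as np


def normalised_laplacian(A):
    """Symmetric normalised Laplacian I - D^{-1/2} A D^{-1/2} (no isolated nodes)."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]


# Two disconnected components: a triangle {0, 1, 2} and a single edge {3, 4}
A = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

L = normalised_laplacian(A)

# The multiplicity of eigenvalue 0 equals the number of connected components
eigvals = np.linalg.eigvalsh(L)
num_zero = int(np.sum(np.abs(eigvals) < 1e-9))

# Each degree-scaled component indicator lies in the null space of L
degrees = A.sum(axis=1)
components = [[0, 1, 2], [3, 4]]
residuals = []
for comp in components:
    indicator = np.zeros(len(A))
    indicator[comp] = 1.0
    residuals.append(np.linalg.norm(L @ (np.sqrt(degrees) * indicator)))
```

A numerical eigensolver may return any orthonormal basis of the zero eigenspace, so the check multiplies $L$ by the scaled indicators directly instead of comparing against the solver's eigenvectors.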

Figures (3)

  • Figure 1: Illustration of the two configurations MGoW and DGoW for a toy example of 6 classes in the dataset. The coloured boxes represent different classes. In the MGoW, we merge all the sentences into one corpus and construct one graph of words. In the DGoW, we keep the label-based split and create one disconnected subgraph per class.
  • Figure 2: The architecture of DGoW-GNN. Our model takes as input the discriminative graph of words $\mathcal{C}_p$, a class $p$ and a sentence $s$. The words occurring in $s$ are highlighted in red. Note that, in the inductive setting, the representation of the sentence in $\mathcal{C}_p$ is not necessarily a walk. We use the GNN $g_\theta$ to compute vector representations of all the nodes in the graph. We then select only the vectors of the words occurring in $s$ and feed them to a Bi-LSTM $r_\phi$ to contextualise the representations. Finally, we use an aggregation function $f_\psi$ to output a value in $[0,1]$.
  • Figure 3: Class-wise accuracy of the DGoW-GNN on the OH (a) and R8 (b) datasets.
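
The pipeline in the Figure 2 caption (GNN $g_\theta$, selection of the words in $s$, sequence model $r_\phi$, aggregation $f_\psi$) can be sketched as a forward pass in NumPy. This is only an illustration under strong assumptions: a single GCN-style layer stands in for $g_\theta$, a plain bidirectional tanh RNN stands in for the Bi-LSTM $r_\phi$, mean pooling followed by a sigmoid stands in for $f_\psi$, weights are random and untrained, and every word of the sentence is assumed to be a node of the class graph. All names and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def gcn_layer(A, X, W):
    """One GCN-style step: ReLU(D^{-1/2} (A + I) D^{-1/2} X W)."""
    A_hat = A + np.eye(len(A))
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)


def simple_rnn(H, W_in, W_rec):
    """Plain tanh RNN over the rows of H; returns every hidden state."""
    state = np.zeros(W_rec.shape[0])
    states = []
    for h in H:
        state = np.tanh(h @ W_in + state @ W_rec)
        states.append(state)
    return np.stack(states)


def score_sentence(A, X, word_ids, params):
    """Score in [0, 1] for a sentence (node ids in reading order) against one class graph."""
    node_vecs = gcn_layer(A, X, params["W_gnn"])   # stand-in for g_theta
    seq = node_vecs[word_ids]                      # keep only the words of s
    fwd = simple_rnn(seq, params["W_in_f"], params["W_rec_f"])
    bwd = simple_rnn(seq[::-1], params["W_in_b"], params["W_rec_b"])[::-1]
    ctx = np.concatenate([fwd, bwd], axis=1)       # stand-in for r_phi
    logit = ctx.mean(axis=0) @ params["w_out"]     # stand-in for f_psi
    return float(1.0 / (1.0 + np.exp(-logit)))


# Toy class graph with 4 word nodes and positive edge weights (made up)
A = np.array([[0.0, 0.7, 0.0, 0.2],
              [0.7, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.9],
              [0.2, 0.0, 0.9, 0.0]])
X = np.eye(4)  # one-hot node features
params = {
    "W_gnn": rng.normal(size=(4, 8)),
    "W_in_f": rng.normal(size=(8, 6)), "W_rec_f": rng.normal(size=(6, 6)),
    "W_in_b": rng.normal(size=(8, 6)), "W_rec_b": rng.normal(size=(6, 6)),
    "w_out": rng.normal(size=12),
}
score = score_sentence(A, X, [0, 1, 3], params)
```

In this reading, the score would be computed once per class graph $\mathcal{C}_p$, and how the per-class scores are combined into a prediction is left open here.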

Theorems & Definitions (3)

  • Definition 3.1: Walks, Connected Components and Disconnected Subgraphs
  • Definition 3.2: Discriminative Graph of Words
  • Theorem 3.3