From Global to Local: Learning Context-Aware Graph Representations for Document Classification and Summarization

Ruangrin Ldallitsakool; Margarita Bugueño; Gerard de Melo

TL;DR

A dynamic sliding-window attention module is leveraged to capture local and mid-range semantic dependencies between sentences, as well as structural relations within documents, in order to automatically construct graph-based document representations.

Abstract

This paper proposes a data-driven method to automatically construct graph-based document representations. Building upon the recent work of Bugueño and de Melo (2025), we leverage the dynamic sliding-window attention module to effectively capture local and mid-range semantic dependencies between sentences, as well as structural relations within documents. Graph Attention Networks (GATs) trained on our learned graphs achieve competitive results on document classification while requiring lower computational resources than previous approaches. We further present an exploratory evaluation of the proposed graph construction method for extractive document summarization, highlighting both its potential and current limitations. The implementation of this project can be found on GitHub.
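To make the idea of turning windowed attention into a graph more concrete, here is a minimal Python sketch. It is not the authors' implementation: the fixed window size, the dot-product scores over toy sentence embeddings, and the row-mean threshold are all simplifying assumptions chosen for illustration, and the function names `sliding_window_attention` and `mean_filtered_edges` are hypothetical. The paper's module is learned and dynamic, and its filters may be defined differently.

```python
import math


def sliding_window_attention(embeddings, window):
    """Row-wise softmax attention restricted to sentences within `window` positions.

    Assumption: plain dot-product scores on fixed embeddings stand in for the
    learned attention module described in the abstract.
    """
    n = len(embeddings)
    attn = [[0.0] * n for _ in range(n)]
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        scores = [
            sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
            for j in range(lo, hi)
        ]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        for offset, e in enumerate(exps):
            attn[i][lo + offset] = e / total
    return attn


def mean_filtered_edges(attn):
    """Keep a directed edge (i, j, weight) when the weight exceeds row i's mean.

    Assumption: the mean is taken over the nonzero (in-window) weights of the
    row, and self-loops are dropped. This is one plausible threshold filter,
    not necessarily the paper's definition.
    """
    edges = []
    for i, row in enumerate(attn):
        in_window = [w for w in row if w > 0.0]
        threshold = sum(in_window) / len(in_window)
        for j, w in enumerate(row):
            if i != j and w > threshold:
                edges.append((i, j, w))
    return edges


# Toy usage: four "sentences" with 2-dimensional embeddings.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_attention = sliding_window_attention(toy_embeddings, window=1)
toy_edges = mean_filtered_edges(toy_attention)
```

Restricting attention to a window keeps each sentence connected only to nearby sentences, and the threshold then prunes weak connections so that the resulting graph is sparse enough to pass to a graph neural network such as a GAT.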

Paper Structure (31 sections, 4 equations, 7 figures, 16 tables)

Introduction
Related Work
Attention Mechanisms
Traditional Graph-Based Representations
Combining Graph and LMs
Learned Graph Representations
Method
Sliding Window Attention Models
Graph Construction and Filtering
Experiments
Datasets
Baselines
Results
Classification Performance
Graph Structural Analysis
...and 16 more sections

Figures (7)

Figure 1: Attention matrix from a random document in the HND dataset, paired with graphs constructed using different statistical filters: non-filtered (Left), mean-bound filter (Middle), and max-bound filter (Right). Brighter colors illustrate higher values.
Figure 2: Overview of our experiment pipeline. The experimental components are illustrated in blue.
Figure 3: A sample of attention matrices from a randomly selected document in the GovReport dataset. From Top to Bottom: conventional Softmax, Annealing Softmax, ReLU, and Sigmoid. Brighter colors illustrate higher values.
Figure 4: Distribution of summary sentences by relative position (%) within the documents of the test split. Predicted summary sentences are shown in purple, and oracle summary sentences in green.
Figure 5: Occurrences of summary sentences in random documents from the test split. Note that sentence positions are absolute, with documents padded to the 1000th position.
...and 2 more figures
