More Discriminative Sentence Embeddings via Semantic Graph Smoothing
Chakib Fettal, Lazhar Labiod, Mohamed Nadif
TL;DR
The paper introduces semantic graph smoothing to obtain more discriminative sentence embeddings in an unsupervised setting: pretrained embeddings are smoothed over a cosine-similarity-based $k$-NN graph using polynomial graph filters. By applying four propagation schemes (SGC, S²GC, APPNP, DGC) and tuning a small set of hyperparameters, the approach improves clustering and classification performance across eight benchmark datasets. Empirical results show systematic, statistically significant gains over filterless baselines, highlighting the utility of graph signal processing for enhancing text representations without labels. The method offers a practical and scalable way to improve both unsupervised and supervised document categorization, with clear tradeoffs in computation and parameter tuning.
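To make the smoothing step concrete, here is a minimal NumPy sketch of one of the four schemes, an SGC-style filter, applied over a cosine $k$-NN graph. It is an illustration under stated assumptions, not the authors' implementation: the function names (`knn_cosine_graph`, `normalized_adjacency`, `sgc_smooth`), the binary symmetrized adjacency, the self-loop normalization, and the values of `k` and `power` are all hypothetical choices made here, and the random matrix merely stands in for real pretrained sentence embeddings.

```python
import numpy as np


def knn_cosine_graph(X, k):
    """Symmetric binary k-NN adjacency built from cosine similarity.

    Assumes every row of X is nonzero and k < number of rows.
    """
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    sim = Xn @ Xn.T
    np.fill_diagonal(sim, -np.inf)  # a point is never its own neighbor
    nbrs = np.argsort(-sim, axis=1)[:, :k]  # k most similar rows per point
    n = X.shape[0]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(A, A.T)  # symmetrize: keep an edge if either end chose it


def normalized_adjacency(A):
    """Return D^{-1/2} (A + I) D^{-1/2}, with D the degree matrix of A + I."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def sgc_smooth(X, S, power):
    """SGC-style polynomial filter: multiply X by S, `power` times."""
    H = X
    for _ in range(power):
        H = S @ H
    return H


# Toy stand-in for pretrained sentence embeddings (assumption: 50 x 16).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 16))

S = normalized_adjacency(knn_cosine_graph(X, k=5))
X_smooth = sgc_smooth(X, S, power=2)  # same shape as X, ready for k-means etc.
```

The smoothed matrix `X_smooth` would then replace the raw embeddings as input to a clustering or classification method. The other schemes (S²GC, APPNP, DGC) differ in how they combine powers of the propagation matrix, so under these assumptions only `sgc_smooth` would need to change.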
Abstract
This paper explores an empirical approach to learning more discriminative sentence representations in an unsupervised fashion. Leveraging semantic graph smoothing, we enhance sentence embeddings obtained from pretrained models to improve results on text clustering and classification tasks. Our method, validated on eight benchmarks, demonstrates consistent improvements, showcasing the potential of semantic graph smoothing for improving sentence embeddings in both supervised and unsupervised document categorization tasks.
