Contrastive Multi-graph Learning with Neighbor Hierarchical Sifting for Semi-supervised Text Classification

Wei Ai, Jianbin Li, Ze Wang, Yingying Wei, Tao Meng, Yuntao Shou, Keqin Li

TL;DR

This work proposes a novel method of contrastive multi-graph learning with neighbor hierarchical sifting for semi-supervised text classification, namely ConNHS, which exploits core features to form a multi-relational text graph, enhancing semantic connections among texts.

Abstract

Graph contrastive learning has been successfully applied to text classification because of its strong ability for self-supervised node representation learning. However, existing methods have three limitations. First, explicit graph augmentations may cause a loss of semantics in the contrastive views. Second, existing methods tend to overlook edge features and the varying significance of node features during multi-graph learning. Third, the contrastive loss suffers from false negatives. To address these limitations, we propose ConNHS, a novel method of contrastive multi-graph learning with neighbor hierarchical sifting for semi-supervised text classification. Specifically, we exploit core features to form a multi-relational text graph, enhancing semantic connections among texts. By separating this text graph by relation, we provide diverse views for contrastive learning while preserving the graph information and minimizing data loss and distortion. Then, we separately execute relation-aware propagation and cross-graph attention propagation, which leverage the varying correlations between nodes and edge features while harmonizing the information fusion across graphs. Subsequently, we present the neighbor hierarchical sifting (NHS) loss to refine the negative selection. On the one hand, following the homophily assumption, NHS masks the first-order neighbors of the anchor and of its positives so that they are not used as negatives. On the other hand, NHS excludes high-order neighbors that are similar to the anchor, based on their similarity scores. Consequently, it reduces the occurrence of false negatives and prevents similar samples from being pushed apart in the embedding space. Our experiments on the ThuCNews, SogouNews, 20 Newsgroups, and Ohsumed datasets achieved 95.86%, 97.52%, 87.43%, and 70.65%, respectively, demonstrating competitive results in semi-supervised text classification.
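
The negative sifting described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the single shared adjacency matrix, the `threshold` cutoff, the temperature `tau`, and the cross-view InfoNCE-style form of the loss are all assumptions made for illustration.

```python
import numpy as np

def nhs_negative_mask(adj, sim, threshold):
    """Return a boolean matrix M where M[i, j] is True if node j may be
    used as a negative for anchor i.

    adj: (n, n) symmetric 0/1 adjacency matrix (first-order neighbors).
    sim: (n, n) similarity scores between fused node representations.
    threshold: hypothetical cutoff; nodes at least this similar to the
        anchor are sifted out even when they are not direct neighbors.
    """
    n = adj.shape[0]
    mask = np.ones((n, n), dtype=bool)
    mask[np.eye(n, dtype=bool)] = False   # the anchor's own position (its positive)
    mask[adj > 0] = False                 # first-order neighbors (homophily assumption)
    mask[sim >= threshold] = False        # highly similar high-order neighbors
    return mask

def masked_contrastive_loss(z1, z2, neg_mask, tau=0.5):
    """Cross-view contrastive loss that only counts negatives allowed by neg_mask.

    z1, z2: (n, d) node embeddings from two views; row i of each view
    describes the same document, so (z1[i], z2[i]) is the positive pair.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    scores = np.exp(z1 @ z2.T / tau)            # exponentiated cosine similarities
    pos = np.diag(scores)                       # same node across the two views
    neg = (scores * neg_mask).sum(axis=1)       # only the negatives that survive sifting
    return float(np.mean(-np.log(pos / (pos + neg))))
```

Replacing `neg_mask` with a mask that is True everywhere off the diagonal gives the usual "all other nodes are negatives" behavior, so the sifted loss can be compared against that baseline on the same embeddings.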

Paper Structure

This paper contains 25 sections, 24 equations, 6 figures, 5 tables, 1 algorithm.

Figures (6)

  • Figure 1: Flow chart of the proposed ConNHS. Initially, we construct a multi-relational text graph by leveraging inherent core features (titles, keywords, events) to establish semantic connections among texts, while encoding textual content as the initial node representations. Subsequently, relational separation yields distinct subgraphs, on which intra-graph and inter-graph propagation are performed to obtain the contrastive samples and a similarity score matrix. During contrastive learning with NHS, negative selection is optimized to encourage more explicit cluster boundaries (minimizing intra-class distances while maximizing inter-class distances; distinct colors indicate different clusters). Ultimately, predicted labels are assigned to document nodes via a logical classifier.
  • Figure 2: Definition of negative pairs in different contrastive losses. The figure shows two strategies for selecting negatives. Both NT-Xent and NHS treat nodes at the same position across views as positive samples for the anchor. However, NT-Xent designates all remaining nodes as negatives. In contrast, NHS masks the first-order neighbors of the anchor document node and of the positive nodes, following the graph homophily principle. In addition, based on the similarity score matrix of the fused node representations, it excludes high-order neighbors that are highly similar to the anchor, as shown in (b). For easier interpretation, the sifted hierarchical neighbors that are excluded from the contrastive learning process are marked with specific colors in (b).
  • Figure 3: The accuracy with few labels
  • Figure 4: The ConNHS performance under different similarity thresholds of core features
  • Figure 5: The ConNHS performance under different minimum association coefficients
  • ...and 1 more figure
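
The relational separation step mentioned in the Figure 1 caption can be sketched as follows. This is only an illustration under stated assumptions: the `(source, target, relation)` edge-list format, the function name `separate_relations`, and the undirected 0/1 adjacency encoding are hypothetical, and the relation labels simply reuse the core features named in the caption (titles, keywords, events).

```python
import numpy as np

def separate_relations(num_nodes, typed_edges):
    """Split a multi-relational graph into one adjacency matrix per relation.

    typed_edges: iterable of (src, dst, relation) triples (assumed format).
    Returns a dict mapping each relation name to a symmetric 0/1 matrix,
    so each relation yields its own subgraph (view) over the same nodes.
    """
    subgraphs = {}
    for src, dst, rel in typed_edges:
        adj = subgraphs.setdefault(
            rel, np.zeros((num_nodes, num_nodes), dtype=np.int8)
        )
        adj[src, dst] = 1   # treat every relation as undirected
        adj[dst, src] = 1
    return subgraphs

# Toy graph: four documents linked by three kinds of core-feature relation.
edges = [
    (0, 1, "title"),
    (2, 3, "title"),
    (1, 2, "keyword"),
    (0, 2, "event"),
]
views = separate_relations(4, edges)
```

Each matrix in `views` covers the same four document nodes, so row `i` refers to the same document in every view; this alignment is what lets nodes at the same position across views act as positive pairs.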