SAFT: Structure-aware Transformers for Textual Interaction Classification
Hongtao Wang, Renchi Yang, Hewen Wang, Haoran Zheng, Jianliang Xu
TL;DR
This work tackles Textual Interaction Classification on bipartite Textual Interaction Networks by marrying pretrained language models with structure-aware transformers. SAFT introduces line graph attention and gated attention units, a proxy token to couple token- and edge-level signals, and fast, SVD-based distance and centrality embeddings to capture local and global topology, complemented by graph sampling for scalability. Empirical results across eight real-world datasets show SAFT consistently outperforming a wide range of baselines, with ablations confirming the contributions of MP, distance embeddings, centrality embeddings, and sampling. The approach offers a practical, scalable framework for TIC that leverages both deep textual semantics and topology-aware features, with broad implications for spam detection, fraud identification, and sentiment analysis in complex networks.
Abstract
Textual interaction networks (TINs) are an omnipresent data structure used to model the interplay between users and items on e-commerce websites, social networks, etc., where each interaction is associated with a text description. Textual interaction classification (TIC), i.e., classifying such interactions, finds extensive use in detecting spam reviews in e-commerce, fraudulent transactions in finance, and so on. Existing TIC solutions (i) fail to capture the rich text semantics due to the use of context-free text embeddings, and/or (ii) disregard the bipartite structure and node heterogeneity of TINs, leading to compromised TIC performance. In this work, we propose SAFT, a new architecture that integrates language- and graph-based modules for the effective fusion of textual and structural semantics in the representation learning of interactions. In particular, line graph attention (LGA)/gated attention units (GAUs) and pretrained language models (PLMs) are leveraged to model the interaction-level and token-level signals, respectively, which are further coupled via a proxy token in an iterative and contextualized fashion. Additionally, an efficient and theoretically grounded approach is developed to encode the local and global topology information pertaining to interactions into structural embeddings. The resulting embeddings not only inject the structural features underlying TINs into the textual interaction encoding but also facilitate the design of graph sampling strategies. Extensive empirical evaluations on multiple real TIN datasets demonstrate the superiority of SAFT over state-of-the-art baselines in TIC accuracy.
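To make the line graph view behind LGA concrete, the following is a minimal sketch, not SAFT's implementation. It treats each interaction (a user-item edge of the bipartite TIN) as a node of the line graph and connects two interactions whenever they share a user or an item, which is the standard line graph definition. The function name `line_graph_neighbors` and the toy interaction list are hypothetical and chosen only for illustration; the attention mechanism itself is not shown.

```python
from collections import defaultdict
from itertools import combinations


def line_graph_neighbors(interactions):
    """Build line-graph adjacency for a bipartite interaction list.

    `interactions` is a list of (user, item) pairs; position i in the list
    identifies interaction i. Two interactions are neighbors if they share
    a user or share an item.
    """
    # Group interaction indices by endpoint. Users and items are tagged
    # separately so that a user id never collides with an equal item id.
    by_endpoint = defaultdict(list)
    for idx, (user, item) in enumerate(interactions):
        by_endpoint[("user", user)].append(idx)
        by_endpoint[("item", item)].append(idx)

    neighbors = {idx: set() for idx in range(len(interactions))}
    # All interactions incident to the same endpoint are pairwise adjacent.
    for group in by_endpoint.values():
        for a, b in combinations(group, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)

    return {idx: sorted(ns) for idx, ns in neighbors.items()}


# Toy TIN (illustrative data only): four interactions among three users and three items.
interactions = [("u1", "i1"), ("u1", "i2"), ("u2", "i2"), ("u3", "i3")]
adj = line_graph_neighbors(interactions)
for idx, ns in adj.items():
    print(idx, interactions[idx], "->", ns)
```

In this toy example, interaction 1 neighbors interaction 0 (shared user `u1`) and interaction 2 (shared item `i2`), while interaction 3 is isolated. An attention layer over such a neighborhood would aggregate text-derived representations of adjacent interactions; how SAFT weights and gates those neighbors is specified in the paper, not here.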
