Joint Learning of Local and Global Features for Aspect-based Sentiment Classification
Hao Niu, Yun Xiong, Xiaosu Wang, Philip S. Yu
TL;DR
This work tackles aspect-based sentiment classification (ASC) by jointly modeling local and global information. It introduces a Gaussian mask layer and a covariance self-attention layer to adaptively capture the local context around the given aspect term, and uses AWIG together with a dual-level graph attention network (DGAT) to exploit global, long-distance information derived from dependency relations. Experiments on the SemEval 2014 and Twitter datasets show state-of-the-art performance, and ablation studies confirm the contribution of each component. The approach highlights the value of incorporating dependency relation tags and edge semantics for ASC on real-world text.
Abstract
Aspect-based sentiment classification (ASC) aims to judge the sentiment polarity expressed toward a given aspect term in a sentence. The sentiment polarity is determined not only by the local context but also by words far away from the aspect term. Most recent attention-based models cannot, in some cases, sufficiently distinguish which words they should pay more attention to. Meanwhile, graph-based models are being applied to ASC to encode syntactic dependency tree information. However, these models do not fully leverage syntactic dependency trees, because they fail to incorporate dependency relation tag information into representation learning effectively. In this paper, we address these problems by effectively modeling both local and global features. First, we design a local encoder containing a Gaussian mask layer and a covariance self-attention layer. The Gaussian mask layer adaptively adjusts the receptive field around aspect terms to de-emphasize unrelated words and pay more attention to local information. The covariance self-attention layer separates the attention weights of different words more sharply. Second, we propose a dual-level graph attention network as a global encoder that fully exploits dependency tag information to capture long-distance information effectively. Our model achieves state-of-the-art performance on both the SemEval 2014 and Twitter datasets.
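The idea behind the Gaussian mask layer can be illustrated with a minimal sketch. The abstract gives no formula, so everything here is an assumption made for illustration: the function names `gaussian_mask` and `apply_mask`, the use of token distance to the aspect span, and the fixed width `sigma` (in the paper the receptive field is adjusted adaptively, which this sketch does not model).

```python
import math


def gaussian_mask(seq_len, aspect_start, aspect_end, sigma):
    """Return one weight per token (hypothetical helper, not the paper's code).

    Tokens inside the aspect span [aspect_start, aspect_end] get weight 1.0;
    tokens outside it decay with a Gaussian of their distance to the span.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    weights = []
    for i in range(seq_len):
        if i < aspect_start:
            dist = aspect_start - i
        elif i > aspect_end:
            dist = i - aspect_end
        else:
            dist = 0
        weights.append(math.exp(-(dist ** 2) / (2.0 * sigma ** 2)))
    return weights


def apply_mask(hidden_states, weights):
    """Scale each token's hidden vector by its mask weight."""
    return [[w * x for x in vec] for vec, w in zip(hidden_states, weights)]


# Example sentence with the aspect term "food" at index 1.
tokens = "the food was great but the service was slow".split()
mask = gaussian_mask(len(tokens), aspect_start=1, aspect_end=1, sigma=1.5)
for token, weight in zip(tokens, mask):
    print(f"{token:>8s}  {weight:.3f}")
```

A smaller `sigma` narrows the receptive field to the immediate neighbors of the aspect term, while a larger one lets distant words such as "slow" keep more of their weight. A learned, per-sentence width would be one way to make this adaptive, but that choice is not specified in the abstract.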
