Linguistically Conditioned Semantic Textual Similarity
Jingxuan Tu, Keer Xu, Liulu Yue, Bingyang Ye, Kyeongmin Rim, James Pustejovsky
TL;DR
This work targets semantic textual similarity under a given condition (C-STS) and identifies substantial annotation errors and ill-defined conditions in the existing dataset. The authors reannotate the validation set, find annotator disagreement on 55% of the instances, and develop a QA-based pipeline that generates condition-focused answers to support error detection and model training. They show that the QA-generated answers correlate more strongly with the reannotations (Spearman up to 55.44) than with the original labels, and that training on these answers yields significant performance gains over baselines. Finally, they propose a conditioning scheme based on typed feature structures (TFS) to ground conditions linguistically, offering a scalable approach to constructing more precise C-STS data.
Abstract
Semantic textual similarity (STS) is a fundamental NLP task that measures the semantic similarity between a pair of sentences. To reduce the ambiguity inherent in the sentences, a recent task, Conditional STS (C-STS), has been proposed to measure sentence similarity conditioned on a specific aspect. Despite the popularity of C-STS, we find that the current C-STS dataset suffers from various issues that could impede proper evaluation of this task. In this paper, we reannotate the C-STS validation set and observe annotator discrepancy on 55% of the instances, resulting from annotation errors in the original labels, ill-defined conditions, and a lack of clarity in the task definition. After a thorough dataset analysis, we improve the C-STS task by leveraging the models' capability to understand the conditions under a QA task setting. With the generated answers, we present an automatic error identification pipeline that identifies annotation errors in the C-STS data with over 80% F1 score. We also propose a new method that substantially improves performance over baselines on the C-STS data by training the models with the answers. Finally, we discuss conditionality annotation based on the typed feature structure (TFS) of entity types. We show through examples that TFS can provide a linguistic foundation for constructing C-STS data with new conditions.
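To make the error identification idea concrete, the following is a minimal sketch and not the pipeline described in the paper. It assumes that each instance already has a similarity score computed from its two QA-generated answers, placed on the same scale as the labels (assumed here to be 1 to 5), and it flags an instance as a candidate annotation error when that score is far from the original label. The function names, the threshold, and the toy data are all hypothetical; only the F1 definition is standard.

```python
def flag_candidate_errors(answer_sims, labels, threshold=2.0):
    """Flag an instance when the answer-based score and the original label
    differ by more than `threshold` (a hypothetical decision rule)."""
    return [abs(s - y) > threshold for s, y in zip(answer_sims, labels)]


def f1_score(predicted, gold):
    """F1 over boolean flags: harmonic mean of precision and recall."""
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only.
answer_sims = [4.5, 1.0, 3.0, 5.0, 2.0]          # similarity of the two generated answers
original_labels = [1, 1, 3, 2, 5]                # labels from the original annotation
gold_errors = [True, False, False, True, False]  # e.g. instances where reannotation disagreed

flags = flag_candidate_errors(answer_sims, original_labels)
score = f1_score(flags, gold_errors)
print(flags, round(score, 2))
```

In this sketch the gold error flags stand in for the instances where the reannotation disagrees with the original label; how the paper derives its gold set and its decision rule is not specified in the abstract.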
