Semantic Role Labeling Guided Out-of-distribution Detection

Jinan Zou, Maihao Guo, Yu Tian, Yuhao Lin, Haiyao Cao, Lingqiao Liu, Ehsan Abbasnejad, Javen Qinfeng Shi

TL;DR

Semantic Role Labeling Guided Out-of-distribution Detection (SRLOOD) is a new unsupervised OOD detection method. It separates, extracts, and learns semantic role labeling (SRL) guided fine-grained local feature representations from the different arguments of a sentence, together with global feature representations of the full sentence, using a margin-based contrastive loss.

Abstract

Identifying unexpected domain-shifted instances in natural language processing is crucial in real-world applications. Previous works identify out-of-distribution (OOD) instances by using a single global feature embedding to represent the sentence, which cannot characterize subtle OOD patterns well. Another major challenge for current OOD methods is learning effective low-dimensional sentence representations that can identify hard OOD instances, i.e., those that are semantically similar to the in-distribution (ID) data. In this paper, we propose a new unsupervised OOD detection method, Semantic Role Labeling Guided Out-of-distribution Detection (SRLOOD). It separates, extracts, and learns semantic role labeling (SRL) guided fine-grained local feature representations from the different arguments of a sentence, together with global feature representations of the full sentence, using a margin-based contrastive loss. A novel self-supervised approach is also introduced to enhance this global-local feature learning by predicting the SRL-extracted role. The resulting model achieves state-of-the-art (SOTA) performance on four OOD benchmarks, indicating the effectiveness of our approach. The code is publicly accessible at https://github.com/cytai/SRLOOD.

Paper Structure (15 sections, 7 equations, 3 figures, 6 tables, 1 algorithm)

Figures (3)

  • Figure 1: Model architecture of our framework. The Transformers, comprising the pre-trained language model and a subsequent encoder, are guided by SRL: they extract global and local representations of the input sequence according to the semantic roles A0, V, and A1, and mask tokens according to the same roles to construct the Self-Supervised Module. An additional 3-way classifier takes the Transformer representation of the A0, V, or A1 masks as input and predicts their semantic roles.
  • Figure 2: Qualitative examples using IMDB as the In-Distribution (ID) dataset, and TREC-10 and WMT16 as the Out-of-Distribution (OOD) datasets.
  • Figure 3: Performance on different masking probabilities on four benchmark datasets.
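
The role masking described in the Figure 1 caption, and the masking probability varied in Figure 3, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that SRL spans are already available as half-open token ranges from an external tagger, that one role is masked per sentence, and that `[MASK]` is the mask token. The names `mask_role`, `make_example`, and `mask_prob` are hypothetical and chosen only for illustration.

```python
import random

ROLES = ("A0", "V", "A1")  # role inventory named in the Figure 1 caption
MASK = "[MASK]"            # assumed mask token; the real one depends on the tokenizer


def mask_role(tokens, spans, role):
    """Replace every token inside the span of `role` with MASK.

    `spans` maps a role name to a half-open (start, end) token range,
    assumed to come from an external SRL tagger.
    """
    start, end = spans[role]
    return [MASK if start <= i < end else tok for i, tok in enumerate(tokens)]


def make_example(tokens, spans, mask_prob, rng):
    """With probability `mask_prob`, mask one role that is present and return
    (masked_tokens, role_index); otherwise return (tokens, None)."""
    present = [r for r in ROLES if r in spans]
    if not present or rng.random() >= mask_prob:
        return list(tokens), None
    role = rng.choice(present)
    # The role index is the target a 3-way classifier would predict.
    return mask_role(tokens, spans, role), ROLES.index(role)


tokens = ["The", "critic", "praised", "the", "film"]
spans = {"A0": (0, 2), "V": (2, 3), "A1": (3, 5)}
masked_v = mask_role(tokens, spans, "V")
```

In this sketch the returned role index serves as the self-supervised label, and `mask_prob` plays the role of the masking probability examined in Figure 3. How the paper samples roles and handles sentences with multiple predicates is not stated in this section.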