Detection of tortured phrases in scientific literature

Eléna Martel, Martin Lentschat, Cyril Labbé

TL;DR

The paper tackles the detection of tortured phrases: altered versions of established scientific expressions introduced by paraphrasing tools and text spinners. It compares embedding-based similarity and distance methods using GloVe with a masked-token prediction approach using SciBERT, and evaluates classification at both the token level and the noun-chunk level. The best-performing approach propagates token-level predictions to noun chunks and reaches a recall of about 0.87 at a precision of about 0.61, which makes it suitable as a screening step whose output is then validated by experts. The work provides a dataset and a methodology for flagging undocumented tortured phrases, supporting literature screening and, potentially, the retraction of problematic papers.

Abstract

This paper presents several automatic detection methods for extracting so-called tortured phrases from scientific papers. These tortured phrases, e.g. "flag to clamor" instead of "signal to noise", are produced by paraphrasing tools used to evade plagiarism detection. We built a dataset and evaluated several strategies for flagging previously undocumented tortured phrases. The proposed and tested methods are based on language models and rely either on embedding similarities or on masked-token predictions. We found that an approach that uses token prediction and propagates the scores to the chunk level gives the best results. With a recall of .87 and a precision of .61, it could retrieve new tortured phrases to be submitted to domain experts for validation.
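
The abstract does not spell out how token scores are propagated to chunks, so the following is only a minimal sketch under stated assumptions. It assumes each token already has a probability assigned by a masked language model to the original token (the values below are invented for illustration), that a token is suspicious when this probability falls below a hypothetical threshold, and that a chunk is flagged as soon as one of its tokens is. The function names, the threshold, and the "any token" rule are illustrative choices, not the authors' implementation.

```python
from typing import List, Tuple


def flag_tokens(token_probs: List[float], threshold: float = 0.01) -> List[bool]:
    """Flag a token when the probability of the original token is below the threshold.

    Both the probabilities and the threshold are assumptions for this sketch.
    """
    return [p < threshold for p in token_probs]


def propagate_to_chunks(
    token_flags: List[bool], chunks: List[Tuple[int, int]]
) -> List[bool]:
    """Flag a chunk (start, end) with end exclusive if any token inside it is flagged."""
    return [any(token_flags[start:end]) for start, end in chunks]


def precision_recall(predicted: List[bool], gold: List[bool]) -> Tuple[float, float]:
    """Compute precision and recall of predicted flags against gold labels."""
    tp = sum(1 for p, g in zip(predicted, gold) if p and g)
    fp = sum(1 for p, g in zip(predicted, gold) if p and not g)
    fn = sum(1 for p, g in zip(predicted, gold) if not p and g)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    tokens = ["the", "flag", "to", "clamor", "proportion", "is", "low"]
    # Invented probabilities standing in for masked-token predictions.
    probs = [0.60, 0.002, 0.40, 0.001, 0.03, 0.70, 0.20]
    # Hypothetical chunk spans over the token indices above.
    chunks = [(0, 5), (6, 7)]

    token_flags = flag_tokens(probs)
    chunk_flags = propagate_to_chunks(token_flags, chunks)
    for (start, end), flagged in zip(chunks, chunk_flags):
        print(" ".join(tokens[start:end]), "->", "flagged" if flagged else "ok")
```

Flagging a chunk on a single suspicious token favors recall over precision, which fits a screening setting where experts review the flagged phrases afterwards.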

Paper Structure (7 sections, 1 figure, 4 tables)

This paper contains 7 sections, 1 figure, 4 tables.

Figures (1)

  • Figure 1: Cosine similarity using minimum aggregation
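
The caption of Figure 1 names cosine similarity with minimum aggregation but gives no further detail here. As a rough illustration only, the sketch below assumes that the tokens of a phrase are represented by embedding vectors, that cosine similarity is computed for every pair of tokens, and that the phrase score is the minimum of those values. The pairwise scheme, the function names, and the toy vectors are assumptions made for this example; the paper may aggregate differently.

```python
import math
from itertools import combinations
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def min_pairwise_similarity(vectors: List[Sequence[float]]) -> float:
    """Minimum cosine similarity over all pairs of token vectors in a phrase."""
    if len(vectors) < 2:
        raise ValueError("at least two vectors are needed to form a pair")
    return min(cosine_similarity(u, v) for u, v in combinations(vectors, 2))


if __name__ == "__main__":
    # Toy 2-dimensional vectors standing in for real word embeddings.
    coherent_phrase = [[1.0, 0.1], [0.9, 0.2]]
    odd_phrase = [[1.0, 0.1], [0.0, 1.0]]
    print(min_pairwise_similarity(coherent_phrase))
    print(min_pairwise_similarity(odd_phrase))
```

Under this reading, a low minimum indicates that at least one pair of words in the phrase is semantically distant, which is the kind of signal an embedding-based detector could threshold.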