On the Rigour of Scientific Writing: Criteria, Analysis, and Insights

Joseph James, Chenghao Xiao, Yucheng Li, Chenghua Lin

TL;DR

This paper introduces a bottom-up, data-driven framework that automatically identifies and defines rigour criteria and assesses their relevance in scientific writing. Its analysis reveals that framing certainty is crucial for enhancing the perception of scientific rigour, while suggestion certainty and probability uncertainty diminish it.

Abstract

Rigour is crucial for scientific research as it ensures the reproducibility and validity of results and findings. Despite its importance, little work exists on modelling rigour computationally, and there is a lack of analysis on whether rigour criteria can effectively signal or measure the rigour of scientific papers in practice. In this paper, we introduce a bottom-up, data-driven framework to automatically identify and define rigour criteria and assess their relevance in scientific writing. Our framework includes rigour keyword extraction, detailed rigour definition generation, and salient criteria identification. Furthermore, our framework is domain-agnostic and can be tailored to the evaluation of scientific rigour for different areas, accommodating the distinct salient criteria across fields. We conducted comprehensive experiments based on datasets collected from two high-impact venues for Machine Learning and NLP (i.e., ICLR and ACL) to demonstrate the effectiveness of our framework in modelling rigour. In addition, we analyse linguistic patterns of rigour, revealing that framing certainty is crucial for enhancing the perception of scientific rigour, while suggestion certainty and probability uncertainty diminish it.

Paper Structure (22 sections, 1 equation, 6 figures, 12 tables)

Figures (6)

  • Figure 1: Illustration of the rigour criteria extraction and assessment framework.
  • Figure 2: Top 30 salient keywords for 4* predicted papers from ICLR and ACL using Mutual Information with candidate rigour keywords highlighted in bold. List of keywords in Table \ref{tab:features}.
  • Figure 3: Distribution of similarity score for best rigour criteria set for each of REF, ICLR and ACL datasets.
  • Figure 4: Uncertainty scores for rigour criteria present in both ICLR and ACL. Positive values indicate 4* preference while negative values indicate non-4* preference.
  • Figure 5: Top 30 salient keywords for non-4* predicted papers from ICLR and ACL using Mutual Information.
  • ...and 1 more figures
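
The captions for Figures 2 and 5 describe ranking keywords by Mutual Information against the predicted 4* label. As a minimal sketch of that kind of ranking, and not the authors' implementation, the Python below scores each word by the mutual information between its presence in a document and a binary label. The function names, the whitespace tokenisation, and the toy documents are all assumptions made for illustration; the page does not specify how the paper tokenises text or builds its labels.

```python
import math
from collections import Counter


def mutual_information(presence, labels):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(presence)
    joint = Counter(zip(presence, labels))
    px = Counter(presence)
    py = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p(x, y) / (p(x) * p(y)) rewritten in terms of raw counts
        mi += p_xy * math.log2(p_xy * n * n / (px[x] * py[y]))
    return mi


def rank_keywords(docs, labels, top_k=30):
    """Rank words by MI between word presence and the label (hypothetical helper)."""
    # Assumption: lowercase whitespace tokenisation stands in for real preprocessing.
    token_sets = [set(doc.lower().split()) for doc in docs]
    vocab = sorted(set().union(*token_sets))
    scores = {
        word: mutual_information([word in tokens for tokens in token_sets], labels)
        for word in vocab
    }
    # Highest MI first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


if __name__ == "__main__":
    # Toy data, invented for this sketch: 1 = predicted 4*, 0 = predicted non-4*.
    docs = ["we ablate rigorously", "we ablate carefully", "we guess", "we hope"]
    labels = [1, 1, 0, 0]
    for word, score in rank_keywords(docs, labels, top_k=3):
        print(f"{word}\t{score:.3f}")
```

A word that appears in every document (here "we") carries no information about the label and scores zero, while a word that appears only in one class scores highest. Mutual information on its own is symmetric in the label, so it does not say which class a word favours; that has to be read off separately, for example from the per-class counts.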