ATLAS: Improving Lay Summarisation with Attribute-based Control

Zhihao Zhang, Tomas Goldsack, Carolina Scarton, Chenghua Lin

TL;DR

This paper introduces ATLAS, an attribute-based control framework for lay summarisation, addressing the need for audience-specific content and style. ATLAS extends a BART-base model with four controllable attributes (L, R, BG, CWE) encoded as discrete tokens, enabling targeted manipulation of length, readability, background information, and content word entropy. Across combined biomedical datasets (eLife and PLOS), ATLAS outperforms strong baselines on automatic metrics and is preferred in human evaluations, while ablation and controllability analyses confirm the utility and tunability of the attributes. The approach offers a pathway to more flexible and reliable lay summaries for diverse scientific audiences, with potential impact on science communication and public understanding.
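
As a rough illustration of what "encoded as discrete tokens" could look like in practice, the sketch below bins each attribute value and prepends the resulting tokens to the source article. The bin count, the value ranges, the token format (`<L_3>` and so on), and all function names are assumptions made for illustration, not details confirmed by the summary above.

```python
from typing import Dict, Tuple

# Hypothetical value ranges per attribute. In a real setup these would be
# derived from the training data; the numbers here are placeholders.
ATTRIBUTE_RANGES: Dict[str, Tuple[float, float]] = {
    "L": (0.0, 1000.0),  # length (assumed unit: words)
    "R": (0.0, 20.0),    # readability (assumed scale)
    "BG": (0.0, 1.0),    # background information (assumed fraction)
    "CWE": (0.0, 15.0),  # content word entropy (assumed scale)
}
NUM_BINS = 10  # assumed number of discrete levels per attribute


def to_bin(value: float, low: float, high: float, num_bins: int = NUM_BINS) -> int:
    """Map a continuous value to a bin index in [0, num_bins - 1]."""
    if high <= low:
        raise ValueError("high must be greater than low")
    clipped = min(max(value, low), high)
    index = int((clipped - low) / (high - low) * num_bins)
    # The upper bound itself falls into the last bin.
    return min(index, num_bins - 1)


def control_prefix(attributes: Dict[str, float]) -> str:
    """Build a space-separated string of control tokens, one per attribute."""
    tokens = []
    for name, (low, high) in ATTRIBUTE_RANGES.items():
        tokens.append(f"<{name}_{to_bin(attributes[name], low, high)}>")
    return " ".join(tokens)


def build_model_input(article: str, attributes: Dict[str, float]) -> str:
    """Prepend the control tokens to the article text."""
    return f"{control_prefix(attributes)} {article}"


if __name__ == "__main__":
    target = {"L": 350.0, "R": 12.5, "BG": 0.25, "CWE": 8.0}
    print(build_model_input("Article text goes here.", target))
```

Under this scheme, a different target audience would be addressed at inference time by changing the attribute values (and hence the tokens) while leaving the article unchanged. The tokens would also need to be registered in the model's vocabulary before fine-tuning.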

Abstract

Lay summarisation aims to produce summaries of scientific articles that are comprehensible to non-expert audiences. However, previous work assumes a one-size-fits-all approach, where the content and style of the produced summary are entirely dependent on the data used to train the model. In practice, audiences with different levels of expertise will have specific needs, impacting what content should appear in a lay summary and how it should be presented. Aiming to address this, we propose ATLAS, a novel abstractive summarisation approach that can control various properties that contribute to the overall "layness" of the generated summary using targeted control attributes. We evaluate ATLAS on a combination of biomedical lay summarisation datasets, where it outperforms state-of-the-art baselines using mainstream summarisation metrics. Additional analyses provided on the discriminatory power and emergent influence of our selected controllable attributes further attest to the effectiveness of our approach.

Paper Structure

This paper contains 20 sections, 1 equation, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Visualisation of the density distributions of controllable attribute values for each summary type in the combined train split.
  • Figure 2: A case study from the eLife test set comparing summaries generated under highly lay and technical attribute values (with the length attribute being kept constant).
  • Figure 3: A case study from the eLife test set comparing summaries generated under highly lay and technical attribute values (with the length attribute being kept constant).