AdaptEval: Evaluating Large Language Models on Domain Adaptation for Text Summarization

Anum Afzal; Ribin Chalumattu; Florian Matthes; Laura Mascarell

TL;DR

This work evaluates the domain adaptation abilities of a wide range of LLMs on the summarization task across various domains in both fine-tuning and in-context learning settings and presents AdaptEval, the first domain adaptation evaluation suite.

Abstract

Despite advances in abstractive summarization with Large Language Models (LLMs), little research assesses their ability to adapt easily to different domains. We evaluate the domain adaptation abilities of a wide range of LLMs on the summarization task across various domains in both fine-tuning and in-context learning settings. We also present AdaptEval, the first domain adaptation evaluation suite. AdaptEval includes a domain benchmark and a set of metrics to facilitate the analysis of domain adaptation. Our results demonstrate that LLMs exhibit comparable performance in the in-context learning setting, regardless of their parameter scale.

Paper Structure (24 sections, 9 tables)

This paper contains 24 sections, 9 tables.

Introduction
The Domain Adaptation Suite
Domains Benchmark
Science
Medical
Government
Evaluation Metrics
Domain Vocabulary Overlap (DVO)
Domain Token Distribution Shift
Reference-free evaluation with GPT-4
Domain Adaptation Task
Models Selection
Results
Manual Evaluation
Related Work
...and 9 more sections
