Leveraging the Power of LLMs: A Fine-Tuning Approach for High-Quality Aspect-Based Summarization

Ankan Mullick, Sombit Bose, Rounak Saha, Ayan Kumar Bhowmick, Aditya Vempaty, Pawan Goyal, Niloy Ganguly, Prasenjit Dey, Ravi Kokku

TL;DR

This work tackles aspect-based summarization (ABS) by fine-tuning open-source foundation LLMs (Llama2, Mistral, Gemma, Aya) on the OASUM dataset. The models are trained with supervised fine-tuning on prompt–completion pairs, using QLoRA and PEFT for efficiency, and are compared against strong baselines using traditional metrics and GPT-4 critique. Fine-tuned LLMs, especially Llama2-13b-FT, deliver superior ABS quality across domains and dataset variations, though the gains vary with model architecture and dataset characteristics (e.g., Gemma-FT). The study demonstrates that LLM fine-tuning is viable for targeted information extraction and provides a framework for robust, domain-aware ABS, with potential extensions to multilingual and multimodal settings.
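
To make the "prompt–completion pairs" idea concrete, here is a minimal sketch of how one training example for aspect-based summarization might be assembled. The prompt wording, the record field names (`document`, `aspect`, `summary`), and the helper `build_example` are all assumptions made for illustration; they are not taken from the paper or from the OASUM release.

```python
from dataclasses import dataclass

# Hypothetical prompt template: the paper's actual prompt wording is not
# given in this summary, so this is only an illustrative stand-in.
PROMPT_TEMPLATE = (
    "Summarize the following document with respect to the aspect "
    '"{aspect}".\n\n'
    "Document:\n{document}\n\n"
    "Aspect-based summary:"
)


@dataclass
class SftExample:
    """One supervised fine-tuning example: model input and target output."""
    prompt: str
    completion: str


def build_example(document: str, aspect: str, summary: str) -> SftExample:
    """Turn a (document, aspect, reference summary) triple into a pair."""
    prompt = PROMPT_TEMPLATE.format(
        aspect=aspect.strip(), document=document.strip()
    )
    # A leading space keeps prompt + completion readable when concatenated.
    return SftExample(prompt=prompt, completion=" " + summary.strip())


# Hypothetical record; the field names are assumed, not OASUM's real schema.
record = {
    "document": "The town was founded in 1820. Today it hosts a port.",
    "aspect": "History",
    "summary": "The town was founded in 1820.",
}
example = build_example(
    record["document"], record["aspect"], record["summary"]
)
print(example.prompt + example.completion)
```

During supervised fine-tuning the model would typically be trained to produce the completion given the prompt; whether the loss is also computed over prompt tokens is a design choice this sketch does not settle.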

Abstract

The ever-increasing volume of digital information necessitates efficient methods for users to extract key insights from lengthy documents. Aspect-based summarization offers a targeted approach, generating summaries focused on specific aspects within a document. Despite advancements in aspect-based summarization research, there is a continuous quest for improved model performance. Given that large language models (LLMs) have demonstrated the potential to revolutionize diverse tasks within natural language processing, particularly summarization, this paper explores the potential of fine-tuning LLMs for the aspect-based summarization task. We evaluate the impact of fine-tuning open-source foundation LLMs, including Llama2, Mistral, Gemma and Aya, on a publicly available domain-specific aspect-based summary dataset. We hypothesize that this approach will enable these models to effectively identify and extract aspect-related information, leading to superior quality aspect-based summaries compared to the state-of-the-art. We establish a comprehensive evaluation framework to compare the performance of fine-tuned LLMs against competing aspect-based summarization methods and vanilla counterparts of the fine-tuned LLMs. Our work contributes to the field of aspect-based summarization by demonstrating the efficacy of fine-tuning LLMs for generating high-quality aspect-based summaries. Furthermore, it opens doors for further exploration of using LLMs for targeted information extraction tasks across various NLP domains.

Paper Structure

This paper contains 25 sections, 4 figures, and 6 tables.

Figures (4)

  • Figure 1: Comparison between vanilla and fine-tuned versions of different LLMs on ROUGE-1, BERTScore F1, Relevance, and Coverage
  • Figure 2: Performance of the Llama2-13b-FT model on GPT-4 criteria with respect to training data variation (left) and max-new-token size (right)
  • Figure 3: Example snapshot of an OASUM summary
  • Figure 4: Example snapshot comparing the original summary with the fine-tuned Llama2-13b output