On The Persona-based Summarization of Domain-Specific Documents

Ankan Mullick, Sombit Bose, Rounak Saha, Ayan Kumar Bhowmick, Pawan Goyal, Niloy Ganguly, Prasenjit Dey, Ravi Kokku

TL;DR

The paper tackles persona-based summarization of domain-specific documents, focusing on healthcare, where different personas have different information needs. It introduces a pipeline that fine-tunes small foundation LLMs on a healthcare corpus and uses AI-based critiquing (including GPT-4) to evaluate summary quality in a way that aligns with human judgments. The authors create a large persona-generated dataset (Persona-Data) from 1,455 WebMD articles and show that a Llama2-13b model fine-tuned with QLoRA (L-F-13b) outperforms baselines on both traditional metrics and GPT-4 critique, with strong cross-criteria concordance. The approach holds promise for scalable persona-aware summarization in other domains such as legal, enterprise, and education, offering cost-effective, domain-tuned summaries and automated evaluation.

Abstract

In an ever-expanding world of domain-specific knowledge, the increasing complexity of consuming and storing information necessitates the generation of summaries from large information repositories. However, each persona in a domain has different information requirements and hence needs a different summary. For example, in the healthcare domain, a persona-based approach (for personas such as Doctor, Nurse, and Patient) is imperative to deliver targeted medical information efficiently. Persona-based summarization of domain-specific information by humans is a high-cognitive-load task and is generally not preferred. Summaries written by different humans vary widely, and human summarization does not scale in cost or subject-matter expertise as domains and personas grow. Further, AI-generated summaries from generic Large Language Models (LLMs) may not offer satisfactory accuracy for different domains unless the models have been specifically trained on domain-specific data, and such models can also be very expensive to use in day-to-day operations. Our contribution in this paper is two-fold: 1) We present an approach to efficiently fine-tune a domain-specific small foundation LLM using a healthcare corpus, and show that we can effectively evaluate summarization quality using AI-based critiquing. 2) We further show that AI-based critiquing has good concordance with human-based critiquing of the summaries. Hence, such AI-based pipelines for generating domain-specific persona-based summaries can be scaled to other domains, such as legal, enterprise documents, and education, in an efficient and cost-effective manner.
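
To make the notion of concordance between AI-based and human-based critiquing concrete, the sketch below computes a simple pairwise agreement rate: the fraction of summary pairs that both critics rank in the same order. This is only an illustration under stated assumptions. The function name `pairwise_concordance` and the example scores are hypothetical, and the abstract does not specify which concordance measure the authors use.

```python
from itertools import combinations


def pairwise_concordance(ai_scores, human_scores):
    """Fraction of summary pairs ordered the same way by both critics.

    ai_scores[i] and human_scores[i] are the scores that the AI critic and
    the human critic gave to summary i. Pairs where either critic gives a
    tie are skipped. This is an illustrative measure, not necessarily the
    one used in the paper.
    """
    if len(ai_scores) != len(human_scores):
        raise ValueError("both critics must score the same set of summaries")

    agree = 0
    total = 0
    for i, j in combinations(range(len(ai_scores)), 2):
        ai_diff = ai_scores[i] - ai_scores[j]
        human_diff = human_scores[i] - human_scores[j]
        if ai_diff == 0 or human_diff == 0:
            continue  # a tie carries no ordering information
        total += 1
        if (ai_diff > 0) == (human_diff > 0):
            agree += 1

    # With no comparable pairs the rate is undefined.
    return agree / total if total else float("nan")


# Hypothetical scores for four summaries (made up for illustration only).
ai = [4, 3, 5, 2]
human = [5, 3, 4, 1]
rate = pairwise_concordance(ai, human)
```

A value near 1 would mean the two critics almost always prefer the same summary of a pair, while a value near 0.5 would mean their preferences are close to unrelated.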

Paper Structure (19 sections, 6 figures, 6 tables)

This paper contains 19 sections, 6 figures, 6 tables.

Figures (6)

  • Figure 1: Different Training Data Sizes
  • Figure 2: Variations of Llama2-13b-Finetune (in %)
  • Figure 3: GPT-4 generated summary better than Llama2-13b model generated summary [persona: doctor]
  • Figure 4: Llama2-13b model generated summary better than GPT-4 generated summary [persona: doctor]
  • Figure 5: Persona identify experiment example snapshot
  • ...and 1 more figures