Noisy Pairing and Partial Supervision for Stylized Opinion Summarization

Hayate Iso, Xiaolan Wang, Yoshi Suhara

TL;DR

This work defines stylized opinion summarization and introduces NAPA, a non-parallel training framework that combines Noisy Pairing and Partial Supervision to generate professionally styled summaries from customer reviews. It constructs the ProSum benchmark by pairing Yelp customer reviews with Michelin professional reviews, and demonstrates that NAPA substantially outperforms self-supervised and non-parallel baselines on ProSum and FewSum while closely approaching supervised upper bounds. The approach creates noisy cross-entity input-output pairs and restricts learning to aligned subsequences found by token alignment, with self-supervised pre-training providing the foundational summarization capability. The results suggest NAPA enables practical stylized summarization in settings where parallel review-summary data are scarce, though hallucinations and alignment errors remain open problems.

Abstract

Opinion summarization research has primarily focused on generating summaries reflecting important opinions from customer reviews without paying much attention to the writing style. In this paper, we propose the stylized opinion summarization task, which aims to generate a summary of customer reviews in the desired (e.g., professional) writing style. To tackle the difficulty in collecting customer and professional review pairs, we develop a non-parallel training framework, Noisy Pairing and Partial Supervision (NAPA), which trains a stylized opinion summarization system from non-parallel customer and professional review sets. We create a benchmark ProSum by collecting customer and professional reviews from Yelp and Michelin. Experimental results on ProSum and FewSum demonstrate that our non-parallel training framework consistently improves both automatic and human evaluations, successfully building a stylized opinion summarization model that can generate professionally-written summaries from customer reviews. The code is available at https://github.com/megagonlabs/napa
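
The two ideas named above, noisy pairing and partial supervision, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the released code: the function names are hypothetical, tokenization is lowercase whitespace splitting, pairing is done by simple token-overlap recall, and "alignment" is approximated by exact token match. The sketch pairs a professional summary with the best-covering customer reviews from other entities, then averages a per-token loss only over summary tokens that also occur in those reviews.

```python
from collections import Counter


def overlap_recall(summary_tokens, review_tokens):
    """Fraction of summary tokens (with multiplicity) also found in the review."""
    if not summary_tokens:
        return 0.0
    s, r = Counter(summary_tokens), Counter(review_tokens)
    return sum(min(c, r[t]) for t, c in s.items()) / len(summary_tokens)


def noisy_pair(summary, candidate_reviews, k=2):
    """Pick the k candidate reviews (assumed to come from other entities)
    that best cover the summary. Hypothetical retrieval rule."""
    s_tok = summary.lower().split()
    ranked = sorted(
        candidate_reviews,
        key=lambda rev: overlap_recall(s_tok, rev.lower().split()),
        reverse=True,
    )
    return ranked[:k]


def alignment_mask(summary, reviews):
    """1 for summary tokens that appear in any paired review, else 0.
    Exact match stands in for a real token-alignment method."""
    vocab = {tok for rev in reviews for tok in rev.lower().split()}
    return [1 if tok in vocab else 0 for tok in summary.lower().split()]


def masked_nll(token_nlls, mask):
    """Average per-token loss over aligned positions only (partial supervision)."""
    n_aligned = sum(mask)
    if n_aligned == 0:
        return 0.0
    return sum(l * m for l, m in zip(token_nlls, mask)) / n_aligned


summary = "refined french cuisine with attentive service"
reviews = [
    "the service was attentive and friendly",
    "great french food",
    "parking was hard",
]
paired = noisy_pair(summary, reviews, k=2)
mask = alignment_mask(summary, paired)
loss = masked_nll([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], mask)
```

In a real system the per-token losses would come from a sequence-to-sequence model, and the mask would be applied to its token-level cross-entropy; the unaligned tokens simply contribute no gradient.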

Paper Structure

This paper contains 32 sections, 4 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Comparison of conventional and stylized opinion summarization. Given multiple reviews as input, stylized opinion summarization aims to generate a summary in the desired writing style.
  • Figure 2: Overview of our non-parallel training framework, Noisy Pairing and Partial Supervision.
  • Figure 3: Human evaluations of the fluency, relevance, and attractiveness on ProSum.
  • Figure 4: ROUGE-1 F1 score on the validation set of ProSum at different training stages. The orange line denotes the model trained with partial supervision, and the green line denotes the model trained without partial supervision.
  • Figure 5: Comparison of summarization quality with and without pre-training. The blue line denotes the model trained in a supervised setting, the orange line denotes the model trained with partial supervision, and the green line denotes the model trained without partial supervision.
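
For readers unfamiliar with the ROUGE-1 F1 metric mentioned in Figure 4, the following is a minimal sketch of how it is commonly computed: the harmonic mean of unigram precision and recall, with overlap counts clipped per token. It assumes lowercase whitespace tokenization and omits the stemming and other preprocessing that standard ROUGE toolkits apply, so its scores will not exactly match those reported in the paper. The function name is hypothetical.

```python
from collections import Counter


def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 between a candidate summary and a reference."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per token (clipped overlap).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1("the food was great", "the food was truly great")
```

Here all four candidate tokens appear in the five-token reference, so precision is 1.0, recall is 0.8, and the F1 score is 8/9.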