Adaptive Planning for Multi-Attribute Controllable Summarization with Monte Carlo Tree Search

Sangwon Ryu, Heejin Do, Yunsu Kim, Gary Geunbae Lee, Jungseul Ok

TL;DR

This work tackles multi-attribute controllable summarization by reframing it as adaptive planning with Monte Carlo Tree Search (MCTS). In PACO, each node is a summary and each action adjusts a single attribute; the method distinguishes deterministic targets from non-deterministic alignments and guides the search with PUCT-based selection plus a local reward/feasibility heuristic. Empirically, PACO achieves robust controllability on MACSum_Dial, MACSum_Doc, and DialogSum: with a 1B model it rivals larger baselines, and with a 70B model it delivers state-of-the-art control while preserving summary quality. All planning happens at test time, with no attribute-specific training. The approach offers practical, training-free flexibility across diverse domains at the cost of higher compute; future work could improve search efficiency and extend to broader quality dimensions and attribute types.

Abstract

Controllable summarization moves beyond generic outputs toward human-aligned summaries guided by specified attributes. In practice, the interdependence among attributes makes it challenging for language models to satisfy correlated constraints consistently. Moreover, previous approaches often require per-attribute fine-tuning, limiting flexibility across diverse summary attributes. In this paper, we propose adaptive planning for multi-attribute controllable summarization (PACO), a training-free framework that reframes the task as planning the order of sequential attribute control with a customized Monte Carlo Tree Search (MCTS). In PACO, nodes represent summaries, and actions correspond to single-attribute adjustments, enabling progressive refinement of only the attributes requiring further control. This strategy adaptively discovers optimal control orders, ultimately producing summaries that effectively meet all constraints. Extensive experiments across diverse domains and models demonstrate that PACO achieves robust multi-attribute controllability, surpassing both LLM-based self-planning models and fine-tuned baselines. Remarkably, PACO with Llama-3.2-1B rivals the controllability of the much larger Llama-3.3-70B baselines. With larger models, PACO achieves superior control performance, outperforming all competitors.
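The planning loop described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy setting in which a "summary" is a dictionary of measured attribute values, the hypothetical `adjust` function stands in for an LLM rewrite that fixes one attribute, and the reward is simply the fraction of satisfied attributes. The selection rule is the standard PUCT formula; PACO's customized reward and feasibility heuristic are not reproduced here. All names (`Node`, `puct_score`, `adjust`, `search`) are illustrative.

```python
import math
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    summary: dict                     # toy stand-in: attribute name -> measured value
    action: Optional[str] = None      # attribute adjusted to reach this node
    parent: Optional["Node"] = None
    prior: float = 1.0
    visits: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)  # action -> child Node

    def q(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(parent: Node, child: Node, c_puct: float = 1.0) -> float:
    # Standard PUCT: mean value plus a prior-weighted exploration bonus.
    explore = c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
    return child.q() + explore


def reward(summary: dict, targets: dict) -> float:
    # Assumed reward: fraction of attributes that match their target.
    return sum(summary[a] == t for a, t in targets.items()) / len(targets)


def adjust(summary: dict, attr: str, targets: dict) -> dict:
    # Hypothetical stand-in for an LLM call that rewrites one attribute.
    new = dict(summary)
    new[attr] = targets[attr]
    if attr == "length":              # toy interdependence: shortening disturbs the topic
        new["topic"] = "drifted"
    return new


def search(root_summary: dict, targets: dict, n_sim: int = 30, max_depth: int = 4):
    root = Node(root_summary)
    best, best_r = root, reward(root_summary, targets)
    for _ in range(n_sim):
        node, depth = root, 0
        # Selection: follow the highest PUCT score down to a leaf.
        while node.children and depth < max_depth:
            parent = node
            node = max(parent.children.values(), key=lambda ch: puct_score(parent, ch))
            depth += 1
        # Expansion: one action per attribute that still misses its target.
        unmet = [a for a, t in targets.items() if node.summary[a] != t]
        if node.visits > 0 and unmet and depth < max_depth:
            for a in unmet:
                node.children[a] = Node(adjust(node.summary, a, targets), a, node, 1 / len(unmet))
            node = node.children[unmet[0]]
        # Evaluation, and tracking of the best node seen anywhere in the tree.
        r = reward(node.summary, targets)
        if r > best_r:
            best, best_r = node, r
        # Backpropagation along the path to the root.
        while node is not None:
            node.visits += 1
            node.value_sum += r
            node = node.parent
    # Read the control order off the path from the root to the best node.
    plan = []
    while best.parent is not None:
        plan.append(best.action)
        best = best.parent
    return plan[::-1], best_r
```

In this toy setup, calling `search({"length": "long", "topic": "budget", "speaker": "none"}, {"length": "short", "topic": "budget", "speaker": "Alice"})` returns a plan that adjusts `topic` after `length`, because the assumed `adjust` function makes a length edit disturb the topic. Attributes that already match their target are never offered as actions, which mirrors the idea of refining only the attributes that still need control.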

Paper Structure

This paper contains 45 sections, 4 equations, 6 figures, 6 tables, 1 algorithm.

Figures (6)

  • Figure 1: Summaries consist of multiple attributes. Our goal is to generate outputs that satisfy diverse user-specified constraints simultaneously.
  • Figure 2: Illustration of the MCTS process in PACO. The tree search begins from a summary generated with a prompt that requests control over all attributes, serving as the root node. After all simulations are completed, the node with the highest degree is selected from the entire tree during the decision stage.
  • Figure 3: An example of PACO adjusting a summary through its planning process. The initial summary shows that LLMs struggle with multiple attribute constraints in a single pass. To address this, PACO successfully refines the summary to meet target attributes. Green markers indicate shifts to speaker-focused content; blue markers highlight removal of unnecessary details to reach the target length. Values beside the reference summary indicate the target attributes, while values beside generated summaries show their measured attribute scores.
  • Figure 4: (a) LLMs often over-control and lack diversity in their self-generated plans, whereas (b) PACO controls only the necessary attributes for each instance. We visualize the top 10 plans for each method.
  • Figure 5: Each bar shows the frequency of attribute control per model size, with repeated attributes in a plan counted once. Percentages indicate relative proportions. Initial refers to the state where all attributes were controlled simultaneously at the beginning.
  • ...and 1 more figure