Introducing Spotlight: A Novel Approach for Generating Captivating Key Information from Documents
Ankan Mullick, Sombit Bose, Rounak Saha, Ayan Kumar Bhowmick, Aditya Vempaty, Prasenjit Dey, Ravi Kokku, Pawan Goyal, Niloy Ganguly
TL;DR
Spotlight redefines information condensation by producing self-contained, engaging mini-narratives that highlight the most intriguing aspects of a document while preserving fidelity to the source. A two-stage pipeline, consisting of supervised fine-tuning on ground-truth spotlights followed by Direct Preference Optimization, drives high-quality spotlight generation and outperforms traditional baselines and prompting-based approaches across four diverse datasets. The approach yields spotlights with improved readability, a more focused information distribution, and stronger reader engagement, while remaining aligned with the source content. This work opens pathways to aspect-based, query-focused, and personalized spotlights; multilingual and multimodal extensions are left to future work.
Abstract
In this paper, we introduce Spotlight, a novel paradigm for information extraction that produces concise, engaging narratives by highlighting the most compelling aspects of a document. Unlike traditional summaries, which prioritize comprehensive coverage, spotlights selectively emphasize intriguing content to foster deeper reader engagement with the source material. We formally differentiate spotlights from related constructs and support our analysis with a detailed benchmarking study using new datasets curated for this work. To generate high-quality spotlights, we propose a two-stage approach: fine-tuning a large language model on our benchmark data, followed by alignment via Direct Preference Optimization (DPO). Our comprehensive evaluation demonstrates that the resulting model not only identifies key elements with precision but also enhances readability and boosts the engagement value of the original document.
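To make the second stage of the pipeline concrete, the following is a minimal sketch of the standard per-example DPO objective, which scores a preferred ("chosen") output against a dispreferred ("rejected") one relative to a frozen reference model. It is illustrative only and is not the authors' training code: the function names, the scalar log-probability inputs, and the value `beta=0.1` are assumptions, and the abstract does not specify how preference pairs are constructed or which hyperparameters are used. In the setting described above, the reference model would plausibly be the supervised fine-tuned model from the first stage.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; not taken from the paper
) -> float:
    """Standard DPO loss for a single preference pair.

    Each argument is the summed token log-probability of a full output
    (here, a candidate spotlight) under the policy or the frozen
    reference model.
    """
    # Implicit rewards: how much the policy has shifted from the reference.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) == softplus(-logits)
    return softplus(-logits)


if __name__ == "__main__":
    # Hypothetical log-probabilities for one (chosen, rejected) pair.
    print(dpo_loss(-8.0, -12.0, -10.0, -10.0))
```

The loss equals log 2 when the policy matches the reference and falls as the policy assigns relatively more probability to the chosen output than the reference does, which is the mechanism by which preference alignment pushes generation toward the preferred spotlights.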
