Persona-SQ: A Personalized Suggested Question Generation Framework For Real-world Documents

Zihao Lin, Zichao Wang, Yuanting Pan, Varun Manjunatha, Ryan Rossi, Angela Lau, Lifu Huang, Tong Sun

TL;DR

Persona-SQ tackles the uniformity of suggested questions by incorporating synthetic reader profiles (professions and reading goals) into the SQ generation process. It introduces a five-step framework that collects documents, synthesizes and quality-controls personas, generates personalized questions, and verifies them. The authors present two use cases: improving LLM-based SQ generation, and training tiny models on a 100k-question synthetic dataset so that they approach larger models in quality. Across automatic and human evaluations in finance, legal, and academia, Persona-SQ yields more diverse, persona-aligned, and higher-quality SQs, with potential for on-device, privacy-preserving deployment.

Abstract

Suggested questions (SQs) provide an effective initial interface for users to engage with their documents in AI-powered reading applications. In practical reading sessions, users have diverse backgrounds and reading goals, yet current SQ features typically ignore such user information, resulting in homogeneous or ineffective questions. We introduce a pipeline that generates personalized SQs by incorporating reader profiles (professions and reading goals) and demonstrate its utility in two ways: 1) as an improved SQ generation pipeline that produces higher-quality and more diverse questions than current baselines, and 2) as a data generator to fine-tune extremely small models that perform competitively with much larger models on SQ generation. Our approach can not only serve as a drop-in replacement in current SQ systems to immediately improve their performance but also help develop on-device SQ models that run locally to deliver a fast and private SQ experience.
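
As a rough illustration of the idea described above, the sketch below conditions a question-generation prompt on a reader profile (profession and reading goal) and then applies a simple verification filter. This is a minimal sketch under stated assumptions, not the authors' implementation: the `Persona` class, the prompt wording, the `verify` heuristics, and the `fake_generate` stand-in for an LLM call are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Persona:
    """Hypothetical reader profile: a profession plus a reading goal."""
    profession: str
    reading_goal: str


def build_prompt(document: str, persona: Persona, num_questions: int = 3) -> str:
    """Build a persona-conditioned prompt (wording is an assumption)."""
    return (
        f"You are helping a reader who works as a {persona.profession} "
        f"and whose goal is to {persona.reading_goal}.\n"
        f"Suggest {num_questions} questions this reader would ask "
        f"about the document below. Write one question per line.\n\n"
        f"Document:\n{document}"
    )


def verify(questions: List[str]) -> List[str]:
    """Toy verification step: keep well-formed, non-duplicate questions."""
    seen = set()
    kept = []
    for question in questions:
        question = question.strip()
        key = question.lower()
        if question.endswith("?") and key not in seen:
            seen.add(key)
            kept.append(question)
    return kept


def personalized_sqs(
    document: str,
    persona: Persona,
    generate: Callable[[str], str],
    num_questions: int = 3,
) -> List[str]:
    """Generate, then verify, suggested questions for one persona."""
    raw_output = generate(build_prompt(document, persona, num_questions))
    return verify(raw_output.splitlines())[:num_questions]


def fake_generate(prompt: str) -> str:
    """Stand-in for a real LLM call; returns canned text for the demo."""
    return (
        "What is the net revenue?\n"
        "what is the net revenue?\n"
        "Not a question\n"
        "Which risks are disclosed?"
    )


if __name__ == "__main__":
    analyst = Persona("financial analyst", "assess quarterly risk")
    for sq in personalized_sqs("(document text)", analyst, fake_generate):
        print(sq)
```

Passing the generator in as a callable keeps the sketch independent of any particular model, so the same function could wrap a large hosted LLM or a small fine-tuned on-device model.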

Paper Structure

This paper contains 47 sections, 3 equations, 26 figures, 15 tables.

Figures (26)

  • Figure 1: An illustration of Persona-SQ, our personalized suggested question generation pipeline.
  • Figure 2: Examples of the persona coverage ratio (legal). The higher scores of SQs generated with persona compared to those generated without persona indicate the personalized SQs are more aligned to the intended personas.
  • Figure 3: Screenshot 1 of the Persona-SQ GPT-4o demo.
  • Figure 4: Screenshot 2 of the Persona-SQ GPT-4o demo.
  • Figure 5: Screenshot of the Persona-SQ fine-tuned demo interface.
  • ...and 21 more figures