Improving GenIR Systems Based on User Feedback

Qingyao Ai; Zhicheng Dou; Min Zhang

TL;DR

This chapter discusses how to improve GenIR systems using user feedback, starting from the observation that the concept of "user" has been extended in the GenIR era. It surveys strategies for injecting feedback into prompts, indexing, fine-tuning, and alignment with user factors. It analyzes alignment objectives and methods, including RLHF, RLAIF, and RLCF, and the optimization techniques PPO, DPO, RRHF, and RAFT, as they apply to information access tasks. It also covers continual learning, learning and ranking in the conversational context, and prompt learning, and identifies open challenges such as understanding user intent, data-efficient use of feedback, and privacy.

Abstract

In this chapter, we discuss how to improve GenIR systems based on user feedback. Before describing the approaches, we note that the concept of "user" has been extended in interactions with GenIR systems. We then describe the different types of feedback information and the strategies for using them. Next, we highlight alignment techniques in terms of their objectives and methods. Following this, we present various ways of learning from user feedback in GenIR, including continual learning, learning and ranking in the conversational context, and prompt learning. This exploration shows that innovative techniques are being proposed beyond traditional methods of utilizing user feedback, and that they contribute significantly to the evolution of GenIR in the new era. We also summarize some challenging topics and future directions that require further investigation.

Paper Structure (16 sections, 3 figures, 1 table)

Introduction
Concept of User in GenIR Era
User Feedback
Strategies for GenIR System Improvement with User Feedback
Alignment with User Factor in GenIR
Alignment Objective
Objectives Shared by General LLM Applications
Objectives Unique to Information Accessing
Alignment Method
Collection of Rewards
Parameter Optimization
Learning from User Feedback in GenIR
Continual Learning
Learning and Ranking in Conversation Context
Prompt Learning
...and 1 more sections

Figures (3)

Figure 1: Different types of indexing as prompt input to GenIR systems proposed by geng2022recommendation.
Figure 2: Illustration of an LLM application to summarizing similar documents, provided by dong2023aligning. The distinctive parts of each document are highlighted in different colors.
Figure 3: An illustration of example reward collection and optimization methods in LLM alignment.
