KAUCUS: Knowledge Augmented User Simulators for Training Language Model Assistants

Kaustubh D. Dhole

TL;DR

The paper tackles the problem of producing diverse, knowledge-rich user-simulated data to train instruction-following LLM assistants. It introduces Kaucus, a three-stage framework that uses retrieval augmentation (SRAG) and summary control (SCTRL) to equip simulators with external text, enabling better downstream assistants. The authors train GPT-J-6B-based simulators from Anthropic/Open Assistant data and evaluate them via intrinsic diversity metrics and reward/preference-based downstream performance, finding that knowledge augmentation consistently improves results. This work offers a practical path to leverage web-scale text for faster, more capable assistant development, while acknowledging limitations such as retriever choice and the need for human evaluation.

Abstract

An effective multi-turn instruction-following assistant can be developed by creating a simulator that can generate useful interaction data. Apart from relying on its intrinsic weights, an ideal user simulator should also be able to bootstrap external knowledge rapidly in its raw form to simulate the multifarious diversity of text available over the internet. Previous user simulators generally lacked diversity, were mostly closed-domain, and required rigid schemas, making it inefficient to scale them rapidly to incorporate external knowledge. In this regard, we introduce Kaucus, a Knowledge-Augmented User Simulator framework, to outline a process for creating diverse user simulators that can seamlessly exploit external knowledge as well as benefit downstream assistant model training. Through two GPT-J-based simulators, viz. a Retrieval Augmented Simulator and a Summary Controlled Simulator, we generate diverse simulator-assistant interactions. Through reward and preference model-based evaluations, we find that these interactions serve as useful training data and create more helpful downstream assistants. We also find that incorporating knowledge through retrieval augmentation or summary control helps create better assistants.
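As a rough illustration of the two knowledge-augmentation ideas named in the abstract, the sketch below builds three training strings for one conversation: a plain one, one prefixed with a retrieved passage, and one prefixed with a summary. Everything in it is an assumption made for illustration: the `Document:` and `Summary:` tags, the `Turn` type, the `format_example` helper, and the toy word-overlap `retrieve` function are hypothetical and are not taken from the paper, whose actual conversation format is the one shown in its Figure 2.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Turn:
    speaker: str  # "Human" or "Assistant"
    text: str


def retrieve(query: str, corpus: List[str]) -> str:
    """Toy retriever (hypothetical): pick the passage sharing the most words with the query."""
    query_words = set(query.lower().split())
    return max(corpus, key=lambda p: len(query_words & set(p.lower().split())))


def format_example(turns: List[Turn], knowledge: Optional[str] = None,
                   tag: str = "Document") -> str:
    """Serialize a conversation, optionally prefixed with external knowledge.

    The tag names are illustrative placeholders, not the paper's format.
    """
    lines = []
    if knowledge is not None:
        lines.append(f"{tag}: {knowledge}")
    lines.extend(f"{t.speaker}: {t.text}" for t in turns)
    return "\n".join(lines)


conversation = [
    Turn("Human", "How do volcanoes erupt?"),
    Turn("Assistant", "Magma rises through the crust until pressure forces it out."),
]
corpus = [
    "Volcanoes erupt when magma and gas escape from a chamber below the surface.",
    "Sourdough bread needs a starter made of flour and water.",
]

# Plain simulator input: the conversation only.
s1_example = format_example(conversation)

# Retrieval-augmented input: a passage retrieved for the first user turn is prepended.
srag_example = format_example(conversation, retrieve(conversation[0].text, corpus))

# Summary-controlled input: a (hand-written, hypothetical) summary is prepended.
sctrl_example = format_example(
    conversation, "A user asks about volcanic eruptions.", tag="Summary"
)
```

In this sketch the only difference between the three variants is the optional first line, which keeps the conversation serialization identical across simulators; whether the paper's simulators share that property is not something this summary states.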

Paper Structure (16 sections, 8 figures, 2 tables)

Figures (8)

  • Figure 1: The complete three-step framework of Kaucus: creating, utilizing, and evaluating a user simulator.
  • Figure 2: The format of the conversations used for training S1 (a vanilla simulator), SRAG (retrieved document shown in green), and SCTRL (summary shown in red).
  • Figure 3: FastChat evaluation of assistants created from Utterance Grounded Simulators (A1 and ARAG) against the baseline assistant (A0).
  • Figure 4: FastChat evaluation of assistants created from Summary Controlled Simulators (-CTRL) against the baseline assistant (A0).
  • Figure 5: SteamSHP reward model evaluation of assistants created from Utterance Grounded Simulators against the baseline assistant (A0).
  • ...and 3 more figures