Online Training of Large Language Models: Learn while chatting

Juhao Liang; Ziwei Wang; Zhuoheng Ma; Jianquan Li; Zhiyi Zhang; Xiangbo Wu; Benyou Wang

TL;DR

This paper addresses the limitations of static offline training and session-limited online learning in LLMs by proposing Online Training using External Interactions, a paradigm that enables persistent, real-time updates via instruction-based, document-driven, and web-search-driven learning. It defines three learning modalities, a moderation framework, and a design philosophy centered on lifelong learning, personalization, accessibility, and user empowerment. A case study on tool learning demonstrates that online training can achieve substantial accuracy gains with relatively little training data and favorable inference efficiency compared to traditional full fine-tuning. The work advances practical, user-centric LLM customization, promising scalable deployment and continual adaptation across domains while highlighting challenges related to knowledge injection, persistence, and deployment complexity.

Abstract

Large Language Models (LLMs) have dramatically revolutionized the field of Natural Language Processing (NLP), offering remarkable capabilities that have garnered widespread usage. However, existing interaction paradigms between LLMs and users are constrained by inflexibility, limited customization, or a lack of persistent learning. This inflexibility is particularly evident because users, especially those without programming skills, have few avenues to enhance or personalize the model. Existing frameworks further complicate model training and deployment because of their computational inefficiency and lack of user-friendly interfaces. To overcome these challenges, this paper introduces a novel interaction paradigm, 'Online Training using External Interactions', that merges the benefits of persistent, real-time model updates with the flexibility of individual customization through external interactions such as AI agents or online/offline knowledge bases.

Paper Structure (29 sections, 4 figures, 2 tables)

This paper contains 29 sections, 4 figures, 2 tables.

Introduction
Related Work and Motivation
Related Work
Offline parameter-variant paradigm
Online parameter-invariant paradigm
Pros and Cons of Existing Paradigms
Motivation
Motivation
User Interface: online training using external interactions
Overall design
Philosophy
The three interactions
Content Moderation Control
Benefits and potential
Lifelong Learning
...and 14 more sections

Figures (4)

Figure 1: Dialogues between the LLM and the user in our interactive mode. The user issues distinct directives, each of which triggers one of three training processes. The figure also shows that the model retains knowledge acquired in earlier sessions when moving to a new conversation session.
Figure 2: The overall workflow of chat-based online training. During the interaction between the user and the model, the user issues learning instructions to trigger the learning process. The three learning methods correspond to three data augmentation techniques, whose generated data is used as input to train a new model. The new model then replaces the old one seamlessly, allowing the user to continue the conversation.
Figure 3: Overview of the experimental design
Figure 4: Experimental results. The symbol * denotes the average over three experiments with random seeds.
