Towards Automatic Continual Learning: A Self-Adaptive Framework for Continual Instruction Tuning

Peiyi Lin, Fukai Zhang, Kai Niu, Hao Fu

TL;DR

This work tackles domain-specific continual instruction tuning by addressing data quality and deployment constraints. It introduces a self-adaptive framework that dynamically filters incrementally acquired data using a small proxy model for perplexity-based scoring, updating the proxy in lockstep with the deployed model to track distribution shifts. The system integrates data generation, multi-criteria data filtering, LoRA-based continual fine-tuning, and automatic checkpoint evaluation in an iterative update loop, enabling seamless, non-disruptive model updates. In real-world medical scenarios, the approach reduces training data and computation by approximately 66.7% while improving performance and enabling autonomous updates. The work demonstrates practical viability for automatic continual instruction tuning in sensitive domains and provides a foundation for further enhancements in data selection and deployment automation.

Abstract

Continual instruction tuning enables large language models (LLMs) to learn incrementally while retaining past knowledge. Existing methods, however, primarily focus on how to retain old knowledge rather than on selecting which new knowledge to learn. In domain-specific contexts, maintaining data quality and managing system constraints remain key challenges. To address these issues, we propose an automated continual instruction tuning framework that dynamically filters incoming data, identifying and reducing redundant data across successive updates. Our approach uses a small proxy model for efficient perplexity-based filtering and updates the proxy so that the filtering criteria remain aligned with the evolving state of the deployed model. Compared to existing static data selection methods, our framework can effectively handle incrementally acquired data and shifting distributions. Additionally, it addresses practical deployment challenges by enabling seamless model updates, supporting version rollback, and incorporating automatic checkpoint evaluation. We evaluated the system in real-world medical scenarios, where it reduced computational costs by 66.7%, improved model performance, and achieved autonomous updates, demonstrating its effectiveness for automatic continual instruction tuning.
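The idea of a proxy whose perplexity scores shift as it is updated can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `UnigramProxy` is a toy add-one-smoothed unigram model standing in for the small proxy LLM, `filter_by_perplexity` and its `threshold` are hypothetical names, and the rule "low perplexity means redundant" is one plausible reading of the abstract rather than the authors' exact criterion.

```python
import math
from collections import Counter


class UnigramProxy:
    """Toy stand-in for the small proxy model (hypothetical, not the paper's model)."""

    def __init__(self):
        self.counts = Counter()
        self.total = 0

    def update(self, texts):
        # "Tuning" the toy proxy just means absorbing token counts.
        for text in texts:
            tokens = text.lower().split()
            self.counts.update(tokens)
            self.total += len(tokens)

    def perplexity(self, text):
        tokens = text.lower().split()
        if not tokens:
            raise ValueError("cannot score an empty text")
        vocab = len(self.counts) + 1  # +1 reserves mass for unseen tokens
        log_prob = 0.0
        for tok in tokens:
            # Add-one smoothing keeps unseen tokens at non-zero probability.
            log_prob += math.log((self.counts[tok] + 1) / (self.total + vocab))
        # Perplexity is the exponential of the mean negative log-likelihood.
        return math.exp(-log_prob / len(tokens))


def filter_by_perplexity(proxy, samples, threshold):
    """Keep samples the proxy finds surprising; treat the rest as redundant (assumed rule)."""
    return [s for s in samples if proxy.perplexity(s) > threshold]


proxy = UnigramProxy()
proxy.update(["aspirin reduces fever", "aspirin reduces pain"])
incoming = ["aspirin reduces fever", "metformin lowers glucose"]
kept = filter_by_perplexity(proxy, incoming, threshold=8.0)
proxy.update(kept)  # after the update, the kept sample scores as less surprising
```

Because the proxy is updated with the data it just admitted, the same sample receives a lower perplexity on the next round, which is what lets a fixed threshold act as a moving criterion.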

Paper Structure

This paper contains 18 sections, 7 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: An illustration of the full framework. The data generation module uses the Chain-of-Thought (CoT) approach to generate synthetic data when the dataset contains only instructions. The data filtering module processes data according to three criteria: length, diversity, and quality. The model tuning module updates the currently deployed model, producing a checkpoint candidate, which is then assessed by the model evaluation module. If the candidate outperforms the current model, both the deployed model and the proxy model used for quality measurement are updated, and the loop repeats.
  • Figure 2: An illustration of the dynamic criteria. After tuning, the proxy model used for calculating perplexity assigns different scores to the same data, which allows information that has become redundant to be identified.
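The loop described in the Figure 1 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `UpdateLoop`, `filter_fn`, `tune_fn`, and `eval_fn` are hypothetical names, the proxy is assumed to be tuned on the same filtered data as the deployed model, and the `history` stack is one simple way to support the version rollback mentioned in the abstract.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class UpdateLoop:
    deployed: Any                           # currently deployed model
    proxy: Any                              # small proxy used for filtering
    filter_fn: Callable[[Any, list], list]  # (proxy, batch) -> selected data
    tune_fn: Callable[[Any, list], Any]     # (model, data) -> tuned model
    eval_fn: Callable[[Any], float]         # model -> evaluation score
    history: List[Tuple[Any, Any]] = field(default_factory=list)

    def step(self, batch: list) -> bool:
        """Run one update round; return True if a new model was promoted."""
        selected = self.filter_fn(self.proxy, batch)
        if not selected:
            return False  # nothing new to learn from this batch
        candidate = self.tune_fn(self.deployed, selected)
        if self.eval_fn(candidate) <= self.eval_fn(self.deployed):
            return False  # candidate rejected, deployment untouched
        # Save the previous pair so the promotion can be undone.
        self.history.append((self.deployed, self.proxy))
        self.deployed = candidate
        # Assumption: the proxy is tuned on the same selected data.
        self.proxy = self.tune_fn(self.proxy, selected)
        return True

    def rollback(self) -> bool:
        """Restore the previously deployed model and proxy, if any."""
        if not self.history:
            return False
        self.deployed, self.proxy = self.history.pop()
        return True


# Toy usage: a "model" is just the set of facts it has seen.
loop = UpdateLoop(
    deployed=frozenset(),
    proxy=frozenset(),
    filter_fn=lambda proxy, batch: [x for x in batch if x not in proxy],
    tune_fn=lambda model, data: model | set(data),
    eval_fn=len,
)
first = loop.step(["a", "b"])   # new data, candidate is promoted
second = loop.step(["a", "b"])  # same data again, filtered out as redundant
```

Promoting the deployed model and the proxy together keeps the filter aligned with what the deployed model already knows, so a repeated batch is dropped before any tuning cost is incurred.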