Learning from models beyond fine-tuning

Hongling Zheng, Li Shen, Anke Tang, Yong Luo, Han Hu, Bo Du, Yonggang Wen, Dacheng Tao

TL;DR

This survey defines Learn From Model (LFM) as leveraging foundation models through interfaces to improve downstream tasks without direct data access. It systematically catalogs five LFM paradigms—model tuning, model distillation, model reuse, meta-learning, and model editing—and details concrete methods and trade-offs within each. The authors discuss practical considerations such as data privacy, compute requirements, robustness, and security, and offer future directions including loss design, retrieval metrics, efficient fusion, and data-free learning. Overall, the work outlines a comprehensive roadmap for exploiting existing models to solve new tasks while highlighting key challenges and opportunities for future research and deployment.

Abstract

Foundation models (FM) have demonstrated remarkable performance across a wide range of tasks (especially in natural language processing and computer vision), which is primarily attributed to their ability to comprehend instructions and their access to extensive, high-quality data. This not only showcases their current effectiveness but also sets a promising trajectory towards the development of artificial general intelligence. Unfortunately, due to multiple constraints, the raw data used to train large models are often inaccessible, so using end-to-end models for downstream tasks has become a new research trend, which we call Learn From Model (LFM) in this article. LFM focuses on the research, modification, and design of FM based on the model interface, so as to better understand the model structure and weights (in a black-box environment) and to generalize the model to downstream tasks. LFM techniques can be broadly categorized into five major areas: model tuning, model distillation, model reuse, meta-learning, and model editing. Each category encompasses a repertoire of methods and strategies that aim to enhance the capabilities and performance of FM. This paper gives a comprehensive review of current FM-based methods from the perspective of LFM, in order to help readers better understand the current research status and ideas. To conclude, we summarize the survey by highlighting several critical areas for future exploration and addressing open issues that require further attention from the research community. The relevant papers we investigated in this article can be accessed at https://github.com/ruthless-man/Awesome-Learn-from-Model

Paper Structure (38 sections, 17 equations, 9 figures, 1 table)

This paper contains 38 sections, 17 equations, 9 figures, 1 table.

Figures (9)

  • Figure 1: The structural taxonomy for LFM. The survey is organized according to the hierarchical structure.
  • Figure 2: Two fundamental approaches in machine learning. "Learn from data" (LFD) and "learn from model" (LFM) both play a vital role in training models and improving their performance, and each has its own strengths and applications. LFD: The "learn from data" approach trains models on large amounts of labeled or unlabeled data. It relies on the premise that the model can learn patterns and relationships within the data in order to make accurate predictions. LFM: The "learn from model" approach leverages the knowledge and insights captured by existing FM to improve model performance. Rather than starting from scratch with raw data, this approach uses FM as a foundation and builds upon it.
  • Figure 3: The basic structure of Adapter Tuning houlsby2019parameter.
  • Figure 4: LLaVA network architecture liu2023visual.
  • Figure 5: REALM augments language model pre-training with a neural knowledge retriever that retrieves knowledge from a textual knowledge corpus guu2020retrieval.
  • ...and 4 more figures