Multi-Teacher Knowledge Distillation with Reinforcement Learning for Visual Recognition

Chuanguang Yang, Xinqiang Yu, Han Yang, Zhulin An, Chengqing Yu, Libo Huang, Yongjun Xu

TL;DR

This work addresses how to balance knowledge transfer from a pool of teachers to a single student in visual recognition by formulating multi-teacher KD as an RL problem. An agent observes a state that encodes both teacher performance and teacher–student gaps and outputs per-sample weights $w_i^m$ that scale each teacher's contribution to the KD loss; the agent is updated via policy gradient using rewards derived from the student's performance. The approach is reported to achieve state-of-the-art results across image classification, object detection, and semantic segmentation on standard benchmarks, and ablation studies highlight the benefit of jointly considering teacher performance and teacher–student gaps. Overall, MTKD-RL indicates that data-driven, sample-wise weighting guided by RL can surpass entropy-based or meta-learning weighting strategies in multi-teacher KD, that it carries over to dense prediction tasks, and that it scales to large datasets.

Abstract

Multi-teacher Knowledge Distillation (KD) transfers diverse knowledge from a teacher pool to a student network. The core problem of multi-teacher KD is how to balance distillation strengths among the teachers. Most existing methods develop weighting strategies from a single perspective, either teacher performance or teacher-student gaps, and therefore lack comprehensive information for guidance. This paper proposes Multi-Teacher Knowledge Distillation with Reinforcement Learning (MTKD-RL) to optimize multi-teacher weights. In this framework, we construct both teacher performance and teacher-student gaps as state information for an agent. The agent outputs the teacher weights and is updated using the reward returned by the student. MTKD-RL reinforces the interaction between the student and the teachers through an agent in an RL-based decision mechanism, achieving better matching capability with more meaningful weights. Experimental results on visual recognition tasks, including image classification, object detection, and semantic segmentation, demonstrate that MTKD-RL achieves state-of-the-art performance compared to existing multi-teacher KD works.
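
To make the role of the teacher weights concrete, the following is a minimal sketch of a weighted multi-teacher KD loss for a single sample. It is an illustration under stated assumptions, not the paper's implementation: the loss form (temperature-scaled KL divergence with the usual $T^2$ factor), the function names, and the example logits are all hypothetical, and the weights are supplied by hand where MTKD-RL would have the agent produce them (here `weights[m]` stands in for $w_i^m$, reading $i$ as the sample index and $m$ as the teacher index). The policy-gradient update of the agent is not shown.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with q > 0 everywhere."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def weighted_multi_teacher_kd(student_logits, teacher_logits, weights, temperature=4.0):
    """Weighted sum of per-teacher KD losses for one sample.

    Assumption: each teacher contributes T^2 * KL(teacher || student),
    scaled by its weight. The weights are hand-picked placeholders for
    what an agent would output; nothing here is learned.
    """
    if len(teacher_logits) != len(weights):
        raise ValueError("need exactly one weight per teacher")
    q = softmax(student_logits, temperature)
    loss = 0.0
    for w, t_logits in zip(weights, teacher_logits):
        p = softmax(t_logits, temperature)
        loss += w * (temperature ** 2) * kl_divergence(p, q)
    return loss


# Hypothetical logits: the second teacher agrees exactly with the student.
student = [1.0, 0.5, -0.2]
teachers = [[2.0, 0.1, -1.0], [1.0, 0.5, -0.2]]

loss_a = weighted_multi_teacher_kd(student, teachers, weights=[1.0, 0.0])
loss_b = weighted_multi_teacher_kd(student, teachers, weights=[0.0, 1.0])
print(loss_a, loss_b)
```

Putting all weight on the teacher that disagrees with the student yields a positive loss, while putting all weight on the teacher that already matches the student yields no distillation signal. This is the kind of trade-off the weighting is meant to resolve per sample.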

Paper Structure

This paper contains 26 sections, 14 equations, 2 figures, 7 tables, and 2 algorithms.

Figures (2)

  • Figure 1: Overview of the basic idea behind our proposed MTKD-RL.
  • Figure 2: Parameter analyses and ablation study over ShuffleNetV2 on CIFAR-100.