Concept Distillation from Strong to Weak Models via Hypotheses-to-Theories Prompting
Emmanuel Aboah Boateng, Cassiano O. Becker, Nabiha Asghar, Kabir Walia, Ashwin Srinivasan, Ehi Nosakhare, Soundar Srinivasan, Victor Dibia
TL;DR
The paper addresses the cost of prompt engineering and model migration by proposing Concept Distillation (CD), a three-phase prompt-optimization framework that distills general concepts from a strong model to a weaker one via hypotheses-to-theories prompting. CD uses initialization, induction, and deduction/verification to generate, validate, and propagate transferable concepts that guide the weak model without fine-tuning. Empirical results on NL2Code and mathematical reasoning tasks show substantial performance gains across multiple weak models and strong cross-model transferability, with notable improvements such as Phi-3-mini-3.8B gaining up to 34% on HumanEval and Claude 2.1 reaching near-perfect accuracy. The approach offers a cost-efficient, scalable solution for prompt optimization and workload migration as the language-model landscape evolves.
Abstract
Hand-crafting high-quality prompts to optimize the performance of language models is a complicated and labor-intensive process. Furthermore, when migrating to newer, smaller, or weaker models (for example, to reduce latency or cost), prompts must be updated to re-optimize task performance. We propose Concept Distillation (CD), an automatic prompt optimization technique for enhancing weaker models on complex tasks. CD involves: (1) collecting mistakes made by weak models with a base prompt (initialization), (2) using a strong model to generate reasons for these mistakes and create rules/concepts for weak models (induction), and (3) filtering these rules based on validation set performance and integrating them into the base prompt (deduction/verification). We evaluated CD on NL2Code and mathematical reasoning tasks, observing significant performance boosts for small and weaker language models. Notably, Mistral-7B's accuracy on Multi-Arith increased by 20%, and Phi-3-mini-3.8B's accuracy on HumanEval rose by 34%. Compared to other automated methods, CD offers an effective, cost-efficient strategy for improving weak models' performance on complex tasks and enables seamless workload migration across different language models without compromising performance.
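The three phases described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the function names (`concept_distillation`, `accuracy`), the callable signatures for the weak and strong models, the exact-match scoring, and the greedy "keep a rule only if validation accuracy improves" filter are all assumptions made for the example. The toy models at the bottom are hypothetical stand-ins for real language-model calls.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, str]                    # (question, expected answer)
WeakModel = Callable[[str, str], str]        # (prompt, question) -> answer
StrongModel = Callable[[str, str, str], str] # (question, wrong answer, expected) -> rule


def accuracy(weak: WeakModel, prompt: str, data: List[Example]) -> float:
    """Exact-match accuracy of the weak model under a given prompt (assumed metric)."""
    if not data:
        return 0.0
    return sum(weak(prompt, q) == a for q, a in data) / len(data)


def concept_distillation(
    weak: WeakModel,
    strong: StrongModel,
    base_prompt: str,
    train: List[Example],
    val: List[Example],
) -> Tuple[str, float]:
    # Phase 1 (initialization): collect the weak model's mistakes with the base prompt.
    mistakes = []
    for question, expected in train:
        got = weak(base_prompt, question)
        if got != expected:
            mistakes.append((question, got, expected))

    # Phase 2 (induction): the strong model explains each mistake as a candidate rule.
    candidates: List[str] = []
    for question, got, expected in mistakes:
        rule = strong(question, got, expected)
        if rule not in candidates:
            candidates.append(rule)

    # Phase 3 (deduction/verification): keep a rule only if it raises validation
    # accuracy. The greedy, one-rule-at-a-time filter is an assumption of this sketch.
    prompt = base_prompt
    best = accuracy(weak, prompt, val)
    for rule in candidates:
        trial = prompt + "\n- " + rule
        score = accuracy(weak, trial, val)
        if score > best:
            prompt, best = trial, score
    return prompt, best


# Hypothetical toy models: the weak model adds instead of subtracting unless the
# prompt contains a rule that mentions the word "subtract".
def toy_weak(prompt: str, question: str) -> str:
    left, op, right = question.split()
    a, b = int(left), int(right)
    if op == "-" and "subtract" not in prompt:
        return str(a + b)  # simulated failure mode
    return str(a + b if op == "+" else a - b)


def toy_strong(question: str, wrong: str, expected: str) -> str:
    return "When you see '-', subtract the second number from the first."


BASE_PROMPT = "Solve the arithmetic problem."
TRAIN = [("2 + 3", "5"), ("7 - 4", "3")]
VAL = [("9 - 1", "8"), ("1 + 1", "2")]

optimized_prompt, val_accuracy = concept_distillation(
    toy_weak, toy_strong, BASE_PROMPT, TRAIN, VAL
)
```

In this toy setup the base prompt fails on subtraction, the strong model proposes one rule, and the rule is kept because it raises validation accuracy. With real models, `weak` and `strong` would wrap API calls, and the scoring function would be task-specific (for instance, unit-test pass rate for NL2Code).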
