Can Low-Rank Knowledge Distillation in LLMs be Useful for Microelectronic Reasoning?

Nirjhor Rouf, Fin Amin, Paul D. Franzon

TL;DR

The goal is to investigate and evaluate a contemporary language model’s ability to function as a microelectronic Q&A expert, as well as its reasoning and generation capabilities in solving microelectronics-related problems.

Abstract

In this work, we present empirical results on the feasibility of using offline large language models (LLMs) for electronic design automation (EDA). The goal is to investigate and evaluate the ability of a contemporary language model (Llama-2-7B) to function as a microelectronic Q&A expert, as well as its reasoning and generation capabilities in solving microelectronics-related problems. Llama-2-7B was tested across a variety of adaptation methods, including a novel low-rank knowledge distillation (LoRA-KD) scheme that we introduce. Our experiments produce both qualitative and quantitative results.

Paper Structure (12 sections, 3 equations, 2 figures, 2 tables)


Figures (2)

  • Figure 1: LoRA-KD works by first fine-tuning the teacher model using LoRA. Afterward, the teacher is frozen and its outputs are used in the knowledge distillation loss (equation \ref{eq: kd_loss}). Note that only the low-rank $A$ and $B$ parameters of the student are updated.
  • Figure 2: Histograms of how often each configuration was ranked in the top half, and how often it was declared the worst, by third-year microelectronics students. Survey participants ordered the outputs of the configurations for each of 15 questions. A total of 51 rankings were considered after filtering for quality.
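
The LoRA-KD flow described in the Figure 1 caption can be illustrated with a small, dependency-free Python sketch. Several things here are assumptions rather than details from the paper: the distillation loss is written in the common temperature-softened KL form, since the paper's own loss equation is not reproduced in this section; the matrices, input, temperature, and learning rate are made-up toy values; and a finite-difference gradient stands in for backpropagation. The helper names (`kd_loss`, `lora_forward`, `student_loss`) are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened outputs (assumed form of the loss)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return temperature ** 2 * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def matvec(M, x):
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def lora_forward(W, A, B, x):
    """y = W x + B (A x): W is frozen, A and B form the low-rank update."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + u for b, u in zip(base, update)]

# Toy frozen base weights (3 outputs, 2 inputs) shared by teacher and student.
W = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]
x = [1.0, 2.0]

# Teacher: pretend its rank-1 LoRA factors were already fine-tuned, then frozen.
A_teacher = [[1.0, 0.5]]
B_teacher = [[0.5], [-0.5], [1.0]]
teacher_logits = lora_forward(W, A_teacher, B_teacher, x)

def student_loss(params):
    """KD loss as a function of the student's A and B entries only."""
    A = [params[0:2]]               # rank-1 A, shape 1 x 2
    B = [[p] for p in params[2:5]]  # B, shape 3 x 1
    return kd_loss(teacher_logits, lora_forward(W, A, B, x))

def numerical_grad(f, params, eps=1e-5):
    """Central-difference gradient; a stand-in for backpropagation."""
    grads = []
    for i in range(len(params)):
        up, down = params[:], params[:]
        up[i] += eps
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads

# Usual LoRA-style start: small A, zero B. Only these five numbers are trained.
params = [0.1, 0.1, 0.0, 0.0, 0.0]
loss_before = student_loss(params)
for _ in range(200):
    grads = numerical_grad(student_loss, params)
    params = [p - 0.1 * g for p, g in zip(params, grads)]
loss_after = student_loss(params)

print("KD loss before:", loss_before)
print("KD loss after: ", loss_after)
```

The teacher's logits are computed once and never change, and `W` is never touched by the update loop; gradient steps are applied only to the student's $A$ and $B$ entries, mirroring the caption's description.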