Birbal: An efficient 7B instruct-model fine-tuned with curated datasets
Ashvini Kumar Jindal, Pawan Kumar Rajpoot, Ankur Parikh
TL;DR
Birbal shows that a compact, open-source LLM can be effectively instruction-tuned on a single GPU within a day using carefully curated data. By selecting Mistral-7B as the base model and combining 4-bit QLoRA with targeted data curation, the approach achieves strong cross-task performance and outperforms competing entries in a multi-stage NeurIPS efficiency challenge. The work emphasizes transparency, reproducibility, and accessibility, offering an open-source pipeline and datasets to democratize efficient LLM fine-tuning. For researchers and practitioners with limited compute, the practical implication is that high-quality instruction data can compensate for hardware constraints while maintaining broad task coverage.
Abstract
LLMOps incur significant costs due to hardware requirements, hindering their widespread accessibility. Additionally, a lack of transparency in model training methods and data leaves the majority of models non-reproducible. To tackle these challenges, the LLM Efficiency Challenge was introduced at a NeurIPS workshop, aiming to adapt foundation models to a diverse set of tasks via fine-tuning on a single GPU (RTX 4090 or A100 with 40GB) within a 24-hour timeframe. In this system description paper, we introduce Birbal, our Mistral-7B-based winning model, fine-tuned on a single RTX 4090 for 16 hours. Birbal's success lies in curating high-quality instructions covering diverse tasks, resulting in a 35% performance improvement over the second-best, Qwen-14B-based submission.
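To make the single-GPU constraint concrete, the following back-of-the-envelope sketch estimates the memory needed just to hold the weights of a 7B-parameter model at 16-bit and at 4-bit precision. It is an illustration under stated assumptions, not the authors' calculation: the nominal count of 7e9 parameters and the 24 GB capacity of an RTX 4090 are not taken from this paper, and the estimate ignores quantization metadata, adapter weights, optimizer state, and activations, all of which add to the real footprint.

```python
def weight_memory_gib(n_params: float, bits_per_param: float) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


# Assumptions (not from the paper): nominal parameter count and GPU capacity.
N_PARAMS = 7e9        # "7B" taken at face value; the exact count differs
GPU_MEMORY_GIB = 24   # RTX 4090 capacity, treated here as 24 GiB

fp16_gib = weight_memory_gib(N_PARAMS, 16)  # half-precision weights
int4_gib = weight_memory_gib(N_PARAMS, 4)   # 4-bit quantized weights

for name, gib in [("16-bit", fp16_gib), ("4-bit", int4_gib)]:
    share = gib / GPU_MEMORY_GIB
    print(f"{name} weights: {gib:.2f} GiB ({share:.0%} of the assumed GPU memory)")
```

Under these assumptions, 4-bit weights take a quarter of the space of 16-bit weights, leaving most of the card free for activations and the trainable adapter parameters.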
