SLIM-RAFT: A Novel Fine-Tuning Approach to Improve Cross-Linguistic Performance for Mercosur Common Nomenclature
Vinícius Di Oliveira, Yuri Façanha Bezerra, Li Weigang, Pedro Carvalho Brom, Victor Rafael R. Celestino
TL;DR
The paper addresses the lack of robust cross-linguistic NLP for Mercosur Common Nomenclature (NCM) coding by proposing SLIM-RAFT, a cost-efficient fine-tuning approach that pairs a compact Portuguese LLM, TeenyTinyLLaMA (TTL), with a simplified retrieval-augmented fine-tuning framework. By keeping a streamlined chain-of-thought and omitting the costly training on distractor documents, SLIM-RAFT trains a small TTL model to map brief product descriptions to NCM categories with high accuracy. Evaluated on 100 unseen QA pairs, SLIM-RAFT scores 8.63/10 (SD 2.30), outperforming both the TeenyTinyLLaMA baselines and ChatGPT-4, which shows that a smaller, domain-tuned LLM can beat larger models on specialized tasks at lower cost. The approach can be adapted to HS/NCM coding worldwide and points toward broader multilingual, domain-specific fine-tuning with smaller models. Future work includes scaling to larger LLMs such as LLaMA 3, expanding multilingual capabilities, and benchmarking against LoRA-based fine-tuning methods.
Abstract
Natural language processing (NLP) has advanced significantly with the advent of large language models (LLMs). However, substantial improvements are still needed for languages other than English, especially in specialized domains such as applications of the Mercosur Common Nomenclature (NCM), a Brazilian adaptation of the Harmonized System (HS). To address this gap, this study uses TeenyTinyLLaMA, a foundational Portuguese LLM, as the base model for NCM applications. Additionally, a simplified Retrieval-Augmented Fine-Tuning (RAFT) technique, termed SLIM-RAFT, is proposed for task-specific fine-tuning of LLMs. This approach retains the chain-of-thought (CoT) methodology for prompt development but applies it in a more concise and streamlined manner, using brief, focused documents for training. The proposed model offers an efficient and cost-effective alternative for fine-tuning smaller LLMs and significantly outperforms TeenyTinyLLaMA and ChatGPT-4 on the same task. Although the research focuses on NCM applications, the methodology can be readily adapted for HS applications worldwide.
