OpenThaiGPT 1.5: A Thai-Centric Open Source Large Language Model

Sumeth Yuenyong, Kobkrit Viriyayudhakorn, Apivadee Piyatumrong, Jillaphat Jaroenkantasima

TL;DR

OpenThaiGPT 1.5 targets strong Thai-language capabilities by finetuning Qwen2.5 models on a large Thai instruction dataset, incorporating RLHF-aligned safety, and adding features such as RAG, tool-calling, and an extended context window. The authors describe a two-model release (7B and 72B) trained with LoRA adapters on NVIDIA NeMo, using a diverse mix of Thai and bilingual data, and they validate performance on a dedicated OpenThaiGPT Evaluation Dataset, the Thai Exam Benchmark, and M3Exam. The 72B OpenThaiGPT 1.5 achieves the top open-model results on the Thai benchmarks, scoring 63.89% on the Thai Exam Benchmark and 70.39% on M3Exam, ahead of several open models and some closed APIs. The work contributes a Thai-centric open LLM with multi-turn conversation, RAG integration, and tool-calling for Thai-language AI applications and research, and it establishes a reproducible benchmark ecosystem hosted on Hugging Face.

Abstract

OpenThaiGPT 1.5 is an advanced Thai language chat model based on Qwen v2.5, finetuned on over 2,000,000 Thai instruction pairs. This report provides an engineering perspective on the model's development, capabilities, and performance. We discuss the model's architecture, training process, and key features, including multi-turn conversation support, Retrieval Augmented Generation (RAG) compatibility, and tool-calling functionality. Benchmark results demonstrate OpenThaiGPT 1.5's state-of-the-art performance on various Thai language tasks, outperforming other open-source Thai language models. We also address practical considerations such as GPU memory requirements and deployment strategies.

Paper Structure

This paper contains 19 sections, 4 tables.