Typhoon T1: An Open Thai Reasoning Model

Pittawat Taveekitworachai, Potsawee Manakul, Kasima Tharnpipitchai, Kunat Pipatanakul

TL;DR

Typhoon T1 is an open Thai reasoning model built via supervised fine-tuning to generate reasoning traces in Thai. It proposes structured thinking with XML scratchpad tags and a data-mixed SFT pipeline, enabling cross-domain reasoning without RL. Across experiments, structured thinking improves performance on math and coding tasks, while data size, domain composition, and Thai translation shape performance trade-offs in multilingual settings. In those settings, allowing the model to choose its own reasoning language yields the best results. The work provides an openly available recipe (datasets, pipelines, configurations, and weights) to advance open reasoning research for low-resource languages.

Abstract

This paper introduces Typhoon T1, an open effort to develop an open Thai reasoning model. A reasoning model is a relatively new type of generative model built on top of large language models (LLMs). A reasoning model generates a long chain of thought before arriving at a final answer, an approach found to improve performance on complex tasks. However, details on developing such a model are limited, especially for reasoning models that can generate traces in a low-resource language. Typhoon T1 presents an open effort that dives into the details of developing a reasoning model in a more cost-effective way by leveraging supervised fine-tuning using open datasets, instead of reinforcement learning. This paper shares the details about synthetic data generation and training, as well as our dataset and model weights. Additionally, we provide insights gained from developing a reasoning model that generalizes across domains and is capable of generating reasoning traces in a low-resource language, using Thai as an example. We hope this open effort provides a foundation for further research in this field.

Paper Structure

This paper contains 33 sections, 6 figures, 8 tables.

Figures (6)

  • Figure 1: Top: The transformation-and-refinement pipeline used for long-thinking data generation, described in the data mixture and data pipeline sections. Bottom-Left: The structured long-thinking (the best thinking format) training pipeline for Typhoon T, as described in the thinking formats section. Bottom-Right: The bilingual English-Thai Typhoon T1 model training pipeline, detailed in the Thai thinking section.
  • Figure 2: Differences between three thinking formats: (a) Unstructured thinking, where no XML structural tags are included; (b) Semi-structured thinking, which is similar to unstructured thinking but adds <thoughts> and <response> tags to separate thoughts and responses; (c) Structured thinking, which introduces additional XML tags for structural purposes in the thoughts section.
  • Figure 3: Increasing the proportion of the training set beyond 75% results in performance degradation for some datasets, while GSM8K generally shows a trend of performance improvement with the proportion.
  • Figure 4: Final performance comparison of Typhoon T1-EN and Typhoon T1 against the baseline Typhoon 2 3B Instruct model across six evaluation benchmarks.
  • Figure 5: Domain distribution of the training set used in the experiments.
  • ...and 1 more figure
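
The semi-structured format in the Figure 2 caption separates thoughts from the final answer with <thoughts> and <response> tags. As a rough illustration only, the sketch below shows how such an output could be split into its two parts. The helper name `split_thoughts_and_response`, the regular expression, and the sample string are assumptions made for this example and are not taken from the paper. The extra tags used inside the thoughts section of the structured format are not named here, so the sketch does not attempt to parse them.

```python
import re
from typing import Tuple

# Hypothetical pattern: one <thoughts> block followed by one <response> block.
# The tag names come from the Figure 2 caption; everything else is an assumption.
_SEMI_STRUCTURED = re.compile(
    r"<thoughts>(?P<thoughts>.*?)</thoughts>\s*"
    r"<response>(?P<response>.*?)</response>",
    re.DOTALL,
)


def split_thoughts_and_response(text: str) -> Tuple[str, str]:
    """Return (thoughts, response) for a semi-structured model output.

    If the tags are missing (as in the unstructured format), the whole
    text is treated as the response and the thoughts are left empty.
    """
    match = _SEMI_STRUCTURED.search(text)
    if match is None:
        return "", text.strip()
    return match.group("thoughts").strip(), match.group("response").strip()


# Made-up sample output, used only to exercise the helper.
example = "<thoughts>\n2 + 2 is 4.\n</thoughts>\n<response>\n4\n</response>"
thoughts, response = split_thoughts_and_response(example)
print("thoughts:", thoughts)
print("response:", response)
```

Falling back to an empty thoughts string keeps the helper usable on unstructured outputs, where no tags are present to mark the boundary.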