EXAONE 4.0: Unified Large Language Models Integrating Non-reasoning and Reasoning Modes

Kyunghoon Bae; Eunbi Choi; Kibong Choi; Stanley Jungkyu Choi; Yemuk Choi; Kyubeen Han; Seokhee Hong; Junwon Hwang; Taewan Hwang; Joonwon Jang; Hyojin Jeon; Kijeong Jeon; Gerrard Jeongwon Jo; Hyunjik Jo; Jiyeon Jung; Euisoon Kim; Hyosang Kim; Jihoon Kim; Joonkee Kim; Seonghwan Kim; Soyeon Kim; Sunkyoung Kim; Yireun Kim; Yongil Kim; Youchul Kim; Edward Hwayoung Lee; Gwangho Lee; Haeju Lee; Honglak Lee; Jinsik Lee; Kyungmin Lee; Sangha Park; Young Min Paik; Yongmin Park; Youngyong Park; Sanghyun Seo; Sihoon Yang; Heuiyeen Yeen; Sihyuk Yi; Hyeongu Yun

TL;DR

EXAONE 4.0 integrates the Non-reasoning and Reasoning modes of prior EXAONE iterations into a single model, enabling both rapid responses and deep reasoning, while adding agentic tool use and extending multilingual support to Spanish. The model employs hybrid attention that mixes sliding-window (local) layers with a $4K$ window and global layers in a $3{:}1$ local-to-global ratio, a repositioned QK-Reorder-LN normalization with RMSNorm, and a pretraining corpus of up to $14$T tokens to enhance world knowledge. Context length is expanded to $128K$ tokens via a two-stage extension (4K→32K→128K) with long-context fine-tuning and NIAH validation; the smaller $1.2B$ variant extends to $64K$ context. Post-training combines large-scale SFT, Reasoning RL via the AGAPO algorithm (with asymmetric sampling and group/global advantages), and hybrid reward-based preference learning to tightly couple Non-reasoning and Reasoning behaviors. Across benchmarks spanning world knowledge, math/coding, long-context understanding, tool use, and multilinguality, EXAONE 4.0 shows strong performance for its scale, including competitive tool use and robust world knowledge, and points toward practical agentic AI applications and broader language support.
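
The hybrid attention layout summarized above can be illustrated with a minimal sketch. It is not the released implementation: the function names `layer_pattern` and `can_attend` are hypothetical, the placement of the global layer at the end of each group of four is an assumption, and the window is assumed to count the current token. It only shows how a repeating pattern of three sliding-window layers per global layer and a $4K$ causal window could be expressed.

```python
def layer_pattern(num_layers, local_per_global=3):
    """Label each layer 'local' or 'global'.

    Assumption: layers repeat in groups of (local_per_global + 1),
    with the global-attention layer last in each group.
    """
    period = local_per_global + 1
    return ["global" if (i + 1) % period == 0 else "local"
            for i in range(num_layers)]


def can_attend(query_pos, key_pos, kind, window=4096):
    """Return True if the token at query_pos may attend to key_pos.

    Both layer kinds are causal (no attention to future tokens).
    A 'local' layer additionally restricts attention to the most
    recent `window` positions, counting the query position itself.
    """
    if key_pos > query_pos:
        return False  # causal mask: never look ahead
    if kind == "global":
        return True   # full attention over the whole prefix
    return query_pos - key_pos < window  # sliding-window limit


if __name__ == "__main__":
    print(layer_pattern(8))
    print(can_attend(5000, 0, "local"), can_attend(5000, 0, "global"))
```

Under these assumptions, a distant token is visible only through the global layers, which is why such a layout keeps most layers cheap at long context lengths.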

Abstract

This technical report introduces EXAONE 4.0, which integrates a Non-reasoning mode and a Reasoning mode to achieve both the excellent usability of EXAONE 3.5 and the advanced reasoning abilities of EXAONE Deep. To pave the way for the agentic AI era, EXAONE 4.0 incorporates essential features such as agentic tool use, and its multilingual capabilities are extended to support Spanish in addition to English and Korean. The EXAONE 4.0 model series consists of two sizes: a mid-size 32B model optimized for high performance, and a small-size 1.2B model designed for on-device applications. EXAONE 4.0 demonstrates superior performance compared to open-weight models in its class and remains competitive even against frontier-class models. The models are publicly available for research purposes and can be downloaded from https://huggingface.co/LGAI-EXAONE.
