EXAONE 4.0: Unified Large Language Models Integrating Non-reasoning and Reasoning Modes

Kyunghoon Bae, Eunbi Choi, Kibong Choi, Stanley Jungkyu Choi, Yemuk Choi, Kyubeen Han, Seokhee Hong, Junwon Hwang, Taewan Hwang, Joonwon Jang, Hyojin Jeon, Kijeong Jeon, Gerrard Jeongwon Jo, Hyunjik Jo, Jiyeon Jung, Euisoon Kim, Hyosang Kim, Jihoon Kim, Joonkee Kim, Seonghwan Kim, Soyeon Kim, Sunkyoung Kim, Yireun Kim, Yongil Kim, Youchul Kim, Edward Hwayoung Lee, Gwangho Lee, Haeju Lee, Honglak Lee, Jinsik Lee, Kyungmin Lee, Sangha Park, Young Min Paik, Yongmin Park, Youngyong Park, Sanghyun Seo, Sihoon Yang, Heuiyeen Yeen, Sihyuk Yi, Hyeongu Yun

TL;DR

EXAONE 4.0 integrates the Non-reasoning and Reasoning modes of prior EXAONE iterations to enable both rapid responses and deep, rule-based reasoning, while introducing agentic tool use and extending multilingual support to Spanish. The model employs hybrid sliding-window/global attention with a $4K$ local window and a $3{:}1$ local-to-global ratio, a repositioned QK-Reorder-LN normalization with RMSNorm, and a substantial pretraining corpus of up to $14$T tokens to enhance world knowledge. Context length is expanded to $128K$ tokens via a two-stage extension (4K→32K→128K) with long-context fine-tuning and NIAH validation; a smaller $1.2B$ variant extends to $64K$ context. Post-training combines large-scale SFT, Reasoning RL via the AGAPO algorithm (with asymmetric sampling and group/global advantages), and hybrid reward-based preference learning to tightly couple Non-reasoning and Reasoning behaviors. Across benchmarks spanning world knowledge, math/coding, long-context understanding, tool use, and multilinguality, EXAONE 4.0 demonstrates strong performance for its scale, competitive tool-use capabilities, and robust results on world-knowledge tasks, while maintaining a pathway toward practical agentic AI applications and broader language support.
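The hybrid attention layout summarized above can be illustrated with a small sketch. It assumes the $3{:}1$ ratio means three sliding-window (local) layers for every global layer and that every fourth layer is the global one; both the layer ordering and the helper names `layer_pattern` and `attention_mask` are illustrative assumptions, not details taken from the report. A toy window of 3 is used instead of 4K so the masks stay readable.

```python
from typing import List


def layer_pattern(num_layers: int, local_per_global: int = 3) -> List[str]:
    """Label each layer 'local' or 'global'.

    Assumption: the last layer of every (local_per_global + 1)-layer
    block is the global one; the report may order them differently.
    """
    period = local_per_global + 1
    return ["global" if (i + 1) % period == 0 else "local"
            for i in range(num_layers)]


def attention_mask(seq_len: int, kind: str, window: int) -> List[List[bool]]:
    """mask[q][k] is True when query position q may attend to key position k."""
    mask = []
    for q in range(seq_len):
        row = []
        for k in range(seq_len):
            causal = k <= q                 # never attend to future tokens
            in_window = (q - k) < window    # only the most recent `window` tokens
            row.append(causal and (kind == "global" or in_window))
        mask.append(row)
    return mask


pattern = layer_pattern(8)
local_mask = attention_mask(6, "local", window=3)
global_mask = attention_mask(6, "global", window=3)

for kind, mask in (("local", local_mask), ("global", global_mask)):
    print(kind)
    for row in mask:
        print("".join("x" if allowed else "." for allowed in row))
```

In a local layer each query sees at most `window` tokens, so attention cost grows linearly with sequence length, while the occasional global layer keeps a path to every earlier token.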

Abstract

This technical report introduces EXAONE 4.0, which integrates a Non-reasoning mode and a Reasoning mode to achieve both the excellent usability of EXAONE 3.5 and the advanced reasoning abilities of EXAONE Deep. To pave the way for the agentic AI era, EXAONE 4.0 incorporates essential features such as agentic tool use, and its multilingual capabilities are extended to support Spanish in addition to English and Korean. The EXAONE 4.0 model series consists of two sizes: a mid-size 32B model optimized for high performance, and a small-size 1.2B model designed for on-device applications. EXAONE 4.0 demonstrates superior performance compared to open-weight models in its class and remains competitive even against frontier-class models. The models are publicly available for research purposes and can be downloaded from https://huggingface.co/LGAI-EXAONE.

Paper Structure

This paper contains 43 sections, 2 equations, 7 figures, 13 tables.

Figures (7)

  • Figure 1: Visualization of the hybrid attention mechanism when the window size for local attention (sliding window attention) is set to 3. This figure illustrates how context tokens are processed across layers under the hybrid attention mechanism, highlighting the interaction between local and global attention.
  • Figure 2: Visualization of the repositioned layer normalization. Normalization is applied to the input queries and keys, and again to the attention output. RMSNorm is used as the normalization type.
  • Figure 3: The post-training pipeline of EXAONE 4.0. The pipeline consists of three stages: supervised fine-tuning (SFT), reinforcement learning (RL), and preference learning.
  • Figure 4: Performance of various models across six HELMET task categories, Recall, RAG, Passage Re-ranking, ICL, LongQA, and Summarization, at different context lengths (8K to 128K tokens). Darker cells indicate higher accuracy. Missing entries (N/A) denote models that do not support the corresponding input length or task.
  • Figure 5: Example of Long-dialogue History Understanding (Topic classification) in Ko-LongBench.
  • ...and 2 more figures
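
The normalization placement described in the Figure 2 caption can be sketched for a single attention head as follows. This is a minimal illustration under stated assumptions, not the model's implementation: the learned projections, multiple heads, and residual connection are omitted, the gains are fixed to 1, and the names `rms_norm` and `attend` are hypothetical.

```python
import math
from typing import List


def rms_norm(x: List[float], gain: List[float], eps: float = 1e-6) -> List[float]:
    """RMSNorm: divide by the root mean square of x, then scale by a per-dimension gain."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [g * v / rms for v, g in zip(x, gain)]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(q: List[float], keys: List[List[float]], values: List[List[float]]) -> List[float]:
    """Single-head attention with normalized queries/keys and a normalized output."""
    d = len(q)
    qk_gain = [1.0] * d                  # assumption: unit gains for illustration
    out_gain = [1.0] * len(values[0])
    q_n = rms_norm(q, qk_gain)                       # normalize the query
    keys_n = [rms_norm(k, qk_gain) for k in keys]    # normalize each key
    scores = [sum(a * b for a, b in zip(q_n, k)) / math.sqrt(d) for k in keys_n]
    probs = softmax(scores)
    out = [sum(p * v[i] for p, v in zip(probs, values)) for i in range(len(values[0]))]
    return rms_norm(out, out_gain)                   # normalize the attention output


normed = rms_norm([3.0, 4.0], [1.0, 1.0])
output = attend([1.0, 0.0], [[1.0, 0.0], [0.0, 2.0]], [[1.0, 2.0], [3.0, 4.0]])
print(normed, output)
```

Normalizing queries and keys bounds the magnitude of the attention logits regardless of the scale of the incoming activations, which is the usual motivation for this kind of placement.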