Towards Building Specialized Generalist AI with System 1 and System 2 Fusion

Kaiyan Zhang, Biqing Qi, Bowen Zhou

TL;DR

This paper proposes Specialized Generalist AI (SGI) as a pragmatic bridge toward AGI, arguing that combining task-specific expertise with broad general abilities can accelerate progress in high-value domains. It defines SGI with three core capabilities—Task Streaming Learning, Autonomous Discovery, and Value-Aligned Optimization—and outlines a three-layer framework (System 1/2 fusion) plus four components to realize this integration. The authors discuss the limitations of pure generalists and pure specialists, emphasize uncertainty as a driver of innovation, and describe pathways for collaborative model/data architectures, new benchmarks, and multi-modal/embodied applications. The work highlights practical challenges and directions, including data mixtures, architectural innovations, safety controls, and iterative self-evolution, positioning SGI as a feasible, scalable route toward Expert AGI with broader societal impact.

Abstract

In this perspective paper, we introduce the concept of Specialized Generalist Artificial Intelligence (SGAI or simply SGI) as a crucial milestone toward Artificial General Intelligence (AGI). We define SGI as AI that specializes in at least one task, surpassing human experts, while also retaining general abilities. Compared to directly scaling general abilities, this fusion path enables SGI to rapidly reach high-value areas. We categorize SGI into three stages based on the level of mastery over professional skills and generality performance. Additionally, we discuss the necessity of SGI in addressing issues associated with large language models concerning generality, specialized capabilities, uncertainty in innovation, and practical applications. Furthermore, we propose a conceptual framework for developing SGI that integrates the strengths of System 1 and System 2 cognitive processing. This framework comprises three layers and four key components, which focus on enhancing individual abilities and facilitating collaborative evolution. We conclude by summarizing the potential challenges and suggesting future directions. We hope that the proposed SGI will provide insights into further research and applications towards achieving AGI.

Paper Structure (26 sections, 3 figures, 1 table)

This paper contains 26 sections, 3 figures, 1 table.

Figures (3)

  • Figure 1: Specialized Generalist Intelligence (SGI) as a crucial milestone toward Artificial General Intelligence (AGI). The implementation path of AGI encompasses two dimensions: specialty and generality. Specialty, as detailed in "Levels of AGI" (Morris et al., 2024), is often measured against human intelligence. Generality differs in the number of skills held, which in turn depends on distinct learning paradigms. While the "Scaling Law" has led to significant improvements in the generality of LLMs, progress in specialty remains extremely slow. A review of technological development reveals a high-value area for AGI applications that require models to possess both robust generality and adequate specialty. This optimal balance point is referred to as the "tipping point" for specialized generalists.
  • Figure 2: This image presents the features of Systems 1 and 2 in human cognition, and illustrates their progression from machine learning and deep learning to LLMs in artificial intelligence, emphasizing their evolutionary journey and applications over time.
  • Figure 3: The three layers and four key components of our proposed theoretical framework for building specialized generalists from a System 1 and System 2 fusion perspective: the development of both systems (➊ and ➋), their collaboration (➌), and the self-evolution of the dual system (➍). The x-axis represents the two systems, trending more towards System 2 as slow, rational thinking increases. The y-axis represents potential collaborations involving internal representations and external behaviors among the systems, with human readability improving as collaborations shift from representations to behaviors.