Data-Centric Artificial Intelligence

Johannes Jakubik; Michael Vössing; Niklas Kühl; Jannis Walk; Gerhard Satzger

TL;DR

The paper argues that progress in AI has been overly model-centric and has underused the potential of systematic data design. It proposes data-centric AI as a complementary paradigm that focuses on refining (R1–R6) and extending (E1–E3) data to improve both model performance and maintainability. It presents a detailed framework and discusses implications for Business & Information Systems Engineering (BISE) at the individual, organizational, and cross-organizational levels. Key contributions include clarifying terminology, outlining a two-dimensional data framework, and highlighting practical IS implications, governance needs, and tool support. The work underscores the strategic value of data work, domain knowledge, and data governance in real-world AI deployments and calls for BISE research to advance these practices.

Abstract

Data-centric artificial intelligence (data-centric AI) represents an emerging paradigm emphasizing that the systematic design and engineering of data is essential for building effective and efficient AI-based systems. The objective of this article is to introduce practitioners and researchers from the field of Information Systems (IS) to data-centric AI. We define relevant terms, provide key characteristics to contrast the data-centric paradigm with the model-centric one, and introduce a framework for data-centric AI. We distinguish data-centric AI from related concepts and discuss its longer-term implications for the IS community.

Paper Structure (9 sections, 4 figures, 1 table)

Introduction
Model-centric and Data-centric AI
Dimensions of Data-centric AI
Delimitations of Data-centric AI from Related Concepts
Implications for BISE Research
Individual Level
Organizational Level
Cross-Organizational Level
Conclusion

Figures (4)

Figure 1: Data-centric AI as an emerging, complementary paradigm for the development of AI-based systems.
Figure 2: Framework for the systematic design and engineering of data for data-centric AI.
Figure 3: Proposed areas of BISE research for the advancement of data-centric AI.
Figure 4: Extending the Cross-Industry Standard Process for Data Mining (CRISP-DM) based on considerations from data-centric AI.
