Unveiling Challenges for LLMs in Enterprise Data Engineering

Jan-Micha Bodensohn; Ulf Brackmann; Liane Vogel; Anupam Sanghi; Carsten Binnig

TL;DR

This work identifies key enterprise-specific challenges related to data, tasks, and background knowledge and extensively evaluates how they affect data engineering with LLMs. The evaluation reveals that LLMs face substantial limitations in real-world enterprise scenarios, where accuracy declines sharply.

Abstract

Large Language Models (LLMs) promise to automate data engineering on tabular data, offering enterprises a valuable opportunity to cut the high costs of manual data handling. But the enterprise domain comes with unique challenges that existing LLM-based approaches for data engineering often overlook, such as large table sizes, more complex tasks, and the need for internal knowledge. To bridge these gaps, we identify key enterprise-specific challenges related to data, tasks, and background knowledge and extensively evaluate how they affect data engineering with LLMs. Our analysis reveals that LLMs face substantial limitations in real-world enterprise scenarios, with accuracy declining sharply. Our findings contribute to a systematic understanding of LLMs for enterprise data engineering to support their adoption in industry.
