NeurIDA: Dynamic Modeling for Effective In-Database Analytics

Lingze Zeng, Naili Xing, Shaofeng Cai, Peng Lu, Gang Chen, Jian Pei, Beng Chin Ooi

TL;DR

NeurIDA addresses the rigidity of static ML models in dynamic RDBMS environments by introducing a dynamic in-database modeling paradigm. It automates task understanding, base-model selection, and on-the-fly model augmentation (via DIME, the Dynamic In-Database Modeling Engine) to tailor predictions to complex relational data, and it offers a natural language query (NLQ) interface with interpretable reports. Across five real-world databases and ten tasks, NeurIDA consistently improves classification AUC-ROC and reduces regression MAE, and ablations confirm that both dynamic relation modeling and dynamic model fusion contribute to these gains. This work demonstrates a practical, scalable path toward autonomous AI-powered analytics directly inside relational databases, with open-source code to promote adoption.

Abstract

Relational Database Management Systems (RDBMS) manage complex, interrelated data and support a broad spectrum of analytical tasks. With the growing demand for predictive analytics, the deep integration of machine learning (ML) into RDBMS has become critical. However, a fundamental challenge hinders this evolution: conventional ML models are static and task-specific, whereas RDBMS environments are dynamic and must support diverse analytical queries. Each analytical task entails constructing a bespoke pipeline from scratch, which incurs significant development overhead and hence limits the wide adoption of ML in analytics. We present NeurIDA, an autonomous end-to-end system for in-database analytics that dynamically "tweaks" the best available base model to better serve a given analytical task. In particular, we propose a novel paradigm of dynamic in-database modeling to pre-train a composable base model architecture over the relational data. Upon receiving a task, NeurIDA formulates the task and data profile to dynamically select and configure relevant components from the pool of base models and shared model components for prediction. For a friendly user experience, NeurIDA supports natural language queries: it interprets user intent to construct structured task profiles and generates analytical reports with dedicated LLM agents. By design, NeurIDA makes in-database AI analytics easy to use while remaining effective and efficient. An extensive experimental study shows that NeurIDA consistently delivers up to 12% improvement in AUC-ROC and 25% relative reduction in MAE across ten tasks on five real-world datasets. The source code is available at https://github.com/Zrealshadow/NeurIDA
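
As a reading aid for the "relative reduction in MAE" figure quoted in the abstract, the sketch below shows how such a number is typically computed. The function names and the toy values are illustrative assumptions and are not taken from the paper or its repository; the numbers are chosen only so that the arithmetic works out to 25%.

```python
# Minimal sketch: mean absolute error (MAE) and relative reduction.
# All names and values here are hypothetical, for illustration only.

def mean_absolute_error(y_true, y_pred):
    """Average of |true - predicted| over all samples."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline_error, improved_error):
    """Fraction by which the error drops relative to the baseline."""
    if baseline_error <= 0:
        raise ValueError("baseline error must be positive")
    return (baseline_error - improved_error) / baseline_error


# Toy regression targets and two sets of predictions (made-up numbers).
y_true = [10, 20, 30, 40]
baseline_pred = [14, 16, 34, 36]   # every prediction is off by 4
improved_pred = [13, 17, 33, 37]   # every prediction is off by 3

baseline_mae = mean_absolute_error(y_true, baseline_pred)
improved_mae = mean_absolute_error(y_true, improved_pred)
reduction = relative_reduction(baseline_mae, improved_mae)

print(f"baseline MAE = {baseline_mae}, improved MAE = {improved_mae}")
print(f"relative reduction = {reduction:.0%}")
```

Under this reading, a 25% relative reduction means the improved model's MAE is three quarters of the baseline's, regardless of the absolute scale of the target.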

Paper Structure

This paper contains 15 sections, 13 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: The workflow and key components of NeurIDA.
  • Figure 2: Workflow of the Query Intent Analyzer, illustrating how queries are parsed into the task profile.
  • Figure 3: Overview of the Dynamic In-Database Modeling Engine.
  • Figure 4: Ablation study evaluating the contributions of Dynamic Relation Modeling (w/o Relation) and Dynamic Model Fusion (w/o Fusion).
  • Figure 5: Cost analysis. Numbers on each bar denote the relative computation overhead introduced by NeurIDA.
  • ...and 3 more figures