Knowledge-Informed Automatic Feature Extraction via Collaborative Large Language Model Agents

Henrik Bradland, Morten Goodwin, Vladimir I. Zadorozhny, Per-Arne Andersen

TL;DR

Rogue One introduces a knowledge-informed AutoFE framework built on three collaborative LLM agents (Scientist, Extractor, Tester) that iteratively generate and validate features. It combines a flooding-pruning strategy with a retrieval-augmented generation (RAG) knowledge base to integrate external domain knowledge, yielding semantically meaningful and interpretable features. Empirically, Rogue One outperforms state-of-the-art AutoFE methods on 19 classification and 9 regression datasets, achieving high accuracy and robust performance, while also surfacing novel hypotheses such as a potential biomarker in a myocardial dataset. This work demonstrates the value of multi-agent collaboration and rich qualitative feedback for feature discovery, offering a scalable and interpretable approach to knowledge-informed AutoFE with potential implications for scientific discovery in medicine, finance, and engineering.

Abstract

The performance of machine learning models on tabular data is critically dependent on high-quality feature engineering. While Large Language Models (LLMs) have shown promise in automating feature extraction (AutoFE), existing methods are often limited by monolithic LLM architectures, simplistic quantitative feedback, and a failure to systematically integrate external domain knowledge. This paper introduces Rogue One, a novel, LLM-based multi-agent framework for knowledge-informed automatic feature extraction. Rogue One operationalizes a decentralized system of three specialized agents (Scientist, Extractor, and Tester) that collaborate iteratively to discover, generate, and validate predictive features. Crucially, the framework moves beyond primitive accuracy scores by introducing a rich, qualitative feedback mechanism and a "flooding-pruning" strategy, allowing it to dynamically balance feature exploration and exploitation. By actively incorporating external knowledge via an integrated retrieval-augmented generation (RAG) system, Rogue One generates features that are not only statistically powerful but also semantically meaningful and interpretable. We demonstrate that Rogue One significantly outperforms state-of-the-art methods on a comprehensive suite of 19 classification and 9 regression datasets. Furthermore, we show qualitatively that the system surfaces novel, testable hypotheses, such as identifying a new potential biomarker in the myocardial dataset, underscoring its utility as a tool for scientific discovery.

Paper Structure

This paper contains 21 sections, 3 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: The Rogue One system architecture operates as a continuous iterative loop, organized into three core stages, each managed by a specialized agent. The cycle commences with the Scientist Agent (blue background), which analyzes the Test Pool data from prior iterations to define a new Focus Area. Subsequently, the Extractor Agent (green) leverages this Focus Area to generate new candidate features, which are then appended to the central Feature Pool. Finally, the Tester Agent (yellow) orchestrates the evaluation and pruning phase. It employs ML models to generate Performance Metrics and a Feature Assessment, which together update the Test Pool. Concurrently, its Feature Pruning function refines the Feature Pool, completing the loop and preparing the system for the next iteration.
  • Figure 2: The number of features in the best solution found by Rogue One for the tabular datasets (see the classification and regression results tables), plotted against the number of entries ($n$) and attributes ($p$) in the raw data on a log-log scale.
  • Figure 3: Normalized RMSE scores and the number of features currently in the feature pool (feature count) across iterations of Rogue One on the Bike dataset (see the regression results table). Note: iterations 5, 6, and 10 are not evaluated because the Extractor Agent did not produce any new features.
  • Figure 4: The relationship between Rogue One's runtime and the number of entries in each dataset. Pearson correlation $\rho = 0.949$.