Agentic Feature Augmentation: Unifying Selection and Generation with Teaming, Planning, and Memories

Nanxu Gong, Sixun Dong, Haoyue Bai, Xinyuan Wang, Wangyang Ying, Yanjie Fu

TL;DR

This paper tackles the decoupled nature of feature selection and generation by proposing MAGS, a router-generator-selector three-agent system that treats feature sets as token sequences and uses long- and short-term memories alongside offline reinforcement learning to plan feature augmentations. The approach enables a unified, planner-driven search over a large discrete feature space, guided by in-context learning and memory-augmented prompts to enhance feature quality and reduce redundancy. Empirical results across six datasets show that MAGS consistently outperforms diverse baselines, with ablations confirming the importance of the router, memories, and RL fine-tuning for achieving robust, task-specific feature representations. While effective, the method incurs computational overhead and token-limit constraints inherent to LLM-based and memory-augmented architectures, and its gains are most pronounced in dataset- and task-specific settings.

Abstract

As a widely used and practical tool, feature engineering transforms raw data into discriminative features to advance AI model performance. However, existing methods usually apply feature selection and generation separately, failing to strike a balance between reducing redundancy and adding meaningful dimensions. To fill this gap, we propose an agentic feature augmentation concept, where the unification of feature generation and selection is modeled as agentic teaming and planning. Specifically, we develop a Multi-Agent System with Long and Short-Term Memory (MAGS), comprising a selector agent to eliminate redundant features, a generator agent to produce informative new dimensions, and a router agent that strategically coordinates their actions. We leverage in-context learning with short-term memory for immediate feedback refinement and long-term memory for globally optimal guidance. Additionally, we employ offline Proximal Policy Optimization (PPO) reinforcement fine-tuning to train the router agent for effective decision-making to navigate a vast discrete feature space. Extensive experiments demonstrate that this unified agentic framework consistently achieves superior task performance by intelligently orchestrating feature selection and generation.
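
To make the router-generator-selector loop described in the abstract more concrete, the following is a minimal toy sketch, not the authors' implementation. Everything in it is an assumption made for illustration: the names `score`, `generate`, `select`, `route`, and `run_episode` are hypothetical, the scoring function is a hand-written stand-in for downstream task performance, and the router is a simple heuristic rather than an LLM fine-tuned with offline PPO. It only shows how a short-term memory of recent feedback and a long-term memory of the best feature sets could steer the choice between generation and selection.

```python
import random
from collections import deque


def score(features):
    # Hypothetical stand-in for downstream task performance:
    # reward a fixed set of "useful" features, penalize set size.
    useful = {"f1", "f2", "f1*f2", "f2+f3"}
    return len(useful & features) - 0.1 * len(features)


def generate(features, rng):
    # Generator stand-in: combine two existing features into a new one.
    a, b = sorted(rng.sample(sorted(features), 2))
    op = rng.choice(["+", "*"])
    return features | {f"{a}{op}{b}"}


def select(features):
    # Selector stand-in: drop the feature whose removal scores best.
    if len(features) <= 2:
        return features
    worst = max(sorted(features), key=lambda f: score(features - {f}))
    return features - {worst}


def route(short_term):
    # Router stand-in (heuristic, NOT a PPO-trained LLM): prune after a
    # generation step that did not help, otherwise keep generating.
    if short_term and short_term[-1][0] == "generate" and short_term[-1][1] <= 0:
        return "select"
    return "generate"


def run_episode(initial, steps=20, seed=0, top_k=3):
    rng = random.Random(seed)
    features = frozenset(initial)
    short_term = deque(maxlen=4)             # recent (action, gain) feedback
    long_term = {features: score(features)}  # best feature sets seen so far
    for _ in range(steps):
        action = route(short_term)
        if action == "generate":
            candidate = generate(features, rng)
        else:
            candidate = select(features)
        short_term.append((action, score(candidate) - score(features)))
        features = candidate
        long_term[features] = score(features)
        # Keep only the top_k highest-scoring feature sets.
        ranked = sorted(long_term.items(), key=lambda kv: kv[1], reverse=True)
        long_term = dict(ranked[:top_k])
    # Return the best (feature set, score) pair from long-term memory.
    return max(long_term.items(), key=lambda kv: kv[1])


best_features, best_score = run_episode({"f1", "f2", "f3"})
```

In this sketch the long-term memory always retains the best-scoring feature set seen so far, so the returned score can never fall below that of the initial set. In the paper, the memories are instead used to build prompts for the agents.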

Paper Structure

This paper contains 23 sections, 3 equations, 11 figures, 4 tables.

Figures (11)

  • Figure 1: Example of feature selection, feature generation, and unifying feature selection and generation.
  • Figure 2: Framework overview. The left section illustrates the overall framework, where a router is first employed to determine whether to perform feature generation or selection, followed by the execution of the corresponding operation. The right section depicts the construction of long and short-term memory representations, along with the offline reinforcement learning process used to train the router.
  • Figure 4: Case study. We visualize the reconstructed feature set on the dataset openml_586.
  • Figure 5: Problem template of router.
  • Figure 6: Problem template of selector.
  • ...and 6 more figures