Agentic Feature Augmentation: Unifying Selection and Generation with Teaming, Planning, and Memories
Nanxu Gong, Sixun Dong, Haoyue Bai, Xinyuan Wang, Wangyang Ying, Yanjie Fu
TL;DR
This paper tackles the decoupled nature of feature selection and generation by proposing MAGS, a router-generator-selector three-agent system that treats feature sets as token sequences and uses long- and short-term memories alongside offline reinforcement learning to plan feature augmentations. The approach enables a unified, planner-driven search over a large discrete feature space, guided by in-context learning and memory-augmented prompts to enhance feature quality and reduce redundancy. Empirical results across six datasets show that MAGS consistently outperforms diverse baselines, with ablations confirming the importance of the router, memories, and RL fine-tuning for achieving robust, task-specific feature representations. While effective, the method incurs computational overhead and token-limit constraints inherent to LLM-based and memory-augmented architectures, and its gains are most pronounced in dataset- and task-specific settings.
Abstract
As a widely used and practical tool, feature engineering transforms raw data into discriminative features to advance AI model performance. However, existing methods usually apply feature selection and generation separately, failing to strike a balance between reducing redundancy and adding meaningful dimensions. To fill this gap, we propose an agentic feature augmentation concept, where the unification of feature generation and selection is modeled as agentic teaming and planning. Specifically, we develop a Multi-Agent System with Long and Short-Term Memory (MAGS), comprising a selector agent to eliminate redundant features, a generator agent to produce informative new dimensions, and a router agent that strategically coordinates their actions. We leverage in-context learning with short-term memory for immediate feedback refinement and long-term memory for globally optimal guidance. Additionally, we employ offline Proximal Policy Optimization (PPO) reinforcement fine-tuning to train the router agent to make effective decisions when navigating a vast discrete feature space. Extensive experiments demonstrate that this unified agentic framework consistently achieves superior task performance by intelligently orchestrating feature selection and generation.
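
The router, generator, and selector loop described above can be illustrated with a minimal sketch. It is not the MAGS implementation: in the paper the three agents are LLM-based and the router is fine-tuned with offline PPO, whereas here every agent is replaced by a simple hand-written stand-in. All names (`generate`, `select`, `route`, `augment`, `toy_score`), the memory sizes, and the scoring rule are hypothetical and chosen only to show how a feature set treated as a token sequence might be edited step by step under short- and long-term memory.

```python
import random
from collections import deque


def generate(features, rng):
    """Generator stand-in: append a crossed feature built from two existing ones."""
    a, b = rng.sample(features, 2)
    return features + [f"({a}*{b})"]


def select(features, rng):
    """Selector stand-in: drop one feature at random (never below two features)."""
    if len(features) <= 2:
        return list(features)
    drop = rng.randrange(len(features))
    return features[:drop] + features[drop + 1:]


def route(short_term, rng):
    """Router stand-in (heuristic, NOT PPO): repeat an action that helped, else switch."""
    if not short_term:
        return rng.choice(["generate", "select"])
    last_action, last_gain = short_term[-1]
    if last_gain > 0:
        return last_action
    return "select" if last_action == "generate" else "generate"


def augment(features, evaluate, steps=20, seed=0, top_k=3):
    """Run the routing loop and return the best (score, feature set) pair seen."""
    rng = random.Random(seed)
    short_term = deque(maxlen=5)  # recent (action, score gain) feedback
    current, current_score = list(features), evaluate(features)
    long_term = [(current_score, list(current))]  # best feature sets so far
    for _ in range(steps):
        action = route(short_term, rng)
        step = generate if action == "generate" else select
        candidate = step(current, rng)
        score = evaluate(candidate)
        short_term.append((action, score - current_score))
        long_term.append((score, candidate))
        long_term.sort(key=lambda pair: pair[0], reverse=True)
        del long_term[top_k:]  # keep only the top_k entries
        if score >= current_score:  # accept non-worsening edits
            current, current_score = candidate, score
    return long_term[0]


def toy_score(features):
    """Hypothetical evaluator: rewards crossed features, penalizes set size."""
    crossed = sum(1 for f in features if "*" in f)
    return crossed - 0.3 * len(features)


best_score, best_features = augment(["f0", "f1", "f2"], toy_score)
```

In practice `evaluate` would be the downstream task metric, for example the validation score of a model trained on the candidate feature set, and the two memories would be rendered into the agents' prompts rather than consulted by a fixed rule.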
