Collaborative Multi-Agent Reinforcement Learning for Automated Feature Transformation with Graph-Driven Path Optimization

Xiaohan Huang, Dongjie Wang, Zhiyuan Ning, Ziyue Qiao, Qingqing Long, Haowei Zhu, Yi Du, Min Wu, Yuanchun Zhou, Meng Xiao

TL;DR

TCTO presents a graph-driven, collaborative MARL framework for automated feature transformation on tabular data. By evolving a traceable transformation roadmap, performing group-wise clustering, and coordinating three agents with a dual reward structure, it achieves improved downstream performance while maintaining interpretability through pruning and backtracking. Empirical results across diverse datasets demonstrate robust gains and scalable behavior, with ablations confirming the roadmap and history-aware components as key drivers. The work highlights the practical impact of roadmap-based feature engineering and points to future integration with large language models and broader scientific domains.

Abstract

Feature transformation methods aim to find an optimal mathematical feature-feature crossing process that generates high-value features and improves the performance of downstream machine learning tasks. Existing frameworks, though designed to reduce manual effort, often treat feature transformations as isolated operations and ignore the dynamic dependencies between transformation steps. To address these limitations, we propose TCTO, a collaborative multi-agent reinforcement learning framework that automates feature engineering through graph-driven path optimization. The framework's core innovation is an evolving interaction graph that models features as nodes and transformations as edges. Through graph pruning and backtracking, it dynamically eliminates low-impact edges, reduces redundant operations, and enhances exploration stability. The graph also provides full traceability, which allows TCTO to reuse high-utility subgraphs from historical transformations. To demonstrate the efficacy and adaptability of our approach, we conduct comprehensive experiments and case studies, which show superior performance across a range of datasets.
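
To make the graph idea concrete, the following is a minimal sketch of a roadmap in which features are nodes and transformations are edges, with lineage tracing and a simple pruning step. It is not the authors' implementation: the `Roadmap` class, its method names, the restriction to unary operations, the utility `scores`, and the threshold-based pruning rule are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Roadmap:
    """Hypothetical roadmap: features are nodes, transformations are edges."""
    values: dict = field(default_factory=dict)  # node name -> feature column
    edges: list = field(default_factory=list)   # (parent, op_name, child)

    def add_feature(self, name, column):
        # Register an original (untransformed) feature as a root node.
        self.values[name] = list(column)

    def transform(self, parent, op_name, op):
        # Apply a unary operation to a node and record the edge parent -> child.
        child = f"{op_name}({parent})"
        self.values[child] = [op(x) for x in self.values[parent]]
        self.edges.append((parent, op_name, child))
        return child

    def lineage(self, node):
        # Trace a generated feature back to its original feature.
        parents = {child: parent for parent, _, child in self.edges}
        path = [node]
        while path[-1] in parents:
            path.append(parents[path[-1]])
        return path[::-1]

    def prune(self, scores, threshold):
        # Assumed rule: drop generated nodes scoring below the threshold,
        # plus everything derived from them, so no edge is left dangling.
        generated = {child for _, _, child in self.edges}
        doomed = {n for n in generated if scores.get(n, 0.0) < threshold}
        changed = True
        while changed:
            changed = False
            for parent, _, child in self.edges:
                if parent in doomed and child not in doomed:
                    doomed.add(child)
                    changed = True
        self.edges = [e for e in self.edges if e[2] not in doomed]
        for n in doomed:
            del self.values[n]
        return doomed


roadmap = Roadmap()
roadmap.add_feature("f_h", [0.0, 1.0, 2.0])
f_t = roadmap.transform("f_h", "sin", math.sin)
f_u = roadmap.transform(f_t, "abs", abs)
print(roadmap.lineage(f_u))  # ['f_h', 'sin(f_h)', 'abs(sin(f_h))']
```

Because every generated feature keeps an edge back to its parent, any node can be explained by its lineage, and pruning a low-utility node can also remove the features derived from it. Binary operations, backtracking, and the agents' decision process are outside the scope of this sketch.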

Paper Structure

This paper contains 29 sections, 8 equations, 11 figures, 3 tables.

Figures (11)

  • Figure 1: High-quality features contribute to the performance of machine learning models.
  • Figure 2: A summary of the technical contributions.
  • Figure 3: An example of a feature transformation roadmap update: applying the $\sin$ operation to the feature $f_h$ generates the feature $f_t$. The embedding of node $v_t$ can be derived from the statistical description of the generated feature $f_t$.
  • Figure 4: An overview of our framework: (a) cluster and represent the nodes on the roadmap; (b) represent the node clusters; (c) reinforce multi-agent feature transformation decision generation; (d) generate high-quality features; (e) prune the roadmap.
  • Figure 5: The reinforcement learning decision process. Three agents collaborate to generate a binary transformation.
  • ...and 6 more figures