LLM-FS-Agent: A Deliberative Role-based Large Language Model Architecture for Transparent Feature Selection
Mohamed Bal-Ghaoui, Fayssal Sabri
TL;DR
This work tackles high-dimensionality and interpretability in feature selection by introducing LLM-FS-Agent, a deliberative, role-based multi-agent LLM framework that generates transparent justifications for feature importance. The architecture enlists Initiator, Refiner, Challenger, and Judge agents to debate feature relevance, yielding a final score $S_{\text{final}} = w_r \cdot S_{\text{refined}} + w_c \cdot S_{\text{challenged}}$ with $w_r + w_c = 1$. Evaluated on the CIC-DIAD 2024 IoT intrusion detection dataset, LLM-FS-Agent outperforms or matches strong baselines (including LLM-Select and PCA) while reducing downstream training time by 46% (p = 0.028, Cohen's d = 0.87). The results demonstrate that deliberation improves decision transparency and computational efficiency, with robust performance across multiple LLM backbones and downstream classifiers. Overall, the approach offers a practical, auditable solution for high-stakes domains like cybersecurity intrusion detection.
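To make the scoring rule concrete, the following is a minimal Python sketch of the convex combination $S_{\text{final}} = w_r \cdot S_{\text{refined}} + w_c \cdot S_{\text{challenged}}$ with $w_r + w_c = 1$. It is illustrative only and is not the authors' implementation: the function names `final_score` and `select_top_k`, the default weight, the feature names, the example scores, and the top-k selection step are all assumptions introduced here.

```python
def final_score(s_refined: float, s_challenged: float, w_r: float = 0.5) -> float:
    """Combine the Refiner and Challenger scores for one feature.

    w_c is derived as 1 - w_r, so the constraint w_r + w_c = 1 always holds.
    The default w_r = 0.5 is an assumption, not a value from the paper.
    """
    if not 0.0 <= w_r <= 1.0:
        raise ValueError("w_r must lie in [0, 1]")
    w_c = 1.0 - w_r
    return w_r * s_refined + w_c * s_challenged


def select_top_k(scores: dict[str, float], k: int) -> list[str]:
    """Return the k feature names with the highest final scores.

    Top-k selection is an assumed downstream step for illustration.
    """
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical (S_refined, S_challenged) pairs for three made-up features.
candidates = {
    "flow_duration": (0.9, 0.7),
    "pkt_count": (0.6, 0.2),
    "ttl": (0.3, 0.5),
}

scores = {
    name: final_score(s_ref, s_chal, w_r=0.6)
    for name, (s_ref, s_chal) in candidates.items()
}
selected = select_top_k(scores, k=2)
```

Because $w_c$ is computed from $w_r$ rather than passed separately, the two weights cannot drift out of sync; a larger $w_r$ gives the Refiner's assessment more influence over the final ranking.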
Abstract
High-dimensional data remains a pervasive challenge in machine learning, often undermining model interpretability and computational efficiency. While Large Language Models (LLMs) have shown promise for dimensionality reduction through feature selection, existing LLM-based approaches frequently lack structured reasoning and transparent justification for their decisions. This paper introduces LLM-FS-Agent, a novel multi-agent architecture designed for interpretable and robust feature selection. The system orchestrates a deliberative "debate" among multiple LLM agents, each assigned a specific role, enabling collective evaluation of feature relevance and generation of detailed justifications. We evaluate LLM-FS-Agent in the cybersecurity domain using the CIC-DIAD 2024 IoT intrusion detection dataset and compare its performance against strong baselines, including LLM-Select and traditional methods such as PCA. Experimental results demonstrate that LLM-FS-Agent consistently achieves superior or comparable classification performance while reducing downstream training time by an average of 46% (statistically significant improvement, p = 0.028 for XGBoost). These findings highlight that the proposed deliberative architecture enhances both decision transparency and computational efficiency, establishing LLM-FS-Agent as a practical and reliable solution for real-world applications.
