Towards Automated Machine Learning Research
Shervin Ardeshir
TL;DR
The paper proposes a top-down, LLM-driven framework for automated ML research that generates and evaluates novel ML components via a Generator–Validator–Evaluator loop, augmented by a Reward model that ranks hypotheses using $B\text{-}WR$ and $BSOTA\text{-}WR$. It treats hypothesis generation as a cross-domain exploration problem, enabling the discovery of non-traditional components (e.g., activation functions, preprocessors, regularizers) and testing them in a lightweight, one-pass setting before committing substantial compute. An empirical study generates 36,000 hypotheses across three component types with three LLMs and two prompting strategies, demonstrating that while most hypotheses underperform, a subset can outperform baselines, and that reward models generalize across LLM sources. The work highlights risks such as reward collapse and inductive bias toward SOTA methods. It also outlines future directions: broadening component coverage, refining the reward mechanism, and moving toward differentiable integration and broader credit assignment.
Abstract
This paper explores a top-down approach to automating incremental advances in machine learning research through component-level innovation, facilitated by Large Language Models (LLMs). Our framework systematically generates novel components, validates their feasibility, and evaluates their performance against existing baselines. A key distinction of this approach lies in how these novel components are generated. Unlike traditional AutoML and NAS methods, which often rely on a bottom-up combinatorial search over predefined, hard-coded base components, our method leverages the cross-domain knowledge embedded in LLMs to propose new components that need not be confined to any such predefined set. By incorporating a reward model to prioritize promising hypotheses, we aim to improve the efficiency of the hypothesis generation and evaluation process. We hope this approach offers a new avenue for exploration and contributes to the ongoing dialogue in the field.
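To make the generate, validate, evaluate pipeline concrete, the following is a minimal sketch and not the authors' implementation. Every name in it (`generate`, `validate`, `win_rate`, `run_loop`, the candidate strings, and the toy tasks) is hypothetical. The LLM is replaced by a fixed dictionary of candidate activation functions, and the reward model is omitted: candidates are ranked directly by a measured win rate, which is assumed here to mean the fraction of toy tasks on which a candidate scores better than a ReLU baseline. The abstract does not define the metric, so this reading is an assumption.

```python
import math
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in for LLM output: activation functions as source strings.
CANDIDATES: Dict[str, str] = {
    "softsign": "def f(x):\n    return x / (1.0 + abs(x))",
    "broken_syntax": "def f(x) return x",
    "not_finite": "def f(x):\n    return float('inf')",
    "scaled_tanh": "def f(x):\n    return 1.5 * math.tanh(x)",
}


def relu(x: float) -> float:
    """Baseline component used for comparison in this sketch."""
    return max(0.0, x)


def generate() -> Dict[str, str]:
    """Generator stub: a real system would prompt an LLM here."""
    return dict(CANDIDATES)


def validate(source: str) -> Optional[Callable[[float], float]]:
    """Validator: keep a candidate only if it runs and returns finite floats.

    exec() on untrusted code is unsafe; a real system would sandbox this step.
    """
    namespace = {"math": math}
    try:
        exec(source, namespace)
        f = namespace["f"]
        outputs = [f(x) for x in (-2.0, 0.0, 2.0)]
    except Exception:
        return None
    if all(isinstance(y, float) and math.isfinite(y) for y in outputs):
        return f
    return None


# Toy "tasks" (an assumption): how closely a candidate matches a target curve.
TASKS: List[Callable[[float], float]] = [
    math.tanh,
    lambda x: x / (1.0 + abs(x)),
    relu,
]


def score(f: Callable[[float], float], target: Callable[[float], float]) -> float:
    """Negative mean squared error against the target on a fixed grid."""
    grid = [i / 4.0 for i in range(-8, 9)]
    return -sum((f(x) - target(x)) ** 2 for x in grid) / len(grid)


def win_rate(f: Callable[[float], float], baseline: Callable[[float], float]) -> float:
    """Evaluator: fraction of tasks where the candidate strictly beats the baseline."""
    wins = sum(score(f, t) > score(baseline, t) for t in TASKS)
    return wins / len(TASKS)


def run_loop(baseline: Callable[[float], float] = relu) -> List[Tuple[str, float]]:
    """One pass: generate, drop invalid candidates, rank the rest by win rate."""
    ranked = []
    for name, source in generate().items():
        f = validate(source)
        if f is None:
            continue
        ranked.append((name, win_rate(f, baseline)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, rate in run_loop():
        print(f"{name}: win rate vs. baseline = {rate:.2f}")
```

In this sketch the validator cheaply filters out candidates that fail to compile or produce non-finite values, so only feasible hypotheses reach the costlier evaluation step. A learned reward model, as described in the abstract, would sit before the evaluator and prioritize which validated hypotheses to evaluate first.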
