Large Language Models Synergize with Automated Machine Learning

Jinglue Xu; Jialong Li; Zhen Liu; Nagar Anthel Venkatesh Suryanarayanan; Guoyuan Zhou; Jia Guo; Hitoshi Iba; Kenji Tei

TL;DR

This work presents Text-to-ML, a framework that automatically generates and optimizes complete ML programs from textual task descriptions by decomposing programs into modular components and validating their compatibility with a novel testing approach. It couples autoregressive LLM code generation with autoML, using zero-cost proxies to efficiently search the space of module configurations and hyperparameters. Theoretical results show a best-case linear complexity $\mathcal{O}(\Gamma)$ for Contextual Modular Generation, in contrast to the exponential best-case complexity $\mathcal{O}(\exp(\Gamma))$ that Theorem 3.5 gives for processes that generate the entire output sequence in a single step. In experiments across 12 tasks, the method outperforms existing methods on 10, and autoML further improves the performance of the generated programs. The approach offers a path toward democratized, end-to-end automatic ML programming, backed by an open-source implementation and a testing framework that checks module compatibility and supports numerical evaluation.

Abstract

Recently, program synthesis driven by large language models (LLMs) has become increasingly popular. However, program synthesis for machine learning (ML) tasks still poses significant challenges. This paper explores a novel form of program synthesis, targeting ML programs, by combining LLMs and automated machine learning (autoML). Specifically, our goal is to fully automate the generation and optimization of the code of the entire ML workflow, from data preparation to modeling and post-processing, utilizing only textual descriptions of the ML tasks. To manage the length and diversity of ML programs, we propose to break each ML program into smaller, manageable parts. Each part is generated separately by the LLM, with careful consideration of their compatibilities. To ensure compatibilities, we design a testing technique for ML programs. Unlike traditional program synthesis, which typically relies on binary evaluations (i.e., correct or incorrect), evaluating ML programs necessitates more than just binary judgments. Our approach automates the numerical evaluation and optimization of these programs, selecting the best candidates through autoML techniques. In experiments across various ML tasks, our method outperforms existing methods in 10 out of 12 tasks for generating ML programs. In addition, autoML significantly improves the performance of the generated ML programs. In experiments, given the textual task description, our method, Text-to-ML, generates the complete and optimized ML program in a fully autonomous process. The implementation of our method is available at https://github.com/JLX0/llm-automl.
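The decomposition and compatibility testing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the functions `prepare`, `bad_model`, and `good_model` are hypothetical stand-ins for LLM-generated modules, and `is_compatible` is an assumed, simplified form of a unit test that checks whether a candidate module accepts the output of the module before it.

```python
from typing import Callable, List, Optional, Sequence

Module = Callable[[list], list]


def is_compatible(candidate: Module, upstream_output: list) -> bool:
    """Assumed unit test: the candidate must run on the upstream output
    and return a non-empty list of floats."""
    try:
        result = candidate(upstream_output)
    except Exception:
        return False
    return (
        isinstance(result, list)
        and len(result) > 0
        and all(isinstance(v, float) for v in result)
    )


def first_compatible(candidates: Sequence[Module], upstream_output: list) -> Optional[Module]:
    """Return the first candidate that passes the compatibility test, else None."""
    for candidate in candidates:
        if is_compatible(candidate, upstream_output):
            return candidate
    return None


# Hypothetical stand-ins for LLM-generated modules.
def prepare(raw: List[str]) -> List[float]:
    """Data preparation: parse strings into floats."""
    return [float(x) for x in raw]


def bad_model(features):
    """Incompatible candidate: expects a dict, but upstream yields a list."""
    return [features["x"] * 2.0]


def good_model(features: List[float]) -> List[float]:
    """Compatible candidate: center the features around their mean."""
    mean = sum(features) / len(features)
    return [v - mean for v in features]


sample = prepare(["1", "2", "3"])
chosen = first_compatible([bad_model, good_model], sample)
```

In this sketch each module is tested only against the output of its predecessor, so a failing candidate is replaced without regenerating the rest of the program.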

Paper Structure (29 sections, 2 theorems, 4 equations, 8 figures, 4 tables, 2 algorithms)


Introduction
Background, challenges, and motivation
Automatic code generation
Automatic unit testing
Automatic code optimization
Summary
Related Works
Large language models
Automated machine learning
Software engineering
Generation of Lengthy and Complex Sequences
Decomposition and compatibility
Formal definition
Theoretical analysis
Unit Tests for Machine Learning
...and 14 more sections

Key Result

Theorem 3.5

For any sequence generation process, if there exists a step in the process that generates the entire output sequence and Assumption ass:1 holds for that step, then the best-case complexity $\mathbb{E}_{x\sim P}[G(\Gamma)]$ of the process is $\mathcal{O}(\exp(\Gamma))$.
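
A toy calculation may help build intuition for why generating an entire sequence in one step can be exponentially costly while modular generation can be linear. It is not the paper's proof and does not use the paper's assumption. It assumes, purely for illustration, that each of the $\Gamma$ tokens is correct independently with probability $p \in (0,1)$, that a unit is regenerated until it passes a test, that cost is counted in generation attempts, and that the modular variant splits the sequence into $\Gamma/m$ independently tested modules of fixed length $m$.

```latex
% Whole-sequence generation: one attempt succeeds with probability p^{\Gamma},
% so the number of attempts is geometric with mean
\mathbb{E}[\text{attempts}_{\text{whole}}]
  = \frac{1}{p^{\Gamma}}
  = \exp\!\bigl(\Gamma \ln(1/p)\bigr)

% Modular generation: each of the \Gamma/m modules succeeds with probability p^{m}
% and is retried on its own, so the total expected number of attempts is
\mathbb{E}[\text{attempts}_{\text{modular}}]
  = \frac{\Gamma}{m} \cdot \frac{1}{p^{m}}
  = \mathcal{O}(\Gamma) \quad \text{for fixed } m
```

Under these assumptions the first quantity grows exponentially in $\Gamma$ and the second linearly, which matches the qualitative contrast stated in the TL;DR.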

Figures (8)

Figure 1: One-click program synthesis for machine learning. This figure presents an example of our qualitative results based on the dataset Learning. Our method, Text-to-ML, transforms a textual task description into an optimized program, encompassing data preparation, modeling, post-processing, and hyperparameters. All modules are seamlessly integrated into a single executable program. The entire process, from the user's input of a textual description to the creation of an end-to-end, optimized, and executable program, is fully automated. The code examples in the figure are folded and abbreviated for readability.
Figure 2: Program synthesis driven by autoregressive LLMs. The desired functions are provided as instructions to the LLM, which generates the corresponding code. An optional testing phase can be included, where the code is tested for errors, and the test results provide feedback for the LLM to debug and refine the code.
Figure 3: Text-to-ML: a three-phase process of generation, testing, and optimization. First, given the task description, the LLM generates the modules. Unit tests then verify that the modules are compatible with one another, after which they are automatically integrated into a program. This process is repeated to produce multiple candidate programs. Finally, autoML selects the best program and its associated hyperparameters.
Figure 4: The autoML step in our method. Each solution is a program that implements a candidate combination of data preparation, modeling, and post-processing algorithms and associated hyperparameters. First, the optimization algorithm chooses a solution. This chosen solution is then trained and evaluated, which provides feedback to improve selections in subsequent iterations.
Figure 5: Program synthesis for ML. (1) black rectangle: the goal of this study. (2) red rectangles: challenges. (3) green rectangles: methods or features of methods. (4) solid rectangles: novel goals, methods, or features introduced in our study. (5) dashed rectangles: existing challenges or methods.
...and 3 more figures
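
The select, evaluate, and record loop in the Figure 4 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's optimizer: it substitutes plain random search, which does not use feedback from earlier evaluations to guide later choices, and the candidate names `prog_a` and `prog_b`, the learning-rate grid, and the `evaluate` scoring function are all hypothetical.

```python
import random
from typing import List, Tuple

Trial = Tuple[float, str, float]  # (score, program name, learning rate)


def evaluate(program: str, lr: float) -> float:
    """Hypothetical validation score (higher is better).
    Stands in for training and evaluating a candidate program."""
    base = {"prog_a": 0.70, "prog_b": 0.80}[program]
    return base - abs(lr - 0.1)


def random_search(
    programs: List[str], lr_choices: List[float], budget: int, seed: int = 0
) -> Tuple[Trial, List[Trial]]:
    """Sample (program, hyperparameter) pairs, evaluate each, and keep the best."""
    rng = random.Random(seed)
    history: List[Trial] = []
    for _ in range(budget):
        program = rng.choice(programs)   # choose a candidate program
        lr = rng.choice(lr_choices)      # choose its hyperparameter
        history.append((evaluate(program, lr), program, lr))
    best = max(history)                  # highest score wins
    return best, history


programs = ["prog_a", "prog_b"]
lr_choices = [0.01, 0.1, 1.0]
best, history = random_search(programs, lr_choices, budget=20)
```

An optimizer that uses the recorded scores to bias later selections, as the caption describes, would replace the two `rng.choice` calls with a proposal step conditioned on `history`.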

Theorems & Definitions (7)

Definition 3.1
Definition 3.2
Remark 3.3
Theorem 3.5
Remark 3.6
Theorem 3.8
Remark 3.9
