AutoAdapt: An Automated Domain Adaptation Framework for LLMs

Sidharth Sinha, Anson Bastos, Xuchao Zhang, Akshay Nambi, Chetan Bansal, Saravan Rajmohan

TL;DR

This work presents AutoAdapt, a novel end-to-end automated framework for efficient and reliable LLM domain adaptation that leverages curated knowledge bases from literature and open-source resources to reduce expert intervention and optimize hyperparameters under tight budgets.

Abstract

Large language models (LLMs) excel in open domains but struggle in specialized settings with limited data and evolving knowledge. Existing domain adaptation practices rely heavily on manual trial-and-error processes, incur significant hyperparameter complexity, and are highly sensitive to data and user preferences, all under the high cost of LLM training. Moreover, the interactions and transferability of hyperparameter choices across models/domains remain poorly understood, making adaptation gains uncertain even with substantial effort. To solve these challenges, we present AutoAdapt, a novel end-to-end automated framework for efficient and reliable LLM domain adaptation. AutoAdapt leverages curated knowledge bases from literature and open-source resources to reduce expert intervention. To narrow the search space, we design a novel multi-agent debating system in which proposal and critic agents iteratively interact to align user intent and incorporate data signals and best practices into the planning process. To optimize hyperparameters under tight budgets, we propose AutoRefine, a novel LLM-based surrogate that replaces costly black-box search. Across 10 tasks, AutoAdapt achieves a 25% average relative accuracy improvement over state-of-the-art Automated Machine Learning baselines with minimal overhead.

Paper Structure

This paper contains 36 sections, 3 theorems, 30 equations, 11 figures, 8 tables, and 1 algorithm.

Key Result

Proposition 3.1

Let $\mathcal{H}(c)$ be the feasible configuration space and $\mathcal{P}(c) = \mathcal{P}_1 \times \cdots \times \mathcal{P}_T$ the prior-guided subspace. Denote the expected task loss by $L_{\mathbb{E}}(h)$. The optimal loss over the full space is $L_{\mathcal{H}}^{*} = \min_{h\in \mathcal{H}(c)} L_{\mathbb{E}}(h)$ […] where $L$ is the Lipschitz constant and $\varepsilon$ is the distance between the optimal parameter […]
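The statement above is truncated before the bound itself, so the following is only a minimal sketch of the standard Lipschitz argument that such a result typically rests on, not the paper's exact claim. It assumes a toy two-dimensional configuration grid, a hypothetical loss `expected_loss` standing in for $L_{\mathbb{E}}(h)$, and a hypothetical prior-guided product subspace; none of these names or values come from the paper. If the loss is $L$-Lipschitz and the full-space optimum lies within distance $\varepsilon$ of the subspace, then searching only the subspace costs at most $L\varepsilon$ in loss.

```python
import itertools
import math


def expected_loss(h):
    """Hypothetical stand-in for the expected task loss L_E(h)."""
    x, y = h
    return abs(x - 0.3) + 0.5 * abs(y - 0.7)


# Lipschitz constant of expected_loss w.r.t. Euclidean distance:
# |f(a) - f(b)| <= sqrt(1^2 + 0.5^2) * ||a - b|| by Cauchy-Schwarz.
LIPSCHITZ = math.sqrt(1.0 ** 2 + 0.5 ** 2)


def grid(lo, hi, steps=20):
    """Grid points i / steps for i in [lo, hi] (illustrative discretization)."""
    return [i / steps for i in range(lo, hi + 1)]


# Full feasible space H: a 21 x 21 grid over [0, 1]^2.
full_space = list(itertools.product(grid(0, 20), grid(0, 20)))

# Prior-guided subspace P = P_1 x P_2: a product of narrower per-dimension ranges.
prior_subspace = list(itertools.product(grid(8, 12), grid(10, 18)))

h_full = min(full_space, key=expected_loss)        # optimum over H
h_prior = min(prior_subspace, key=expected_loss)   # optimum over P

# epsilon: distance from the full-space optimum to the nearest point of P.
epsilon = min(math.dist(h_full, p) for p in prior_subspace)

gap = expected_loss(h_prior) - expected_loss(h_full)
bound = LIPSCHITZ * epsilon

print(f"gap={gap:.4f}, L*epsilon={bound:.4f}")
```

In this toy setting the subspace excludes the true optimum, yet the loss penalty stays within $L\varepsilon$; a subspace that contains the optimum gives $\varepsilon = 0$ and no penalty at all.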

Figures (11)

  • Figure 1: AutoAdapt takes the user instruction and dataset as input and delivers the domain-adapted LLM based on user constraints.
  • Figure 2: Framework Overview: AutoAdapt processes user data and task definitions, integrates best practices, generates an executable training pipeline, and refines an intermediate model to produce a deliverable model for users.
  • Figure 3: Hierarchical Configuration Determination Using a Domain Adaptation Configuration Graph (ACG)
  • Figure 4: Success Rate (SR), Normalized Performance Score (NPS), and Cumulative Score (CS) comparing AutoAdapt with baseline methods across datasets. Higher scores indicate better results. AutoAdapt outperforms SoTA baselines across datasets. Detailed results are in the appendix.
  • Figure 5: (a) Ablation and (b) hyperparameter study varying the number of multi-agent debate rounds and AutoRefine trials.
  • ...and 6 more figures

Theorems & Definitions (7)

  • Proposition 3.1
  • Lemma 4.1
  • Lemma 3.1: Confidence bounds (Srinivas et al., 2010)
  • proof
  • proof
  • Remark 3.2: On LLM guidance and constants
  • Remark 3.3: On iteration complexity