Domain Gating Ensemble Networks for AI-Generated Text Detection

Arihant Tripathi, Liam Dugan, Charis Gao, Maggie Huan, Emma Jin, Peter Zhang, David Zhang, Julia Zhao, Chris Callison-Burch

TL;DR

DoGEN (Domain Gating Ensemble Networks) detects AI-generated text across diverse domains by using a domain router to gate a set of domain-specific detectors. It ensembles the top-$k$ experts, weighted by the router's domain probabilities, which gives robust performance in both in-domain and out-of-domain settings while keeping inference efficient. On the MAGE and RAID benchmarks, DoGEN achieves state-of-the-art in-domain results and strong out-of-domain generalization, outperforming single models and many ensembles while activating substantially fewer parameters. The approach is modular and extensible: new experts can be added for structurally novel inputs, and the authors provide code and trained models for community use.

Abstract

As state-of-the-art language models continue to improve, the need for robust detection of machine-generated text becomes increasingly critical. However, current state-of-the-art machine text detectors struggle to adapt to new unseen domains and generative models. In this paper we present DoGEN (Domain Gating Ensemble Networks), a technique that allows detectors to adapt to unseen domains by ensembling a set of domain expert detector models using weights from a domain classifier. We test DoGEN on a wide variety of domains from leading benchmarks and find that it achieves state-of-the-art performance on in-domain detection while outperforming models twice its size on out-of-domain detection. We release our code and trained models to assist in future research in domain-adaptive AI detection.

Paper Structure

This paper contains 41 sections, 4 equations, 3 figures, and 16 tables.

Figures (3)

  • Figure 1: The Domain Gating network splits the input into a probability distribution over $N$ experts. The output is a weighted sum of the outputs of the top $k$ experts.
  • Figure 2: A graphical representation of the full Domain Gating Ensemble Network (DoGEN). Texts are given to each of $N$ domain-specific experts. Scores from each expert are ensembled using weights determined by the router network. This network is trained for domain classification and the top-$k$ experts are chosen for output.
  • Figure 3: Average gate weight versus expert AUROC on RAID. Pearson $\rho = 0.64,\;p < 0.03$.
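
The gating step described in the Figure 1 and Figure 2 captions can be sketched in a few lines of Python. This is a minimal illustration, not the authors' released implementation: the function names (`softmax`, `top_k_gated_score`), the use of a softmax over router logits, and the optional renormalization of the selected weights are all assumptions made for the example. The captions only state that the router yields a probability distribution over $N$ experts and that the output is a weighted sum over the top $k$.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Turn router logits into a probability distribution over experts."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_gated_score(
    router_logits: Sequence[float],
    expert_scores: Sequence[float],
    k: int,
    renormalize: bool = True,  # assumption: not specified in the captions
) -> float:
    """Weighted sum of the scores of the k experts the router favors most.

    router_logits: one hypothetical logit per domain expert.
    expert_scores: each expert's "machine-generated" score for the same text.
    """
    if len(router_logits) != len(expert_scores):
        raise ValueError("need exactly one router logit per expert")
    if not 1 <= k <= len(router_logits):
        raise ValueError("k must be between 1 and the number of experts")

    probs = softmax(router_logits)
    # Indices of the k experts with the highest domain probability.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    weights = [probs[i] for i in top]
    if renormalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    return sum(w * expert_scores[i] for w, i in zip(weights, top))
```

With $k = 1$ the sketch reduces to hard routing (only the most probable domain's expert is used); with $k = N$ and renormalization it becomes a plain probability-weighted ensemble of all experts.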