MAGNET: Autonomous Expert Model Generation via Decentralized Autoresearch and BitNet Training

Yongwan Kim, Sungchul Park

Abstract

We present MAGNET (Model Autonomously Growing Network), a decentralized system for autonomous generation, training, and serving of domain-expert language models across commodity hardware. MAGNET integrates four components: (1) autoresearch, an autonomous ML research pipeline that automates dataset generation, hyperparameter exploration, evaluation, and error-driven iteration; (2) BitNet b1.58 ternary training, enabling CPU-native inference via bitnet.cpp without GPU hardware; (3) DiLoCo-based distributed merging for communication-efficient aggregation of domain specialists; and (4) on-chain contribution tracking on the HOOTi EVM chain. We validate autoresearch through three case studies: video safety classification (balanced accuracy improved from 0.9287 to 0.9851), cryptocurrency directional prediction (hit rate improved from 41% to 54.9%), and BitNet hyperparameter optimization (a 10-phase sweep reducing validation loss by 16.7%).

Paper Structure

This paper contains 51 sections, 6 equations, 4 figures, 10 tables, 1 algorithm.

Figures (4)

  • Figure 1: MAGNET four-pillar architecture. Arrows indicate data flow between subsystems.
  • Figure 2: Zevor autoresearch progression. 9 versions, ~5,000 configurations. Each version addresses specific failure modes identified in the previous version's error analysis.
  • Figure 3: StockClaw hit rate progression across autoresearch versions. Data scale: 504 → 11,904 → 77,771 samples.
  • Figure 4: Genkidama autoresearch v5: best validation loss per phase. Each value is the best within that phase's sweep, not the cumulative best-so-far across phases. Phase 7 (context length) shows the largest single improvement. Phase 10 validates the top-3 configs at 4× training steps.