Scaling the Scaling Logic: Agentic Meta-Synthesis of Logic Reasoning

Bowen Liu; Zhi Wu; Runquan Xie; Zhanhui Kang; Jia Li

TL;DR

SSLogic is an agentic meta-synthesis framework that scales logic-reasoning data at the task-family level. It iteratively synthesizes and repairs executable Generator-Validator program pairs in a closed Generate-Validate-Repair loop, enabling continuous family evolution with controllable difficulty.

Abstract

Scaling verifiable training signals remains a key bottleneck for Reinforcement Learning from Verifiable Rewards (RLVR). Logical reasoning is a natural substrate: constraints are formal and answers are programmatically checkable. However, prior synthesis pipelines either depend on expert-written code or operate within fixed templates/skeletons, which limits growth largely to instance-level perturbations. We propose SSLogic, an agentic meta-synthesis framework that scales at the task-family level by iteratively synthesizing and repairing executable Generator-Validator program pairs in a closed Generate-Validate-Repair loop, enabling continuous family evolution with controllable difficulty. To ensure reliability, we introduce a Multi-Gate Validation Protocol that combines multi-strategy consistency checks with Adversarial Blind Review, in which independent agents must solve instances by writing and executing code, so that ambiguous or ill-posed tasks are filtered out. Starting from 400 seed families, two evolution rounds expand the pool to 953 families and grow the set of verifiable instances from 5,718 to 21,389. Training on SSLogic-evolved data yields consistent gains over the seed baseline at matched training steps, improving SynLogic by +5.2, BBEH by +1.4, AIME25 by +3.0, and Brumo25 by +3.7.
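To make the Generate-Validate-Repair idea concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a toy task family (recover x from two remainders), and every name in it (`Instance`, `make_generator`, `validate`, `blind_solve`, `passes_gates`, `generate_validate_repair`) is hypothetical. The agent's repair step is stood in for by an ordered list of candidate generators, and blind review is approximated by an exhaustive solver that never sees the reference answer.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    a: int       # first modulus
    b: int       # second modulus
    ra: int      # required remainder modulo a
    rb: int      # required remainder modulo b
    upper: int   # x must lie in [0, upper)
    answer: int  # reference answer planted by the generator


def make_generator(bound_factor):
    """Build a toy generator. bound_factor > 1 plants a flaw: several x satisfy the task."""
    def generate(difficulty, rng):
        primes = [3, 5, 7, 11, 13, 17, 19, 23]
        pool = primes[: max(2, min(len(primes), difficulty))]  # larger difficulty, larger moduli
        a, b = rng.sample(pool, 2)
        upper = a * b * bound_factor
        x = rng.randrange(upper)
        return Instance(a, b, x % a, x % b, upper, x)
    return generate


def validate(inst, candidate):
    """Validator: programmatically check a proposed answer against the constraints."""
    in_range = 0 <= candidate < inst.upper
    return in_range and candidate % inst.a == inst.ra and candidate % inst.b == inst.rb


def blind_solve(inst):
    """Stand-in for blind review: enumerate all solutions without reading inst.answer."""
    return [x for x in range(inst.upper) if x % inst.a == inst.ra and x % inst.b == inst.rb]


def passes_gates(generate, n_samples, difficulty, seed):
    """Accept a generator only if every sample is valid and has a unique solution."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        inst = generate(difficulty, rng)
        if not validate(inst, inst.answer):
            return False  # generator and validator disagree
        if blind_solve(inst) != [inst.answer]:
            return False  # ambiguous or ill-posed instance
    return True


def generate_validate_repair(candidates, n_samples=20, difficulty=5, seed=0):
    """Try successive candidates (stand-ins for repair attempts); return the accepted index."""
    for attempt, generate in enumerate(candidates):
        if passes_gates(generate, n_samples, difficulty, seed):
            return attempt
    return None


flawed = make_generator(bound_factor=2)    # two valid x per instance, so it is rejected
repaired = make_generator(bound_factor=1)  # exactly one valid x per instance
accepted = generate_validate_repair([flawed, repaired])
```

In this sketch the flawed generator is rejected because each of its instances admits two solutions, while the repaired one is accepted. The design point it illustrates is that acceptance is decided at the level of the generator program (the task family), not the individual instance.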

Paper Structure (80 sections, 4 equations, 27 figures, 17 tables)

Introduction
Preliminaries
Problem Setting
Synthesis Formalism
Scaling the Scaling Logic
Agent Framework
Phase I: Context-Aware Specification Synthesis
Phase II: Multi-Gate Validation Protocol
Phase III: Feedback-Driven Finalization
Impact of SSLogic on Reinforcement Learning
Setup
Synthetic Data Consistency
Main Results
Isolating the Effect of Dataset Size
Cross-Domain Generalization
...and 65 more sections

Figures (27)

Figure 1: Paradigm Shifts in Logic Data Generation: From Manual Curation to Agentic Meta-Synthesis. Left: Traditional Manual Curation focuses on Task/QA pairs, where quality control and feedback rely heavily on humans. Middle: Code Synthesis introduces executable Generators/Validators, achieving partial automation but still requiring manual oversight. Right: Our Agentic Meta-Synthesis enables fully automatic, end-to-end data production. Agents iteratively generate and validate task families (Generator + Validator) and instances, realizing the path from Manual $\rightarrow$ Semi-Automatic $\rightarrow$ Full-Automatic construction (Scaling the Scaling Logic).
Figure 2: Overview of the Multi-Gate Agentic Meta-Synthesis Framework. The Main Agent operates in a three-phase closed loop: Task Synthesis (Phase I); screening via Quality Agent Gates and Consensus-based Validation, including Blind Review (Phase II); and Abductive Debugging of failures with Experience Updates, followed by delivery of Generators/Validators, templates, and data (Phase III).
Figure 3: Evolution of reflection-like token frequency across different training settings.
Figure 4: Average response length dynamics during training.
Figure 5: Difficulty controllability. Pass@1 accuracy for Seed and Evolved tasks at $D \in \{5, 7, 10\}$ on DeepSeek-V3.1-Terminus and Doubao-1.6-Thinking. The curves decrease monotonically and closely track each other; error bars are shown.
...and 22 more figures
