AI Co-Scientist for Ranking: Discovering Novel Search Ranking Models alongside LLM-based AI Agents with Cloud Computing Access

Liwei Wu; Cho-Jui Hsieh

Abstract

Recent advances in AI agents for software engineering and scientific discovery have demonstrated remarkable capabilities, yet their application to developing novel ranking models in commercial search engines remains unexplored. In this paper, we present an AI Co-Scientist framework that automates the full search ranking research pipeline, from idea generation to code implementation and GPU training job scheduling, with an expert in the loop. Our approach uses single-LLM agents for routine tasks and multi-LLM consensus agents (GPT 5.2, Gemini Pro 3, and Claude Opus 4.5) for challenging phases such as results analysis and idea generation. To our knowledge, this is the first study in the ranking community to use an AI Co-Scientist framework for algorithmic research. We demonstrate that the framework discovered a novel technique for handling sequence features, with all model enhancements produced automatically, yielding substantial offline performance improvements. Our findings suggest that AI systems can discover ranking architectures comparable to those developed by human experts while significantly reducing routine research workloads.

Paper Structure (23 sections, 4 figures, 2 tables)

Introduction
Related Work
AI Agents for Scientific Discovery
Scalable Transformer Architecture for Ranking
Methodology
Search Ranking Problem Formulation
AI Co-Scientist for Ranking
Memory Module
Idea Generation Module
Code Implementation Module
Experimentation Module
Results Analysis Module
Experiments and Findings
Discovering Novel Search Ranking Model
Designing Training Dynamics Tricks
...and 8 more sections

Figures (4)

Figure 1: AI agent scientific discovery workflow
Figure 2: AI Co-Scientist performance gain versus H20 GPU hours. A 0.1% gain in this evaluation metric is statistically significant and, based on previous experience, usually translates into a 0.1% lift in conversion rate and millions of dollars in online experiments.
Figure 3: Comparison of Transformer designs
Figure 4: AI Co-Scientist planning more routine tasks in Sections \ref{sec:code} and \ref{sec:exp}
