SearchGym: A Modular Infrastructure for Cross-Platform Benchmarking and Hybrid Search Orchestration

Jerome Tze-Hou Hsu

TL;DR

SearchGym, a modular infrastructure for cross-platform benchmarking and hybrid search orchestration, reveals a design tension between generalizability and optimizability. It suggests that engineering optimization may serve as a tool for uncovering the causal mechanisms of information retrieval across heterogeneous domains.

Abstract

The rapid growth of Retrieval-Augmented Generation (RAG) has created a proliferation of toolkits, yet a fundamental gap remains between experimental prototypes and robust, production-ready systems. We present SearchGym, a modular infrastructure designed for cross-platform benchmarking and hybrid search orchestration. Unlike existing model-centric frameworks, SearchGym decouples data representation, embedding strategies, and retrieval logic into stateful abstractions: Dataset, VectorSet, and App. This separation enables a Compositional Config Algebra, allowing designers to synthesize entire systems from hierarchical configurations while ensuring perfect reproducibility. Moreover, we analyze "Top-$k$ Cognizance" in hybrid retrieval pipelines, demonstrating that the optimal ordering of semantic ranking and structured filtering depends strongly on filter strength. Evaluated on the LitSearch expert-annotated benchmark, SearchGym achieves a 70% Top-100 retrieval rate. SearchGym reveals a design tension between generalizability and optimizability, and suggests that engineering optimization may serve as a tool for uncovering the causal mechanisms of information retrieval across heterogeneous domains. An open-source implementation of SearchGym is available at: https://github.com/JeromeTH/search-gym
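
The claim that the best ordering of semantic ranking and structured filtering depends on filter strength can be illustrated with a toy sketch. This is not SearchGym's API: the `Doc` tuple, the synthetic corpus, and the function names `rank_then_filter` and `filter_then_rank` are all hypothetical, chosen only to show why a pipeline has to be aware of its top-$k$ cutoff.

```python
from typing import Callable, List, Tuple

# Hypothetical record: (doc_id, similarity score, publication year).
Doc = Tuple[int, float, int]


def make_corpus(n: int) -> List[Doc]:
    """Build a deterministic synthetic corpus (illustrative data only)."""
    return [(i, ((i * 37) % 101) / 101, 2000 + i % 25) for i in range(n)]


def rank_then_filter(docs: List[Doc], keep: Callable[[Doc], bool], k: int) -> List[Doc]:
    """Take the top-k by similarity first, then apply the structured filter.

    Cheap when the filter is weak, but a strong filter can discard most
    (or all) of the k candidates, so fewer than k results may come back.
    """
    top_k = sorted(docs, key=lambda d: d[1], reverse=True)[:k]
    return [d for d in top_k if keep(d)]


def filter_then_rank(docs: List[Doc], keep: Callable[[Doc], bool], k: int) -> List[Doc]:
    """Apply the structured filter first, then take the top-k survivors.

    Returns k results whenever at least k documents pass the filter,
    at the cost of evaluating the filter over the whole corpus.
    """
    survivors = [d for d in docs if keep(d)]
    return sorted(survivors, key=lambda d: d[1], reverse=True)[:k]


if __name__ == "__main__":
    corpus = make_corpus(1000)
    strong = lambda d: d[2] == 2024   # passes 1 in 25 documents
    weak = lambda d: d[2] >= 2000     # passes every document
    for name, keep in [("strong", strong), ("weak", weak)]:
        a = rank_then_filter(corpus, keep, k=10)
        b = filter_then_rank(corpus, keep, k=10)
        print(f"{name}: rank-then-filter={len(a)} results, filter-then-rank={len(b)} results")
```

With a weak filter the two orderings agree, so ranking first is the cheaper choice. With a strong filter, ranking first can leave the result list short of $k$, which is the situation in which filtering first becomes preferable.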

Paper Structure (18 sections, 5 figures, 1 table)

This paper contains 18 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: End-to-end pipeline visualization of our hybrid search system.
  • Figure 2: Data schema with separation of static build and dynamic loading.
  • Figure 3: Key stored data states of SearchGym.
  • Figure 4: Overview of dynamic system construction from static type system.
  • Figure 5: System evaluation result on LitSearch.