Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation

Abdelrahman Abdallah, Bhawna Piryani, Jamshid Mozafari, Mohammed Ali, Adam Jatowt

TL;DR

Rankify addresses fragmentation in retrieval, re-ranking, and retrieval-augmented generation by delivering a unified, modular Python toolkit. It combines prebuilt corpora, diverse retrievers and re-rankers, and seamless RAG integration within a cohesive framework to enable end-to-end experimentation and benchmarking. The paper demonstrates competitive retrieval performance and strong generation results across QA tasks, validating Rankify's utility for information retrieval and knowledge-grounded generation. This framework promises easier reproducibility and extensibility for researchers building retrieval-augmented systems.

Abstract

Retrieval, re-ranking, and retrieval-augmented generation (RAG) are critical components of modern applications in information retrieval, question answering, and knowledge-based text generation. However, existing solutions are often fragmented, lacking a unified framework that easily integrates these essential processes. The absence of a standardized implementation, coupled with the complexity of retrieval and re-ranking workflows, makes it challenging for researchers to compare and evaluate different approaches in a consistent environment. While existing toolkits such as Rerankers and RankLLM provide general-purpose reranking pipelines, they often lack the flexibility required for fine-grained experimentation and benchmarking. In response to these challenges, we introduce Rankify, a powerful and modular open-source toolkit designed to unify retrieval, re-ranking, and RAG within a cohesive framework. Rankify supports a wide range of retrieval techniques, including dense and sparse retrievers, while incorporating state-of-the-art re-ranking models to enhance retrieval quality. Additionally, Rankify includes a collection of pre-retrieved datasets to facilitate benchmarking, available on Hugging Face (https://huggingface.co/datasets/abdoelsayed/reranking-datasets-light). To encourage adoption and ease of integration, we provide comprehensive documentation (http://rankify.readthedocs.io/), an open-source implementation on GitHub (https://github.com/DataScienceUIBK/rankify), and a PyPI package for easy installation (https://pypi.org/project/rankify/). As a unified and lightweight framework, Rankify allows researchers and practitioners to advance retrieval and re-ranking methodologies while ensuring consistency, scalability, and ease of use.

Paper Structure

This paper contains 16 sections, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Rankify logo.
  • Figure 2: An overview of the Rankify pipeline, demonstrating its dual capability for document retrieval. Users can interact with the system either by providing a query to retrieve relevant documents in real time or by leveraging pre-retrieved datasets already indexed by the framework. The process starts with Pre-Retrieved Datasets & Corpus Indexing, where documents are indexed using both dense (e.g., DPR) and sparse (e.g., BM25) retrievers across corpora such as Wikipedia and MS MARCO. Next, in the Retrieval & Re-Ranking stage, the system retrieves candidate documents using dense or sparse retrieval methods and re-ranks them with pointwise, pairwise, or listwise models powered by large language models (LLMs). Finally, the Generator stage applies prompt augmentation and uses models like Fusion-in-Decoder (FiD) to generate accurate and contextually informed answers.
  • Figure 3: The architecture of Rankify, showing the interplay between its core modules: Datasets, Retrievers, Re-Rankers, and RAG Evaluation. Each module operates independently while seamlessly integrating with others, enabling end-to-end retrieval and ranking workflows.
  • Figure 4: Exact Match (EM) for BM25, Contriever, and DPR retrievers across three datasets (NQ, TriviaQA, WebQ) using various language models (LLaMA V3/V3.1, Gemma 2B/9B, LLaMA 2 13B, and Mistral 7B).
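
The stages described in Figure 2 and the Exact Match (EM) metric reported in Figure 4 can be illustrated with a minimal, self-contained Python sketch. This is not Rankify's API: every function name below (`retrieve`, `rerank`, `build_prompt`, `exact_match`) is hypothetical, the sparse retriever is a small in-memory Okapi BM25, the re-ranker is a toy pointwise scorer standing in for a learned model, and the EM normalization follows the common open-domain QA convention (lowercasing, stripping punctuation and English articles), which is assumed here rather than taken from the paper.

```python
import math
import re
import string
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    query_terms = set(tokenize(query))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def retrieve(query, docs, top_k=2):
    """Stage 1 (sparse retrieval): return the top_k documents by BM25 score."""
    scores = bm25_scores(query, docs)
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in order[:top_k]]


def overlap_score(query, doc):
    """Toy pointwise scorer: fraction of query terms found in the document.
    A placeholder for a learned re-ranking model."""
    q_terms = set(tokenize(query))
    return len(q_terms & set(tokenize(doc))) / max(len(q_terms), 1)


def rerank(query, candidates, score_fn=overlap_score):
    """Stage 2 (pointwise re-ranking): score each candidate independently."""
    return sorted(candidates, key=lambda d: score_fn(query, d), reverse=True)


def build_prompt(query, contexts):
    """Stage 3 (prompt augmentation): prepend retrieved passages to the question."""
    passages = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(contexts))
    return f"Context:\n{passages}\n\nQuestion: {query}\nAnswer:"


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold_answers):
    """EM is True when the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(g) for g in gold_answers)


if __name__ == "__main__":
    corpus = [
        "Paris is the capital of France.",
        "Berlin is the capital of Germany.",
        "The Eiffel Tower is in Paris.",
    ]
    question = "What is the capital of France?"
    contexts = rerank(question, retrieve(question, corpus, top_k=2))
    print(build_prompt(question, contexts))
    print(exact_match("The Paris!", ["Paris"]))
```

In a real experiment the generator (for example an LLM or FiD) would consume the prompt and its output would be passed to `exact_match`; the sketch stops at the prompt so that it stays runnable without any model dependency.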