DRAMA: Domain Retrieval using Adaptive Module Allocation
Pranav Kasela, Marco Braga, Ophir Frieder, Nazli Goharian, Gabriella Pasi, Raffaele Perego
TL;DR
DRAMA addresses the high energy and storage costs of multi-domain neural information retrieval by combining domain-specific adapters with a query-driven gating mechanism. A frozen backbone is augmented with one lightweight adapter A_n per domain, each trained via knowledge distillation from a domain-specific teacher, while a separate gating function selects the most relevant adapter for each query at inference time. Empirical results in dense retrieval and re-ranking settings on academic search and community Q&A tasks show that DRAMA achieves retrieval performance on par with domain-specific baselines while reducing parameter counts and energy use by more than 75% compared to multi-domain ensembles; DRAMA also demonstrates robust zero-shot generalization to unseen domains. The approach offers a practical, scalable solution for energy-aware neural IR: new domains can be added as they emerge by training a new adapter, without retraining the full model.
Abstract
Neural models are increasingly used in Web-scale Information Retrieval (IR). However, these models impose substantial computational and energy requirements, which has drawn growing attention to their environmental cost and to the sustainability of large-scale deployments. While neural IR models deliver high retrieval effectiveness, their scalability is constrained in multi-domain scenarios, where training and maintaining domain-specific models is inefficient and achieving robust cross-domain generalisation within a unified model remains difficult. This paper introduces DRAMA (Domain Retrieval using Adaptive Module Allocation), an energy- and parameter-efficient framework designed to reduce the environmental footprint of neural retrieval. DRAMA integrates domain-specific adapter modules with a dynamic gating mechanism that selects the most relevant domain knowledge for each query. New domains can be added efficiently through lightweight adapter training, avoiding full model retraining. We evaluate DRAMA on multiple Web retrieval benchmarks covering different domains. Our extensive evaluation shows that DRAMA achieves effectiveness comparable to that of domain-specific models while using only a fraction of their parameters and computational resources. These findings show that energy-aware model design can significantly improve scalability and sustainability in neural IR.
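
To make the routing idea concrete, the following is a minimal sketch of a shared frozen backbone with one adapter per domain and a query-driven gate. It is an illustration under stated assumptions, not the DRAMA implementation: the names GatedEncoder, Adapter, add_domain and gate are hypothetical, the backbone is replaced by a fixed random linear map, the adapters use a generic bottleneck-plus-residual layout, and the gate is a simple nearest-centroid rule by cosine similarity. Adapter training via knowledge distillation is not shown; the adapter weights here are random placeholders.

```python
import math
import random


def matvec(matrix, vec):
    """Multiply a row-major matrix (sequence of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class Adapter:
    """Hypothetical bottleneck adapter: h + up(relu(down(h)))."""

    def __init__(self, dim, bottleneck, rng):
        # Untrained placeholder weights; distillation from a teacher is omitted.
        self.down = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(bottleneck)]
        self.up = [[rng.gauss(0, 0.1) for _ in range(bottleneck)] for _ in range(dim)]

    def __call__(self, hidden):
        squeezed = [max(0.0, x) for x in matvec(self.down, hidden)]
        # Residual connection: the adapter only adds a small correction.
        return [h + d for h, d in zip(hidden, matvec(self.up, squeezed))]


class GatedEncoder:
    """Shared frozen backbone, one adapter per domain, one gate (all assumed)."""

    def __init__(self, dim, bottleneck, seed=0):
        self.rng = random.Random(seed)
        self.dim, self.bottleneck = dim, bottleneck
        # Stand-in for the frozen backbone: an immutable random linear map.
        self.backbone = tuple(
            tuple(self.rng.gauss(0, 1) for _ in range(dim)) for _ in range(dim)
        )
        self.centroids = {}  # domain name -> centroid in query-feature space
        self.adapters = {}   # domain name -> Adapter

    def add_domain(self, name, centroid):
        """Register a new domain; the backbone is left untouched."""
        self.centroids[name] = list(centroid)
        self.adapters[name] = Adapter(self.dim, self.bottleneck, self.rng)

    def gate(self, query):
        """Assumed gating rule: pick the domain with the closest centroid."""
        return max(self.centroids, key=lambda name: cosine(query, self.centroids[name]))

    def encode(self, query):
        """Route the query to one adapter and return (domain, embedding)."""
        domain = self.gate(query)
        hidden = matvec(self.backbone, query)
        return domain, self.adapters[domain](hidden)


encoder = GatedEncoder(dim=4, bottleneck=2)
encoder.add_domain("academic", [1.0, 0.0, 0.0, 0.0])
encoder.add_domain("cqa", [0.0, 1.0, 0.0, 0.0])
domain, embedding = encoder.encode([0.9, 0.1, 0.0, 0.0])
```

In this toy setup only one adapter runs per query, and adding a domain only creates a new adapter and centroid, which mirrors the abstract's point that new domains do not require retraining the shared model. The domain names and centroid vectors are made up for the example.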
