Neural Models for Information Retrieval

Bhaskar Mitra, Nick Craswell

TL;DR

The paper surveys neural information retrieval methods, framing them against traditional learning-to-rank (L2R) models and highlighting their data-hungry nature. It systematically covers text representations (local vs. distributed), embedding-based matching, and deep architectures (Siamese, interaction-based, and lexical/semantic hybrids), illustrating them with concrete models such as DESM, CDSSM, Duet, and WMD. It discusses both long-document ranking and short-text matching, analyzes training data regimes (supervised, unsupervised, semi-supervised), and reviews neural toolkits and evaluation considerations. The work emphasizes balancing lexical precision with semantic coverage, and it outlines future directions including robustness, benchmarks, interpretability, and cross-pollination with NLP advances to drive practical IR impact.

Abstract

Neural ranking models for information retrieval (IR) use shallow or deep neural networks to rank search results in response to a query. Traditional learning-to-rank models employ machine learning techniques over hand-crafted IR features. By contrast, neural models learn representations of language from raw text that can bridge the gap between query and document vocabulary. Unlike classical IR models, these new machine-learning-based approaches are data-hungry, requiring large-scale training data before they can be deployed. This tutorial introduces basic concepts and intuitions behind neural IR models, and places them in the context of traditional retrieval models. We begin by introducing fundamental concepts of IR and different neural and non-neural approaches to learning vector representations of text. We then review shallow neural IR methods that employ pre-trained neural term embeddings without learning the IR task end-to-end. We introduce deep neural networks next, discussing popular deep architectures. Finally, we review the current DNN models for information retrieval. We conclude with a discussion of potential future directions for neural IR.

Paper Structure

This paper contains 27 sections, 35 equations, 24 figures, 3 tables.

Figures (24)

  • Figure 1: The percentage of neural IR papers at the ACM SIGIR conference (as determined by a manual inspection of the paper titles) shows a clear trend in the growing popularity of the field.
  • Figure 2: A log-log plot of frequency versus rank for query impressions and document clicks in the AOL query logs (Pass et al., 2006). The plots highlight that these quantities follow a Zipfian distribution.
  • Figure 3: Distribution of Wikipedia featured articles by document length (in bytes) as of June 30, 2014. Source: https://en.wikipedia.org/wiki/Wikipedia:Featured_articles/By_length.
  • Figure 4: Document ranking typically involves query and document representation steps, followed by a matching stage. Neural models can be useful either for generating good representations or for estimating relevance, or both.
  • Figure 5: Examples of different neural approaches to IR. In (a) and (b) the neural network is only used at the point of matching, whereas in (c) the focus is on learning effective representations of text using neural methods. Neural models can also be used to expand or augment the query before applying traditional IR techniques, as shown in (d).
  • ...and 19 more figures