First Past the Post: Evaluating Query Optimization in MongoDB

Dawei Tao, Enqi Liu, Sidath Randeni Kadupitige, Michael Cahill, Alan Fekete, Uwe Röhm

TL;DR

The paper concludes that MongoDB's "first past the post" (FPTP) query optimizer has a preference bias: it chooses index scans even in many cases where collection scans would run faster, which can lead MongoDB to choose a plan with more than twice the runtime of the optimal plan for the query.

Abstract

Query optimization is crucial for every database management system (DBMS) to enable fast execution of declarative queries. Most DBMS designs include cost-based query optimization. However, MongoDB implements a different approach to choose an execution plan that we call "first past the post" (FPTP) query optimization. FPTP does not estimate costs for each execution plan, but rather partially executes the alternative plans in a round-robin race and observes the work done by each relative to the number of records returned. In this paper, we analyze the effectiveness of MongoDB's FPTP query optimizer. We see whether the optimizer chooses the best execution plan among the alternatives and measure how the chosen plan compares to the optimal plan. We also show how to visualize the effectiveness and identify situations where the MongoDB 7.0.1 query optimizer chooses suboptimal query plans. Through experiments, we conclude that FPTP has a preference bias, choosing index scans even in many cases where collection scans would run faster. We identify the reasons for the preference bias, which can lead MongoDB to choose a plan with more than twice the runtime compared to the optimal plan for the query.
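
The round-robin race described in the abstract can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not MongoDB's implementation: the function names, the `result_target` and `work_budget` parameters, and the scoring rule (results divided by work units) are hypothetical stand-ins for "the work done by each relative to the number of records returned".

```python
from typing import Callable, Dict, Iterator, List


def collection_scan(values: List[int], pred: Callable[[int], bool]) -> Iterator[bool]:
    # Hypothetical model: one work unit per document examined;
    # the unit yields True only when the document matches.
    return (pred(v) for v in values)


def index_scan(values: List[int], pred: Callable[[int], bool]) -> Iterator[bool]:
    # Hypothetical model: the index visits only matching entries,
    # so every work unit produces a result.
    return (True for v in sorted(values) if pred(v))


def race(plans: Dict[str, Iterator[bool]], result_target: int, work_budget: int) -> str:
    """Advance each plan one work unit per round (round-robin).

    The race stops when any plan reaches result_target results, any plan
    runs out of input, or work_budget rounds have elapsed. The winner is
    the plan with the highest results-per-work ratio (an assumed metric).
    """
    results = {name: 0 for name in plans}
    works = {name: 0 for name in plans}
    finished = False
    for _ in range(work_budget):
        for name, plan in plans.items():
            step = next(plan, None)
            if step is None:  # plan exhausted its input
                finished = True
                continue
            works[name] += 1
            results[name] += int(step)
            if results[name] >= result_target:
                finished = True
        if finished:
            break
    return max(plans, key=lambda n: results[n] / max(works[n], 1))


# Toy data: half of the documents match the predicate.
values = [i % 10 for i in range(1000)]
matches = lambda v: v < 5

winner = race(
    {
        "collection_scan": collection_scan(values, matches),
        "index_scan": index_scan(values, matches),
    },
    result_target=10,
    work_budget=100,
)
print(winner)
```

In this toy model the race counts work units and results only, so the actual time cost of a work unit never enters the score. That is one way to picture how a productivity-based race could favor an index scan even when half the collection matches.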

Paper Structure (31 sections, 3 equations, 9 figures, 1 table, 2 algorithms)

Figures (9)

  • Figure 1: MongoDB query processing workflow.
  • Figure 2: Logical flow of MongoDB query optimizer.
  • Figure 3: Visual mapping of execution plans for different selectivities.
  • Figure 4: Effectiveness of MongoDB's query optimizer with conjunctive filter queries and both attributes indexed.
  • Figure 5: Effectiveness of MongoDB's query optimizer with conjunctive filter queries and only one attribute indexed.
  • ...and 4 more figures