MongoDB Index Performance

As part of this (unintended) mini-series on MongoDB and indexing, I had written a little test to see if I could document performance gains through indexing. I used realworld data, albeit only 50,000 records, to query out a handful or documents (24 being the most).

Related posts:

Here is the code:

require 'test_helper'

class EncounterListingTest < Test::Unit::TestCase

  context "Indexing" do
    ProfileStats2 = Struct.new(:doctor_num, :count, :timing1, :timing2, :timing3)

    should "profile assorted doctor patient retrievals" do
      stats = []
      doctor_nums = ["602490", "603324", "212043", "602938"]
      doctor_nums.each_with_index do |doctor_num, i|
        MongoMapper.database.collection('encounters').drop_indexes
        show_indexes if i == 0
        timing1 = (measure_performance(doctor_num) + measure_performance(doctor_num) + measure_performance(doctor_num))/3

        MongoMapper.database.collection('encounters').drop_indexes
        add_index([[:private_physician, 1]])
        show_indexes if i == 0
        timing2 = (measure_performance(doctor_num) + measure_performance(doctor_num) + measure_performance(doctor_num))/3

        MongoMapper.database.collection('encounters').drop_indexes
        add_index([[:private_physician,1], [:notify_physician,1], [:visible_count,1]])
        show_indexes if i == 0
        timing3 = (measure_performance(doctor_num) + measure_performance(doctor_num) + measure_performance(doctor_num))/3

        n_count = Encounter.count(:private_physician => doctor_num, :notify_physician => 'Y', :visible_count.gt => 0)
        stats << ProfileStats2.new(doctor_num, n_count, timing1, timing2, timing3)

      end

      File.open("test/performance/index_stats_results-#{Time.now.strftime("%d-%m-%Y")}.csv", 'w') do |f|
        puts "%10s  %6s  %5s  %5s  %5s" % ["doctor", "count", "None", "Phys", "Phys/Ntfy/Vis"]
        f.puts "doctor, count, None, Phys, PhysNtfyVis"
        stats.each do |s|
          results = "%10d, %6d, %5.3f, %5.3f, %5.3f" % [s.doctor_num, s.count, s.timing1, s.timing2, s.timing3]
          puts results
          f.puts "%d, %d, %5.3f, %5.3f, %5.3f" % [s.doctor_num, s.count, s.timing1, s.timing2, s.timing3]
        end
      end

    end

  end

  private
  def show_stats(stats)
    stats.each do |s|
      puts "%6d, %5.3f, %s" % [s.count, s.timing, s.index_type]
    end
  end

  def measure_performance(doctor_num = "99602326")
    start = Time.now
    n_public = Encounter.where(:private_physician => doctor_num, :notify_physician => 'Y', :visible_count.gt => 0).all
    delta = Time.now - start
    delta
  end

  def show_indexes
    puts "%s INDEXES %s" % ["*"*12, "*"*12]
    Encounter.collection.index_information.collect { |index| puts "    #{index[0]}" }
  end

  def add_index(new_index)
    coll = MongoMapper.database.collection('encounters')
    coll.drop_index(new_index) if !coll.index_information.detect { |index| index[0] == new_index }.nil?
    Encounter.ensure_index(new_index)
  end

end

Results:

The effect of adding indexes on query performance

The results are shown in the accompanying graph. Except for the query that returned 24 documents, the general trend was that 3 indexes were better than one. And one was w-a-a-a-y better than none (of course, you already knew that). The odd outlier being for count = 6, in that a single index did not perform as well as it did in all the other tests.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.