As part of this (unintended) mini-series on MongoDB and indexing, I had written a little test to see if I could document performance gains through indexing. I used realworld data, albeit only 50,000 records, to query out a handful or documents (24 being the most).
Related posts:
Here is the code:
require 'test_helper' class EncounterListingTest < Test::Unit::TestCase context "Indexing" do ProfileStats2 = Struct.new(:doctor_num, :count, :timing1, :timing2, :timing3) should "profile assorted doctor patient retrievals" do stats = [] doctor_nums = ["602490", "603324", "212043", "602938"] doctor_nums.each_with_index do |doctor_num, i| MongoMapper.database.collection('encounters').drop_indexes show_indexes if i == 0 timing1 = (measure_performance(doctor_num) + measure_performance(doctor_num) + measure_performance(doctor_num))/3 MongoMapper.database.collection('encounters').drop_indexes add_index([[:private_physician, 1]]) show_indexes if i == 0 timing2 = (measure_performance(doctor_num) + measure_performance(doctor_num) + measure_performance(doctor_num))/3 MongoMapper.database.collection('encounters').drop_indexes add_index([[:private_physician,1], [:notify_physician,1], [:visible_count,1]]) show_indexes if i == 0 timing3 = (measure_performance(doctor_num) + measure_performance(doctor_num) + measure_performance(doctor_num))/3 n_count = Encounter.count(:private_physician => doctor_num, :notify_physician => 'Y', :visible_count.gt => 0) stats << ProfileStats2.new(doctor_num, n_count, timing1, timing2, timing3) end File.open("test/performance/index_stats_results-#{Time.now.strftime("%d-%m-%Y")}.csv", 'w') do |f| puts "%10s %6s %5s %5s %5s" % ["doctor", "count", "None", "Phys", "Phys/Ntfy/Vis"] f.puts "doctor, count, None, Phys, PhysNtfyVis" stats.each do |s| results = "%10d, %6d, %5.3f, %5.3f, %5.3f" % [s.doctor_num, s.count, s.timing1, s.timing2, s.timing3] puts results f.puts "%d, %d, %5.3f, %5.3f, %5.3f" % [s.doctor_num, s.count, s.timing1, s.timing2, s.timing3] end end end end private def show_stats(stats) stats.each do |s| puts "%6d, %5.3f, %s" % [s.count, s.timing, s.index_type] end end def measure_performance(doctor_num = "99602326") start = Time.now n_public = Encounter.where(:private_physician => doctor_num, :notify_physician => 'Y', :visible_count.gt => 0).all delta = Time.now - start delta end def show_indexes puts "%s INDEXES %s" % ["*"*12, "*"*12] Encounter.collection.index_information.collect { |index| puts " #{index[0]}" } end def add_index(new_index) coll = MongoMapper.database.collection('encounters') coll.drop_index(new_index) if !coll.index_information.detect { |index| index[0] == new_index }.nil? Encounter.ensure_index(new_index) end end
Results:
The results are shown in the accompanying graph. Except for the query that returned 24 documents, the general trend was that 3 indexes were better than one. And one was w-a-a-a-y better than none (of course, you already knew that). The odd outlier being for count = 6, in that a single index did not perform as well as it did in all the other tests.