As part of this (unintended) mini-series on MongoDB and indexing, I had written a little test to see if I could document performance gains through indexing. I used realworld data, albeit only 50,000 records, to query out a handful or documents (24 being the most).
Related posts:
Here is the code:
require 'test_helper'
class EncounterListingTest < Test::Unit::TestCase
context "Indexing" do
ProfileStats2 = Struct.new(:doctor_num, :count, :timing1, :timing2, :timing3)
should "profile assorted doctor patient retrievals" do
stats = []
doctor_nums = ["602490", "603324", "212043", "602938"]
doctor_nums.each_with_index do |doctor_num, i|
MongoMapper.database.collection('encounters').drop_indexes
show_indexes if i == 0
timing1 = (measure_performance(doctor_num) + measure_performance(doctor_num) + measure_performance(doctor_num))/3
MongoMapper.database.collection('encounters').drop_indexes
add_index([[:private_physician, 1]])
show_indexes if i == 0
timing2 = (measure_performance(doctor_num) + measure_performance(doctor_num) + measure_performance(doctor_num))/3
MongoMapper.database.collection('encounters').drop_indexes
add_index([[:private_physician,1], [:notify_physician,1], [:visible_count,1]])
show_indexes if i == 0
timing3 = (measure_performance(doctor_num) + measure_performance(doctor_num) + measure_performance(doctor_num))/3
n_count = Encounter.count(:private_physician => doctor_num, :notify_physician => 'Y', :visible_count.gt => 0)
stats << ProfileStats2.new(doctor_num, n_count, timing1, timing2, timing3)
end
File.open("test/performance/index_stats_results-#{Time.now.strftime("%d-%m-%Y")}.csv", 'w') do |f|
puts "%10s %6s %5s %5s %5s" % ["doctor", "count", "None", "Phys", "Phys/Ntfy/Vis"]
f.puts "doctor, count, None, Phys, PhysNtfyVis"
stats.each do |s|
results = "%10d, %6d, %5.3f, %5.3f, %5.3f" % [s.doctor_num, s.count, s.timing1, s.timing2, s.timing3]
puts results
f.puts "%d, %d, %5.3f, %5.3f, %5.3f" % [s.doctor_num, s.count, s.timing1, s.timing2, s.timing3]
end
end
end
end
private
def show_stats(stats)
stats.each do |s|
puts "%6d, %5.3f, %s" % [s.count, s.timing, s.index_type]
end
end
def measure_performance(doctor_num = "99602326")
start = Time.now
n_public = Encounter.where(:private_physician => doctor_num, :notify_physician => 'Y', :visible_count.gt => 0).all
delta = Time.now - start
delta
end
def show_indexes
puts "%s INDEXES %s" % ["*"*12, "*"*12]
Encounter.collection.index_information.collect { |index| puts " #{index[0]}" }
end
def add_index(new_index)
coll = MongoMapper.database.collection('encounters')
coll.drop_index(new_index) if !coll.index_information.detect { |index| index[0] == new_index }.nil?
Encounter.ensure_index(new_index)
end
end
Results:
The results are shown in the accompanying graph. Except for the query that returned 24 documents, the general trend was that 3 indexes were better than one. And one was w-a-a-a-y better than none (of course, you already knew that). The odd outlier being for count = 6, in that a single index did not perform as well as it did in all the other tests.

