Anatomy of a MongoDB Profiling Session

This particular application has been collecting data for months now, but hasn’t really had any users by design. At 33GB of data, pulling up a list of messages received was taking f-o-r-e-v-e-r!

So I decided to document how to go about and fix a running production system… Hope it helps.

Log into mongo console and turn on profiling (the ‘1’) to monitor slow queries. I entered >10 seconds, which really stinks (!). You should adjust it to suit your app’s definition of “slow”—maybe 500ms:

> db.setProfilingLevel(1,10000)
{ "was" : 0, "slowms" : 100, "ok" : 1 }

Next I went back to the webapp and executed the page request that exhibits the slow response …

Once the page returns, go in and look for any slow responses that the profiler logged:

> db.system.profile.find()
{
  "ts" : ISODate("2012-02-18T15:34:02.967Z"), 
  "op" : "command", 
  "command" : { "count" : "messages", 
    "query" : { "_type" : "HL7Message", "recv_app" : "CAREGIVER" }, 
      "fields" : null }, 
  "ntoreturn" : 1, 
  "responseLength" : 48, 
  "millis" : 119051, 
  "client" : "192.168.100.67", 
  "user" : "" 
}
{
  "ts" : ISODate("2012-02-18T15:35:51.704Z"),
  "op" : "query", 
  "query" : { "_type" : "HL7Message", "recv_app" : "CAREGIVER" },
  "ntoreturn" : 25,
  "ntoskip" : 791025,
  "nscanned" : 791051,
  "nreturned" : 25,
  "responseLength" : 49956,
  "millis" : 108720,
  "client" : "192.168.100.67",
  "user" : "" 
}

You can see there was a count query and a query for the data itself (we are using pagination). Sure enough, look here:

  • “ntoreturn” : 25,
  • “nscanned” : 791051,

 

Wow, that’s nasty… to return 25 records, we scanned 791,051! Gulp. Looks like a full table scan. Never a good thing (unless you have very small amounts of data).

Let’s see what sorts of indexes exist for the messages collection:

db.system.indexes.find( { ns: "production-alerts.messages" } );
{ "name" : "_id_", "key" : { "_id" : 1 }, "v" : 0 }
{ "v" : 1, "key" : { "created_at" : -1 }, "name" : "created_at_-1" }
{ "v" : 1, "key" : { "_type" : 1 }, "name" : "_type_1" }
{ "v" : 1, "key" : { "recv_app" : 1 }, "name" : "recv_app_1" }
{ "v" : 1, "key" : { "created_at" : -1, "recv_app" : 1 }, "name" : "created_at_-1_recv_app_1" }
{ "v" : 1, "key" : { "message_type" : 1 }, "name" : "message_type_1" }
{ "v" : 1, "key" : { "trigger_event" : 1 }, "name" : "trigger_event_1" }

Well, as expected, there is no index covering the multiple keys that we are searching on. So let’s add a multi-key index to match the query used by the controller!

db.messages.ensureIndex({_type:1, recv_app:1});

Now the app FLIES!! We dropped from 100+ seconds to 1.5 seconds (look at the “millis”) w00t!

db.messages.find({ _type : "HL7Message", recv_app : "CAREGIVER"}).explain();
> db.messages.find({ _type : "HL7Message", recv_app : "CAREGIVER"}).explain();
{
	"cursor" : "BtreeCursor _type_1_recv_app_1",
	"nscanned" : 791153,
	"nscannedObjects" : 791153,
	"n" : 791153,
	"millis" : 1546,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"isMultiKey" : false,
	"indexOnly" : false,
	"indexBounds" : {
		"_type" : [
			[
				"HL7Message",
				"HL7Message"
			]
		],
		"recv_app" : [
			[
				"CAREGIVER",
				"CAREGIVER"
			]
		]
	}
}

To prevent this sort of thing, you can consider adding indexes when you create new queries. But the best way to do this is to be empirical and know whether you should add the index through some testing. I’ll leave that for another day!

Exporting MongoMapper Objects to JSON

I wanted to export a MongoMapper document and it’s related documents as JSON — with embedded arrays for the collections. Invoking to_json did not seem to work perfectly, so I set about to discover what was going on.

Conclusion

If you use Embedded Documents for every associated document, the to_json method will work perfectly.

If you have normal Documents, you must override the as_json method to export the object “tree.”

Details

Here is a walk through of exporting mongo documents as JSON.

I created a simple Author class. And will use a simple test to show how to_json works:

describe "Author 1" do
  before :all do
    class Author
      include MongoMapper::Document
      key :name
      key :pen_name
    end
  end

  it "should output JSON" do
    p1 = Author.create(:name => "Ben Franklin", :pen_name => "Poor Richard")
    json = p1.to_json
    puts json
    json.should include "name"
    json.should include "Poor Richard"
  end
end

And we get what we expect:

{
  "books":[],
  "id":"4f316a4c8951a2eefe000001", 
  "name":"Ben Franklin",
  "pen_name":"Poor Richard"
}

Now let’s add a new Book document of the Embedded variety. Here we will assert that the Author JSON should include a list of Books:

describe "Author 2" do
  before :all do
    class Book
      include MongoMapper::EmbeddedDocument
      key :title
    end
    class Author
      include MongoMapper::Document
      many :books
    end
  end
  it "authors have books" do
    p1 = Author.create(:name => "Ben Franklin", :pen_name => "Poor Richard",
                       :books => [Book.new(:title => "Poor Richard's Almanac")])
    json = p1.to_json
    puts json
    json.should include "Poor Richard"
    json.should include "Almanac"
  end
end

And, sure enough, it works.

{
  "books":[{
    "id":"4f316a4c8951a2eefe000003", 
    "title":"Poor Richard's Almanac"}],
  "id":"4f316a4c8951a2eefe000004", 
  "name":"Ben Franklin", 
  "pen_name":"Poor Richard"
}

Let’s add a list of Interests to the Author class, this time as a normal document type (not embedded). Now we can test that the Author JSON has the expected Interest:

describe "Author 3" do
  before :all do
    class Interest
      include MongoMapper::Document
      key :title, String
    end
    class Book
      include MongoMapper::EmbeddedDocument
      key :title
    end
    class Author
      include MongoMapper::Document
      many :books
      many :interests
    end
  end
  it "should have interests" do
    p1 = Author.create(:name => "Ben Franklin", :pen_name => "Poor Richard",
                       :books => [Book.new(:title => "Poor Richard's Almanac")],
                       :interests => [Interest.create(:title => "Movies")])
    json = p1.to_json
    puts json
    json.should include "Poor Richard"
    json.should include "Almanac"
    json.should include "Movies" # Fails
  end
end

Whoa! No joy! Seems that the association to non-Embedded documents does not get automatically exported to the JSON.

{
  "books":[{
    "id":"4f316a4c8951a2eefe000006",
    "title":"Poor Richard's Almanac"}],
  "id":"4f316a4c8951a2eefe000007", 
  "name":"Ben Franklin", 
  "pen_name":"Poor Richard"
}

And we get a failed spec 🙁

# expected "{"books":[{"id":"4f316a4c8951a2eefe000006","title":"Poor Richard's Almanac"}],"id":"4f316a4c8951a2eefe000007","name":"Ben Franklin","pen_name":"Poor Richard"}" to include "Movies"

Turns out we can add a custom as_json implementation to the class that you want to export as JSON. The as_json is responsible for indicating which fields and collections should be included in the json.

describe "Author 4" do
  before :all do
    class Interest
      include MongoMapper::Document
      key :title, String

    end
    class Book
      include MongoMapper::EmbeddedDocument
      key :title
    end
    class Author
      include MongoMapper::Document
      many :books
      many :interests

      def as_json options={}
        {
            :name => self.name,
            :pen_name => self.pen_name,
            :books => self.books,
            :interests => self.interests
        }
      end
    end
  end
  it "should have interests in json" do
    p1 = Author.create(:name => "Ben Franklin", :pen_name => "Poor Richard",
                       :books => [Book.new(:title => "Poor Richard's Almanac")],
                       :interests => [Interest.create(:title => "Movies")])
    json = p1.to_json
    puts json
    json.should include "Poor Richard"
    json.should include "Almanac"
    json.should include "Movies"
  end
end

And we have Books and Interests. Success!

{
  "name":"Ben Franklin", 
  "pen_name":"Poor Richard",
  "books":[{
    "id":"4f31782a8951a2f267000002", 
    "title":"Poor Richard's Almanac"}], 
  "interests":[{
    "author_id":"4f31782a8951a2f267000003", 
    "id":"4f31782a8951a2f267000001",
    "title":"Movies"}]
}