How does the order of compound indexes matter in MongoDB performance-wise?

Redsandro,

You must consider Index Cardinality and Selectivity.


1. Index Cardinality

The index cardinality refers to how many possible values there are for a field. The field sex only has two possible values. It has a very low cardinality. Other fields such as names, usernames, phone numbers, emails, etc. will have a more unique value for every document in the collection, which is considered high cardinality.

  • Greater Cardinality

    The greater the cardinality of a field the more helpful an index will be, because indexes narrow the search space, making it a much smaller set.

    If you have an index on sex and you are looking for men named John. You would only narrow down the result space by approximately %50 if you indexed by sex first. Conversely if you indexed by name, you would immediately narrow down the result set to a minute fraction of users named John, then you would refer to those documents to check the gender.

  • Rule of Thumb

    Try to create indexes on high-cardinality keys or put high-cardinality keys first in the compound index. You can read more about it in the section on compound indexes in the book:

    MongoDB The Definitive Guide


2. Selectivity

Also, you want to use indexes selectively and write queries that limit the number of possible documents with the indexed field. To keep it simple, consider the following collection. If your index is {name:1}, If you run the query { name: "John", sex: "male"}. You will have to scan 1 document. Because you allowed MongoDB to be selective.

{_id:ObjectId(),name:"John",sex:"male"}
{_id:ObjectId(),name:"Rich",sex:"male"}
{_id:ObjectId(),name:"Mose",sex:"male"}
{_id:ObjectId(),name:"Sami",sex:"male"}
{_id:ObjectId(),name:"Cari",sex:"female"}
{_id:ObjectId(),name:"Mary",sex:"female"}

Consider the following collection. If your index is {sex:1}, If you run the query {sex: "male", name: "John"}. You will have to scan 4 documents.

{_id:ObjectId(),name:"John",sex:"male"}
{_id:ObjectId(),name:"Rich",sex:"male"}
{_id:ObjectId(),name:"Mose",sex:"male"}
{_id:ObjectId(),name:"Sami",sex:"male"}
{_id:ObjectId(),name:"Cari",sex:"female"}
{_id:ObjectId(),name:"Mary",sex:"female"}

Imagine the possible differences on a larger data set.


A little explanation of Compound Indexes

It’s easy to make the wrong assumption about Compound Indexes. According to MongoDB docs on Compound Indexes.

MongoDB supports compound indexes, where a single index structure
holds references to multiple fields within a collection’s documents.
The following diagram illustrates an example of a compound index on
two fields:

enter image description here

When you create a compound index, 1 Index will hold multiple fields. So if we index a collection by {"sex" : 1, "name" : 1}, the index will look roughly like:

["male","Rick"] -> 0x0c965148
["male","John"] -> 0x0c965149
["male","Sean"] -> 0x0cdf7859
["male","Bro"] ->> 0x0cdf7859
...
["female","Kate"] -> 0x0c965134
["female","Katy"] -> 0x0c965126
["female","Naji"] -> 0x0c965183
["female","Joan"] -> 0x0c965191
["female","Sara"] -> 0x0c965103

If we index a collection by {"name" : 1, "sex" : 1}, the index will look roughly like:

["John","male"] -> 0x0c965148
["John","female"] -> 0x0c965149
["John","male"] -> 0x0cdf7859
["Rick","male"] -> 0x0cdf7859
...
["Kate","female"] -> 0x0c965134
["Katy","female"] -> 0x0c965126
["Naji","female"] -> 0x0c965183
["Joan","female"] -> 0x0c965191
["Sara","female"] -> 0x0c965103

Having {name:1} as the Prefix will serve you much better in using compound indexes. There is much more that can be read on the topic, I hope this can offer some clarity.

Leave a Comment