MongoDB Performance 101: When & How You Should Use Partial Indexes
This article was inspired by my attempt at the MongoDB Performance course exam. I thought I understood the topic quite well, but my score turned out to be pretty bad. That pushed me to write this article, which lets me study the topic in depth and relate it to a practical, real-world use case.
What is a Partial Index?
A partial index is an index that covers only a subset of the documents in a collection. In layman's terms, imagine you have a million records but only want to index part of them, e.g., 500K.
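To make the idea concrete, here is a minimal sketch in plain JavaScript (no database involved) of which documents a partial index would cover. The sample hotels and the `matchesFilter` helper are made up for illustration; the rule `star >= 3` mirrors the filter used later in this article.

```javascript
// Hypothetical sample documents; field names follow the hotel collection used later.
const hotels = [
  { hotel_name: "A", country: "Thailand", star: 5 },
  { hotel_name: "B", country: "Thailand", star: 2 },
  { hotel_name: "C", country: "Indonesia", star: 3 },
  { hotel_name: "D", country: "Indonesia", star: 1 },
];

// Stand-in for a partial filter such as { star: { $gte: 3 } }.
function matchesFilter(doc) {
  return doc.star >= 3;
}

// Only matching documents would get an entry in the partial index.
const indexed = hotels.filter(matchesFilter);
console.log(`${indexed.length} of ${hotels.length} documents indexed`); // 2 of 4 documents indexed
```

The two low-star hotels still live in the collection; they simply have no entry in the index.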
Why a Partial Index?
At this point, you might be wondering: why create a partial index instead of indexing everything?
You can index everything, but the cost can be high: every index takes up storage and has to be maintained on each write, which can slow down your application. You definitely don't want that, especially when your data is growing fast.
Improving your application's speed gives your users a better experience.
Implementing a partial index brings two benefits:
- Reduced storage for the index
- Smaller indexes mean less index maintenance on writes, which leads to better write performance.
When Should I Use a Partial Index?
Now that we know what a partial index is and why it matters, the next question is: when should I use one?
These are my current rules of thumb for implementing a partial index. They may change over time.
- The data can be filtered by category, e.g., 5-star hotels, 4-star hotels, etc.
- Only a subset of the data is queried frequently. For example, users tend to search for five-star hotels rather than one-star hotels.
Now we understand the kinds of situations in which a partial index makes sense.
Let's get our hands dirty with a real use case: the well-known hotel booking site, Agoda.
Use Case: Hotel Booking Site, Agoda
Before I get started, there are some disclaimers that I want to make:
- I do not work at Agoda.
- What I suggest and write here is just my personal opinion and interpretation, which means I could be wrong.
Introduction & Scenario
Recently, I was planning a trip to Bangkok, so I tried to book a room via Agoda.
After going through the booking process, I thought it would be a very good and practical use case for a partial index. Making a booking requires several fields. Let's see which ones.

From the screenshot above, we can search for hotels by city name or country name and filter by star rating:
- Destination or Hotel Name
- Star Rating
Now that we understand the flow, let's move on to the next step, where we create the database and mock the dataset.
Data Preparation
I wrote a step-by-step guide previously on how to prepare the database model and generate dummy data for performance optimization.
What do I need?
- I need a list of records that contain hotel information: hotel_name, city, country and star. These are the minimum fields our collection needs in order to create a partial index. I will also add two more fields, price and average_review, so the data more closely resembles real production data.
- I also need a data generation tool to load a dataset of millions of documents into our collection, since a large dataset makes differences in query time much easier to see. After some googling, I discovered a tool on GitHub, mgodatagen, which generates random data in MongoDB with very minimal configuration.
The detailed steps are written up in that article (it's free!). Feel free to check it out.
Problem identification
Before we optimize anything, we have to know what kind of performance problem we are dealing with. This section describes it.
What motivates you to implement a partial index? The answer: you have too many indexes and they are hurting your performance. Let's set up the following prerequisites and run an experiment.
- Create a dataset of five million documents
- Create single-field indexes on the following fields: hotel_name, city, country, and star.
The size of the indexes grows with the number of documents.
This is why we're generating so much data. Besides, users tend to search for hotels by city, country, star rating and hotel name, so it is good practice to index all of these fields to optimize query performance.
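The four single-field indexes can be created from the shell as well. Below is a small sketch; the helper name `createSingleFieldIndexes` is my own, and it assumes the collection object (for example `db.hotel5m`, the collection name that appears in the profiler query later on) is passed in as `collection`.

```javascript
// Fields that get one single-field index each.
const indexedFields = ["hotel_name", "city", "country", "star"];

// Calls createIndex({ <field>: 1 }) once per field on the given collection
// and returns whatever createIndex returned for each call.
function createSingleFieldIndexes(collection) {
  return indexedFields.map((field) => collection.createIndex({ [field]: 1 }));
}
```

In mongosh this would be invoked as `createSingleFieldIndexes(db.hotel5m)`. On five million documents each index build can take a while.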
Let's have a look at the indexes we created. According to the screenshot below, they take up almost 144.1 MB.

Let's insert a hotel into the database and check the write speed.
Before you can check the write speed, you have to configure the database profiling level. Run the following command in the mongo shell.
db.setProfilingLevel(2)
Setting the profiling level to 2 means the profiler collects data for all operations. See the MongoDB Database Profiler manual (listed in the references) for more details.
Now, let’s try to insert a hotel into the DB.

To check the write performance, run the following command in the mongo shell, which filters the profiler output by our collection's namespace and the insert operation.
db.system.profile.find({ ns: "property.hotel5m", op: "insert"}).pretty()

From the screenshot above, inserting a new record took 16 ms. The indexes affect not only inserts but also the execution time of updates. If you frequently update the hotel collection, this is something you might want to optimize.
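A single insert is a noisy measurement, so it helps to average several profiler entries. The sketch below is one way to do that; `averageMillis` is a hypothetical helper and the sample entries are made up, shaped like `system.profile` documents with their `ns`, `op` and `millis` fields.

```javascript
// Averages the `millis` field over profiler entries for one namespace and operation.
// Returns null when nothing matches.
function averageMillis(entries, ns, op) {
  const matching = entries.filter((e) => e.ns === ns && e.op === op);
  if (matching.length === 0) return null;
  const total = matching.reduce((sum, e) => sum + e.millis, 0);
  return total / matching.length;
}

// Made-up sample entries, for illustration only.
const sample = [
  { ns: "property.hotel5m", op: "insert", millis: 16 },
  { ns: "property.hotel5m", op: "insert", millis: 12 },
  { ns: "property.hotel5m", op: "query", millis: 55 },
];
console.log(averageMillis(sample, "property.hotel5m", "insert")); // 14
```

In mongosh, the real entries could be passed in with `averageMillis(db.system.profile.find().toArray(), "property.hotel5m", "insert")`.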
Solution Optimization
Now, let's try optimizing by removing the star index and creating a partial index for hotels with three stars and above.
Step 1: Remove the single-field star index
You can easily remove an index using MongoDB Compass. As shown in the screenshot below, click the trash can icon in the red box and type the index name into the field in the confirmation dialog that appears.
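If you prefer the shell to Compass, the same index can be dropped by name. This is a sketch under one assumption: the index was created without a custom name, so it carries MongoDB's default name (field and direction joined by underscores, here `star_1`). Both helper names are mine.

```javascript
// Builds the default index name from a key spec, e.g. { star: 1 } -> "star_1".
function defaultIndexName(keySpec) {
  return Object.entries(keySpec)
    .map(([field, direction]) => `${field}_${direction}`)
    .join("_");
}

// Drops the single-field index on `star` from the given collection.
function dropStarIndex(collection) {
  return collection.dropIndex(defaultIndexName({ star: 1 }));
}
```

In mongosh this would be `dropStarIndex(db.hotel5m)`, which boils down to `db.hotel5m.dropIndex("star_1")`.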


Step 2: Verify write performance again
At this stage, we have removed the star index. Let's insert some new data into the collection.

On average, write performance improved by 20 to 30% over several trials. So creating a partial index might be worth the effort if we really care about write performance for this collection.
However, removing this index comes with a much bigger drawback: queries that search or sort by star rating become very slow. Refer to the screenshot below.

We searched for 5-star hotels in Indonesia. There are 20,546 hotels in Indonesia, but only 4,264 of them are 5-star hotels, which is only about 20% of the documents examined. Imagine doing $100 worth of work but getting paid only $20 at the end of the day. Isn't that frustrating?
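The "about 20%" figure is simply documents returned divided by documents examined. A quick arithmetic check using the two numbers quoted above (the variable names are mine):

```javascript
// Numbers taken from the query described above.
const docsExamined = 20546; // hotels in Indonesia that were scanned
const docsReturned = 4264;  // 5-star hotels actually returned

// Share of examined documents that ended up in the result.
const efficiency = docsReturned / docsExamined;
console.log(`${(efficiency * 100).toFixed(1)}% of examined documents were returned`); // 20.8%
```

In mongosh, these two numbers would typically come from the `nReturned` and `totalDocsExamined` fields of `explain("executionStats")`; this article reads them off Compass instead.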
Besides, the query execution time is 55 ms. You might think 55 ms is reasonable, but trust me, you can do better.
Let's create a partial compound index in the next step and see whether things get better!
Step 3: Create a partial compound index for hotels with 3 stars and above
Creating a partial index is very simple in this scenario. You just have to specify the condition a document must meet in order to be indexed. All of this can be done with a single command.
db.hotel5m.createIndex({
  country: 1,
  star: 1
}, {
  partialFilterExpression: {
    star: { $gte: 3 }
  }
});
This command tells MongoDB to create an index on the country and star fields; the 1 in star: 1 means ascending order. The partialFilterExpression restricts the index to documents whose star is greater than or equal to 3. Pretty straightforward, right?
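One thing worth keeping in mind: MongoDB only uses a partial index when doing so cannot produce an incomplete result, so the query itself has to restrict star to values the index covers. The sketch below is a deliberately simplified illustration of that rule for this one filter. It is not MongoDB's query planner, and `canUsePartialIndex` is a hypothetical helper that handles only equality and `$gte` conditions on star.

```javascript
// Minimum star rating covered by the partial index ({ star: { $gte: 3 } }).
const PARTIAL_MIN_STAR = 3;

// Simplified check: a query can rely on the partial index only if every
// document it could match also satisfies star >= 3.
function canUsePartialIndex(query) {
  const star = query.star;
  if (star === undefined) return false; // no star condition: low-star hotels could match
  if (typeof star === "number") return star >= PARTIAL_MIN_STAR; // equality match
  if (typeof star === "object" && star !== null && typeof star.$gte === "number") {
    return star.$gte >= PARTIAL_MIN_STAR; // range that starts inside the indexed subset
  }
  return false; // other operators are out of scope for this sketch
}

console.log(canUsePartialIndex({ country: "Indonesia", star: 5 }));           // true
console.log(canUsePartialIndex({ country: "Indonesia", star: { $gte: 4 } })); // true
console.log(canUsePartialIndex({ country: "Indonesia", star: 2 }));           // false
console.log(canUsePartialIndex({ country: "Indonesia" }));                    // false
```

The last two queries would have to fall back to another index or a collection scan, which is the trade-off of indexing only three stars and above.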
After you run the command above, you will be able to see the partial index in MongoDB Compass, as in the screenshot below.

Step 4: Verify the Write and Read Performance
We created the partial index in the previous step. Next, let's verify the performance with the partial index in place.
On average, write performance improved compared to the initial 16 ms. In the following screenshot, we even achieved 3 ms. Write performance is a bit hard to assess at this stage because it fluctuates.

However, read performance improved significantly. Aside from the shorter query execution time, the efficiency also rose to 100%. Refer to the screenshot below.

On average, we achieved a query execution time of 13 ms, roughly 4 times faster than the 55 ms in the previous step. The query was also very efficient: the ratio between documents returned and index keys examined is 1, which is superb. The read performance is optimized.
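For reference, here is the arithmetic on the two timings quoted in this article, 55 ms without an index on star and 13 ms with the partial index. The variable names are mine.

```javascript
// Query execution times quoted in the article, in milliseconds.
const beforeMs = 55; // no index on star
const afterMs = 13;  // with the partial compound index

const speedup = beforeMs / afterMs;                // how many times faster
const reduction = (beforeMs - afterMs) / beforeMs; // share of time saved

console.log(`${speedup.toFixed(1)}x faster`);                  // 4.2x faster
console.log(`${(reduction * 100).toFixed(1)}% less time`);     // 76.4% less time
```

These are single-machine averages from a handful of runs, so treat the ratio as indicative rather than exact.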
Conclusion
After doing this experiment, I found that the indexes had only a small impact on write performance. The impact was not significant.
However, it really depends on the number of indexes you have in a collection and on how frequently you write to it. For example, if you frequently update or insert into a collection, you might be more concerned about write performance than read performance.
Besides, if your business needs allow you to index only part of your data instead of everything, then creating a partial index is the practice I would recommend.
That's the end of this article. I hope it gives you some understanding of partial indexes and a hint on when and how you should implement one.
Thanks for reading.
References
- MongoDB Partial Index Documentation
- MongoDB Write Performance Documentation
- MongoDB Database Profiler Manual