
50 suggestions for MongoDB developers Tip22

Updated 2025-07-11. From: SLTechnology News&Howtos


This series of articles is translated from "50 Tips and Tricks for MongoDB Developers". I could not find a Chinese edition, and since I have been studying MongoDB in depth recently, I decided to translate it myself. Doing so reinforces my own learning and also lets everyone see what we MongoDB users need to pay attention to.

A caveat first: my English is not especially strong, and some English terms have no good Chinese equivalent, so the original English words may appear in the article, and the translation may be stiff or overly literal in places. The main purpose of translating this book is for everyone to learn and explore. If you find an inaccuracy or know a better rendering, please point it out and I will correct it promptly. Thank you in advance.

Tip #22. Use indexes to do more with less memory

Figure 22.1 is a schematic diagram of a query request.

Figure 22.1 query flow chart

Suppose you have a machine with 256 GB of data and 16 GB of memory. Most of the data is in one collection, and you run a query against that collection, which has no index on the queried field. What does MongoDB do?

Suppose each page holds 4 KB of data.

MongoDB loads the first page from disk into memory and compares its documents with your query. Then it loads the next page and compares again, and so on, until it has read the entire 256 GB of data. There is no shortcut: without looking at a document, MongoDB cannot know whether it matches the query, so it must look at every document to find the ones that meet the query criteria. Therefore all 256 GB of data must pass through memory (with only 16 GB of memory, the operating system has to keep evicting old pages to make room for new ones). This takes a long time.
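The scan described above can be modeled in a few lines of Python. This is only a toy sketch, not MongoDB's storage engine: the page size in documents (`PAGE_SIZE_DOCS`), the sample documents, and the helper names `make_pages` and `collection_scan` are all assumptions made for illustration. It shows that a query on an unindexed field reads every page, even when only one document matches.

```python
# Toy model of a collection scan: documents are grouped into fixed-size
# "pages", and a query on an unindexed field has to read every page.
# PAGE_SIZE_DOCS and the document layout are assumptions for illustration.

PAGE_SIZE_DOCS = 4  # hypothetical: number of documents stored per page


def make_pages(docs, per_page=PAGE_SIZE_DOCS):
    """Split a list of documents into consecutive pages."""
    return [docs[i:i + per_page] for i in range(0, len(docs), per_page)]


def collection_scan(pages, field, value):
    """Return the matching documents and the number of pages read."""
    matches = []
    pages_read = 0
    for page in pages:
        pages_read += 1  # every page must be loaded to inspect its documents
        for doc in page:
            if doc.get(field) == value:
                matches.append(doc)
    return matches, pages_read


# 100 sample documents whose x values are all distinct.
docs = [{"_id": i, "x": (i * 7) % 100} for i in range(100)]
pages = make_pages(docs)
matches, pages_read = collection_scan(pages, "x", 42)

print("pages in collection:", len(pages))
print("pages read by the scan:", pages_read)
print("matching _id values:", [d["_id"] for d in matches])
```

The number of pages read always equals the number of pages in the collection, no matter how few documents match.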

How can you avoid loading all 256 GB of data into memory for a single query? You can tell MongoDB to create an index on field x, and MongoDB will build a tree to hold the values of that field. MongoDB preprocesses the data, adding the x value of every document in the collection to an ordered tree (Figure 22.2). Each node of the tree contains a value of x and a pointer to the document that has that value.

Figure 22.2 ordered B-tree

The tree contains only pointers to the documents, not the documents themselves, so the index is much smaller than the entire collection.

When your query includes the x field, MongoDB notices that the query condition involves the indexed field x and looks up the value in the ordered tree. Now it no longer has to examine every document. At each node MongoDB asks, "Is the value I am looking for greater than the value of the current node, or less?" If it is greater, the search continues to the right; if it is less, it continues to the left. The search proceeds this way until it finds a node holding the requested value. MongoDB then follows that node's pointer to the corresponding document and returns it (Figure 22.3).
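The idea of an ordered structure of (value, pointer) pairs can be sketched in Python as follows. This is a simplified stand-in, not MongoDB's actual B-tree: it uses a sorted list with binary search from the standard `bisect` module, and the sample documents and the helper name `find_by_x` are assumptions for illustration. A list position plays the role of the pointer to a document.

```python
import bisect

# Hypothetical collection: the position in this list stands in for the
# document's location on disk.
docs = [{"_id": i, "x": (i * 7) % 100} for i in range(100)]

# The "index" on x: sorted (value, pointer) pairs. It holds no documents,
# only the indexed value and where to find the document.
index = sorted((doc["x"], pos) for pos, doc in enumerate(docs))
keys = [value for value, _ in index]  # the sorted values, for bisect


def find_by_x(value):
    """Find documents with x == value by binary search over the index."""
    lo = bisect.bisect_left(keys, value)   # first entry with key >= value
    hi = bisect.bisect_right(keys, value)  # first entry with key > value
    # Follow each pointer to fetch the actual document.
    return [docs[pos] for _, pos in index[lo:hi]]


print(find_by_x(42))
```

Each binary-search step halves the remaining range, which mirrors the "go left or go right" decision at each tree node; only the documents that match are fetched at the end.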

Figure 22.3 An index whose entries hold values and point to documents

If we don't use the index, we need to load 64 million pages of data into memory.

Pages of data: 256 GB / (4 KB/page) = 64 million pages

Suppose our index is 80 GB in size. Then the index occupies 20 million pages.

Number of pages in our index: 80 GB / (4 KB/page) = 20 million pages

Because the index is ordered, we don't have to look at every entry. We only need to load the specific nodes along the search path. So how many pages is that?

Number of index pages that must be loaded into memory: ln(20,000,000) ≈ 17 pages

That is a drop from 64 million pages to 17 pages!
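The page counts can be checked with a short Python calculation. This sketch assumes binary units (1 KB = 1024 bytes, 1 GB = 1024^3 bytes) and takes the natural logarithm of the index page count as the rough estimate of pages touched. In a real B-tree the depth depends on how many keys fit in each node, so treat the result as an order-of-magnitude figure.

```python
import math

KB = 1024
GB = 1024 ** 3

PAGE_SIZE = 4 * KB   # 4 KB per page, as assumed in the text
DATA_SIZE = 256 * GB
INDEX_SIZE = 80 * GB

data_pages = DATA_SIZE // PAGE_SIZE    # pages in the whole data set
index_pages = INDEX_SIZE // PAGE_SIZE  # pages in the index

# Rough estimate: natural log of the index page count, rounded up.
index_pages_read = math.ceil(math.log(index_pages))

print(f"data pages:       {data_pages:,}")   # 67,108,864 (about 64 million)
print(f"index pages:      {index_pages:,}")  # 20,971,520 (about 20 million)
print(f"index pages read: {index_pages_read}")  # 17
```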

Of course, it is not just those 17 pages. Once we find the result in the index, we need to load the corresponding document into memory, so we also have to load the pages that hold the document. Compared with loading the whole collection, that is still far less.

I hope you can imagine how much indexes can help speed up queries.

