Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Example Analysis of Index in MongoDB

2025-04-07 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)05/31 Report--

This article mainly introduces the example analysis of the index in MongoDB, which is very detailed and has certain reference value. Friends who are interested must finish it!

What on earth is an index?

Most developers come into contact with the index, probably know the index similar to the book catalog, you need to find the content you want, through the directory to find the qualified keywords, and then find the corresponding chapter pageno, and then find the specific content.

In the data structure, the simplest index implementation is similar to hashmap, which maps to a specific location and finds specific content through the keyword key. But in addition to the hash approach, there are many ways to implement an index.

(1) various implementation methods and characteristics of the index

Hash / b-tree / b+-tree redis HSET / MongoDB&PostgreSQL / MySQL

Hashmap

In one picture, see the difference between b-tree and b+-tree:

B+-tree leaves store data, non-leaves store indexes, no data, and there is link between leaves.

B-tree non-leaf storable data

In terms of algorithm search complexity:

Hash close to O (1)

B-tree O (1) ~ O (Log (n)) faster average search time, unstable query time

B + tree O (Log (n)) continuous data, stability of query

As for why the implementation of MongoDB chose b-tree over b+-tree?

There are many articles on the Internet, which are not the focus of this article.

(2) Storage of data & index

Index is stored in memory as much as possible, followed by data.

Note that only the necessary index is retained, and the memory is used on the blade as much as possible.

If the index memory is close to filling up the memory, it's easy to read the disk and the speed slows down.

(3) knowing the realization of the index & thinking after the storage principle

Insert/update/delete will trigger rebalance tree, so, add, delete and modify data, the index will trigger modification, and the performance will be compromised. The more indexes, the better. In that case, which fields should be selected as the index? What happens when these conditions are used in the query?

Take the simplest hashmap, why complexity is not O (1), but so-called close to O (1). Because there is a key conflict / repetition, when DB goes to find it, if there is a lot of key conflict data, it is still the turn to continue to look for it. The same is true of b-tree 's choice of key.

So a common mistake made by most developers is to index undifferentiated key. For example: many only centralized categories of type/status documents count up to hundreds of thousands of collection, usually this kind of index is not helpful.

II. Composite indexes (1) Composite indexes are not as many as possible

If you don't want to build extra indexes, your developer colleagues sometimes struggle with the choice of composite & single fields. According to the typical scenario, do a few experiments:

A loans collection is created here. The simplification has only 100 pieces of data. This is the list of loans: _ id, userId, status (loan status), amount (amount).

Db.loans.count () 100

Db.loans.find ({"userId": "59e022d33f239800129c61c7", "status": "repayed",}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false, "parsedQuery": {"$and": [{"status": {"$eq": "repayed"}} {"userId": {"$eq": "59e022d33f239800129c61c7"}]}, "queryHash": "15D5A9A1", "planCacheKey": "15D5A9A1", "winningPlan": {"stage": "COLLSCAN", "filter": {"$and": [{"status": {"$eq": "repayed"} {"userId": {"$eq": "59e022d33f239800129c61c7"}]}, "direction": "forward"}, "rejectedPlans": []}, "serverInfo": {"host": "RMBAP", "port": 27017, "version": "4.1.11", "gitVersion": "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"}, "ok": 1}

Notice that the COLLSCAN table above is scanned because there is no index. Next, let's set up several indexes.

Step 1 first establishes {userId:1, status:1}

Db.loans.createIndex ({userId:1, status:1}) {"createdCollectionAutomatically": false, "numIndexesBefore": 1, "numIndexesAfter": 2, "ok": 1} db.loans.find ({"userId": "59e022d33f239800129c61c7", "status": "repayed",}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false "parsedQuery": {"$and": [{"status": {"$eq": "repayed"}, {"userId": {"$eq": "59e022d33f239800129c61c7"}]}, "queryHash": "15D5A9A1", "planCacheKey": "BB87F2BA", "winningPlan": {"stage": "FETCH" "inputStage": {"stage": "IXSCAN", "keyPattern": {"userId": 1, "status": 1}, "indexName": "userId_1_status_1", "isMultiKey": false, "multiKeyPaths": {"userId": [], "status": []}, "isUnique": false "isSparse": false, "isPartial": false, "indexVersion": 2, "direction": "forward", "indexBounds": {"userId": ["[" 59e022d33f239800129c61c7 "," 59e022d33f239800129c61c7 "]"], "status": ["[" repayed "," repayed "]"]}}, "rejectedPlans": []} "serverInfo": {"host": "RMBAP", "port": 27017, "version": "4.1.11", "gitVersion": "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"}, "ok": 1}

Result: hit {userId:1, status:1} as winning plan as desired.

Step2: create a typical index userId

Db.loans.createIndex ({userId:1}) {"createdCollectionAutomatically": false, "numIndexesBefore": 2, "numIndexesAfter": 3, "ok": 1} db.loans.find ({"userId": "59e022d33f239800129c61c7", "status": "repayed",}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false "parsedQuery": {"$and": [{"status": {"$eq": "repayed"}, {"userId": {"$eq": "59e022d33f239800129c61c7"}]}, "queryHash": "15D5A9A1", "planCacheKey": "1B1A4861", "winningPlan": {"stage": "FETCH" "inputStage": {"stage": "IXSCAN", "keyPattern": {"userId": 1, "status": 1}, "indexName": "userId_1_status_1", "isMultiKey": false, "multiKeyPaths": {"userId": [], "status": []}, "isUnique": false "isSparse": false, "isPartial": false, "indexVersion": 2, "direction": "forward", "indexBounds": {"userId": ["[\" 59e022d33f239800129c61c7\ ",\" 59e022d33f239800129c61c7\ "]"], "status": ["[\" repayed\ ",\" repayed\ "]]} "rejectedPlans": [{"stage": "FETCH", "filter": {"status": {"$eq": "repayed"}}, "inputStage": {"stage": "IXSCAN", "keyPattern": {"userId": 1}, "indexName": "userId_1", "isMultiKey": false "multiKeyPaths": {"userId": []}, "isUnique": false, "isSparse": false, "isPartial": false, "indexVersion": 2, "direction": "forward", "indexBounds": {"userId": ["59e022d33f239800129c61c7" "59e022d33f239800129c61c7"] "]}}]}," serverInfo ": {" host ":" RMBAP "," port ": 27017," version ":" 4.1.11 "," gitVersion ":" 1b8a9f5dc5c3314042b55e7415a2a25045b32a94 "}," ok ": 1}

Notice that DB detected that {userId:1, status:1} is a better solution.

Db.loans.find ({"userId": "59e022d33f239800129c61c7"}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false, "parsedQuery": {"userId": {"$eq": "59e022d33f239800129c61c7"}}, "queryHash": "B1777DBA", "planCacheKey": "1F09D68E", "winningPlan": {"stage": "FETCH" "inputStage": {"stage": "IXSCAN", "keyPattern": {"userId": 1}, "indexName": "userId_1", "isMultiKey": false, "multiKeyPaths": {"userId": []}, "isUnique": false, "isSparse": false, "isPartial": false, "indexVersion": 2 "direction": "forward", "indexBounds": {"userId": ["[" 59e022d33f239800129c61c7 "," 59e022d33f239800129c61c7 "]]}," rejectedPlans ": [{" stage ":" FETCH "," inputStage ": {" stage ":" IXSCAN "," keyPattern ": {" userId ": 1 "status": 1}, "indexName": "userId_1_status_1", "isMultiKey": false, "multiKeyPaths": {"userId": [], "status": []}, "isUnique": false, "isSparse": false, "isPartial": false, "indexVersion": 2 "direction": "forward", "indexBounds": {"userId": ["59e022d33f239800129c61c7", "59e022d33f239800129c61c7"] "]," status ": [" [MinKey, MaxKey] "]}}]}," serverInfo ": {" host ":" RMBAP "," port ": 27017 "version": "4.1.11", "gitVersion": "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"}, "ok": 1}

Notice that DB detected {userId:1} as a better implementation scheme, um, as we expected.

Db.loans.find ({"status": "repayed"}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false, "parsedQuery": {"status": {"$eq": "repayed"}}, "queryHash": "E6304EB6", "planCacheKey": "7A94191B", "winningPlan": {"stage": "COLLSCAN" "filter": {"status": {"$eq": "repayed"}, "direction": "forward"}, "rejectedPlans": []}, "serverInfo": {"host": "RMBAP", "port": 27017, "version": "4.1.11", "gitVersion": "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"}, "ok": 1}

Interesting part: status misses the index, full table scan

For the next step, add a sort:

Db.loans.find ({"userId": "59e022d33f239800129c61c7"}). Sort ({status:1}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false, "parsedQuery": {"userId": {"$eq": "59e022d33f239800129c61c7"}}, "queryHash": "F5ABB1AA", "planCacheKey": "764CBAA8" WinningPlan: {"stage": "FETCH", "inputStage": {"stage": "IXSCAN", "keyPattern": {"userId": 1, "status": 1}, "indexName": "userId_1_status_1", "isMultiKey": false, "multiKeyPaths": {"userId": [] "status": []}, "isUnique": false, "isSparse": false, "isPartial": false, "indexVersion": 2, "direction": "forward", "indexBounds": {"userId": ["[" 59e022d33f239800129c61c7 "," 59e022d33f239800129c61c7 "]"], "status": [[MinKey " MaxKey] "]}}," rejectedPlans ": [{" stage ":" SORT "," sortPattern ": {" status ": 1}," inputStage ": {" stage ":" SORT_KEY_GENERATOR "," inputStage ": {" stage ":" FETCH "," inputStage ": {" stage ":" IXSCAN " "keyPattern": {"userId": 1}, "indexName": "userId_1", "isMultiKey": false, "multiKeyPaths": {"userId": []}, "isUnique": false, "isSparse": false, "isPartial": false "indexVersion": 2, "direction": "forward", "indexBounds": {"userId": ["[" 59e022d33f239800129c61c7 "," 59e022d33f239800129c61c7 "]]}}]}}," serverInfo ": {" host ":" RMBAP "," port ": 27017 "version": "4.1.11", "gitVersion": "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"}, "ok": 1} (II) other attempts

Interesting part: status misses the index

Db.loans.find ({"status": "repayed", "userId": "59e022d33f239800129c61c7",}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false, "parsedQuery": {"$and": [{"status": {"$eq": "repayed"}} {"userId": {"$eq": "59e022d33f239800129c61c7"}]}, "queryHash": "15D5A9A1", "planCacheKey": "1B1A4861", "winningPlan": {"stage": "FETCH", "inputStage": {"stage": "IXSCAN", "keyPattern": {"userId": 1, "status": 1} "indexName": "userId_1_status_1", "isMultiKey": false, "multiKeyPaths": {"userId": [], "status": []}, "isUnique": false, "isSparse": false, "isPartial": false, "indexVersion": 2, "direction": "forward" "indexBounds": {"userId": ["[\" 59e022d33f239800129c61c7\ ",\" 59e022d33f239800129c61c7\ "]," status ": [" [\ "repayed\",\ "repayed\"]]}, "rejectedPlans": [{"stage": "FETCH" "filter": {"status": {"$eq": "repayed"}}, "inputStage": {"stage": "IXSCAN", "keyPattern": {"userId": 1}, "indexName": "userId_1", "isMultiKey": false "multiKeyPaths": {"userId": []}, "isUnique": false, "isSparse": false, "isPartial": false, "indexVersion": 2, "direction": "forward", "indexBounds": {"userId": ["59e022d33f239800129c61c7" "59e022d33f239800129c61c7"] "]}}]}," serverInfo ": {" host ":" RMBAP "," port ": 27017," version ":" 4.1.11 "," gitVersion ":" 1b8a9f5dc5c3314042b55e7415a2a25045b32a94 "}," ok ": 1}

Hitting the index has nothing to do with the order of the fields in query, as we guessed.

In the interesting part, we delete the index {userId:1}.

Db.loans.dropIndex ({"userId": 1}) {"nIndexesWas": 3, "ok": 1} db.loans.find ({"userId": "59e022d33f239800129c61c7",}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false, "parsedQuery": {"userId": {"$eq": "59e022d33f239800129c61c7"}, "queryHash": "B1777DBA" "planCacheKey": "5776AB9C", "winningPlan": {"stage": "FETCH", "inputStage": {"stage": "IXSCAN", "keyPattern": {"userId": 1, "status": 1}, "indexName": "userId_1_status_1", "isMultiKey": false, "multiKeyPaths": {"userId": [] "status": []}, "isUnique": false, "isSparse": false, "isPartial": false, "indexVersion": 2, "direction": "forward", "indexBounds": {"userId": ["[" 59e022d33f239800129c61c7 "," 59e022d33f239800129c61c7 "]"], "status": [[MinKey " MaxKey] "]}}," rejectedPlans ": []}," serverInfo ": {" host ":" RMBAP "," port ": 27017," version ":" 4.1.11 "," gitVersion ":" 1b8a9f5dc5c3314042b55e7415a2a25045b32a94 "}," ok ": 1}

The DB execution analyzer thinks that the index {userId:1, status:1} is better and misses the composite index because status is not leading field.

Db.loans.find ({"status": "repayed"}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false, "parsedQuery": {"status": {"$eq": "repayed"}}, "queryHash": "E6304EB6", "planCacheKey": "7A94191B", "winningPlan": {"stage": "COLLSCAN" "filter": {"status": {"$eq": "repayed"}, "direction": "forward"}, "rejectedPlans": []}, "serverInfo": {"host": "RMBAP", "port": 27017, "version": "4.1.11", "gitVersion": "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"}, "ok": 1}

Change the angle sort again, and swap with the previous query & sort, which used to be:

Db.loans.find ({userId:1}) .sort ({"status": "repayed"})

See what's the difference?

Db.loans.find ({"status": "repayed"}). Sort ({userId:1}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false, "parsedQuery": {"status": {"$eq": "repayed"}}, "queryHash": "56EA6313", "planCacheKey": "2CFCDA7F" "winningPlan": {"stage": "FETCH", "filter": {"status": {"$eq": "repayed"}, "inputStage": {"stage": "IXSCAN", "keyPattern": {"userId": 1, "status": 1}, "indexName": "userId_1_status_1", "isMultiKey": false "multiKeyPaths": {"userId": [], "status": []}, "isUnique": false, "isSparse": false, "isPartial": false, "indexVersion": 2, "direction": "forward", "indexBounds": {"userId": ["[MinKey, MaxKey]"] "status": ["[MinKey, MaxKey]"]}}, "rejectedPlans": []}, "serverInfo": {"host": "RMBAP", "port": 27017, "version": "4.1.11", "gitVersion": "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"}, "ok": 1}

Such as guessing, hit the index.

Let's play again and confirm the leading filed test:

Db.loans.dropIndex ("userId_1_status_1") {"nIndexesWas": 2, "ok": 1} db.loans.getIndexes () [{"v": 2, "key": {"id": 1}, "name": "id_", "ns": "cashLoan.loans"}] db.loans.createIndex ({status:1, userId:1}) {"createdCollectionAutomatically": false, "numIndexesBefore": 1, "numIndexesAfter": 2 "ok": 1} db.loans.getIndexes () [{"v": 2, "key": {"id": 1}, "name": "id_", "ns": "cashLoan.loans"}, {"v": 2, "key": {"status": 1, "userId": 1}, "name": "status_1_userId_1" "ns": "cashLoan.loans"}] db.loans.find ({"status": "repayed"}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false, "parsedQuery": {"status": {"$eq": "repayed"}}, "queryHash": "E6304EB6", "planCacheKey": "7A94191B" WinningPlan: {"stage": "FETCH", "inputStage": {"stage": "IXSCAN", "keyPattern": {"status": 1, "userId": 1}, "indexName": "status_1_userId_1", "isMultiKey": false, "multiKeyPaths": {"status": [] "userId": []}, "isUnique": false, "isSparse": false, "isPartial": false, "indexVersion": 2, "direction": "forward", "indexBounds": {"status": ["[" repayed "," repayed "]"], "userId": [[MinKey " MaxKey] "]}}," rejectedPlans ": []}," serverInfo ": {" host ":" RMBAP "," port ": 27017," version ":" 4.1.11 "," gitVersion ":" 1b8a9f5dc5c3314042b55e7415a2a25045b32a94 "}," ok ": 1} db.loans.getIndexes () [{" v ": 2," key ": {" id ": 1}," name ":" id_ " "ns": "cashLoan.loans"}, {"v": 2, "key": {"status": 1, "userId": 1}, "name": "status_1_userId_1", "ns": "cashLoan.loans"}] db.loans.find ({"userId": "59e022d33f239800129c61c7",}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false "parsedQuery": {"userId": {"$eq": "59e022d33f239800129c61c7"}, "queryHash": "B1777DBA", "planCacheKey": "5776AB9C", "winningPlan": {"stage": "COLLSCAN", "filter": {"userId": {"$eq": "59e022d33f239800129c61c7"}}, "direction": "forward"}, "rejectedPlans": []} "serverInfo": {"host": "RMBAP", "port": 27017, "version": "4.1.11", "gitVersion": "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"}, "ok": 1}

After watching this experiment, do you understand the difference between {userId:1, status:1} vs {status:1,userId:1}?

PS: actually, the differentiation of status in this case is not high. It is only shown here as an example.

III. Summary:

Pay attention to use, use frequency, distinguish between high / commonly used ones in front

If you need to reduce indexes to save memory/ to improve the performance of modified data, you can retain highly differentiated, commonly used, undifferentiated and uncommonly used indexes.

Learn to verify analysis performance with explain ():

DB generally has the analysis of actuator optimization, and MySQL & MongoDB use explain to do the analysis.

Syntactically MySQL:

Explain your_sql

MongoDB:

Yoursql.explain ()

Summarize the typical: the ideal query is to combine explain metrics, and they are usually a mixture of multiple:

IXSCAN: index hit

Limit: with limit

Projection: equivalent to non-select *

Docs Size less is better

Docs Examined less is better

NReturned=totalDocsExamined=totalKeysExamined

SORT in index: sort also hits the index. Otherwise, you need to sort the data again after getting the data.

Limit Array elements: limit the number of entries returned by the array, and the array should not have too much data, otherwise the schema design is unreasonable.

The above is all the contents of the article "sample Analysis of Indexes in MongoDB". Thank you for reading! Hope to share the content to help you, more related knowledge, welcome to follow the industry information channel!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report