In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-04-07 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
This article mainly introduces the example analysis of the index in MongoDB, which is very detailed and has certain reference value. Friends who are interested must finish it!
What on earth is an index?
Most developers come into contact with the index, probably know the index similar to the book catalog, you need to find the content you want, through the directory to find the qualified keywords, and then find the corresponding chapter pageno, and then find the specific content.
In the data structure, the simplest index implementation is similar to hashmap, which maps to a specific location and finds specific content through the keyword key. But in addition to the hash approach, there are many ways to implement an index.
(1) various implementation methods and characteristics of the index
Hash / b-tree / b+-tree redis HSET / MongoDB&PostgreSQL / MySQL
Hashmap
In one picture, see the difference between b-tree and b+-tree:
B+-tree leaves store data, non-leaves store indexes, no data, and there is link between leaves.
B-tree non-leaf storable data
In terms of algorithm search complexity:
Hash close to O (1)
B-tree O (1) ~ O (Log (n)) faster average search time, unstable query time
B + tree O (Log (n)) continuous data, stability of query
As for why the implementation of MongoDB chose b-tree over b+-tree?
There are many articles on the Internet, which are not the focus of this article.
(2) Storage of data & index
Index is stored in memory as much as possible, followed by data.
Note that only the necessary index is retained, and the memory is used on the blade as much as possible.
If the index memory is close to filling up the memory, it's easy to read the disk and the speed slows down.
(3) knowing the realization of the index & thinking after the storage principle
Insert/update/delete will trigger rebalance tree, so, add, delete and modify data, the index will trigger modification, and the performance will be compromised. The more indexes, the better. In that case, which fields should be selected as the index? What happens when these conditions are used in the query?
Take the simplest hashmap, why complexity is not O (1), but so-called close to O (1). Because there is a key conflict / repetition, when DB goes to find it, if there is a lot of key conflict data, it is still the turn to continue to look for it. The same is true of b-tree 's choice of key.
So a common mistake made by most developers is to index undifferentiated key. For example: many only centralized categories of type/status documents count up to hundreds of thousands of collection, usually this kind of index is not helpful.
II. Composite indexes (1) Composite indexes are not as many as possible
If you don't want to build extra indexes, your developer colleagues sometimes struggle with the choice of composite & single fields. According to the typical scenario, do a few experiments:
A loans collection is created here. The simplification has only 100 pieces of data. This is the list of loans: _ id, userId, status (loan status), amount (amount).
Db.loans.count () 100
Db.loans.find ({"userId": "59e022d33f239800129c61c7", "status": "repayed",}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false, "parsedQuery": {"$and": [{"status": {"$eq": "repayed"}} {"userId": {"$eq": "59e022d33f239800129c61c7"}]}, "queryHash": "15D5A9A1", "planCacheKey": "15D5A9A1", "winningPlan": {"stage": "COLLSCAN", "filter": {"$and": [{"status": {"$eq": "repayed"} {"userId": {"$eq": "59e022d33f239800129c61c7"}]}, "direction": "forward"}, "rejectedPlans": []}, "serverInfo": {"host": "RMBAP", "port": 27017, "version": "4.1.11", "gitVersion": "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"}, "ok": 1}
Notice that the COLLSCAN table above is scanned because there is no index. Next, let's set up several indexes.
Step 1 first establishes {userId:1, status:1}
Db.loans.createIndex ({userId:1, status:1}) {"createdCollectionAutomatically": false, "numIndexesBefore": 1, "numIndexesAfter": 2, "ok": 1} db.loans.find ({"userId": "59e022d33f239800129c61c7", "status": "repayed",}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false "parsedQuery": {"$and": [{"status": {"$eq": "repayed"}, {"userId": {"$eq": "59e022d33f239800129c61c7"}]}, "queryHash": "15D5A9A1", "planCacheKey": "BB87F2BA", "winningPlan": {"stage": "FETCH" "inputStage": {"stage": "IXSCAN", "keyPattern": {"userId": 1, "status": 1}, "indexName": "userId_1_status_1", "isMultiKey": false, "multiKeyPaths": {"userId": [], "status": []}, "isUnique": false "isSparse": false, "isPartial": false, "indexVersion": 2, "direction": "forward", "indexBounds": {"userId": ["[" 59e022d33f239800129c61c7 "," 59e022d33f239800129c61c7 "]"], "status": ["[" repayed "," repayed "]"]}}, "rejectedPlans": []} "serverInfo": {"host": "RMBAP", "port": 27017, "version": "4.1.11", "gitVersion": "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"}, "ok": 1}
Result: hit {userId:1, status:1} as winning plan as desired.
Step2: create a typical index userId
Db.loans.createIndex ({userId:1}) {"createdCollectionAutomatically": false, "numIndexesBefore": 2, "numIndexesAfter": 3, "ok": 1} db.loans.find ({"userId": "59e022d33f239800129c61c7", "status": "repayed",}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false "parsedQuery": {"$and": [{"status": {"$eq": "repayed"}, {"userId": {"$eq": "59e022d33f239800129c61c7"}]}, "queryHash": "15D5A9A1", "planCacheKey": "1B1A4861", "winningPlan": {"stage": "FETCH" "inputStage": {"stage": "IXSCAN", "keyPattern": {"userId": 1, "status": 1}, "indexName": "userId_1_status_1", "isMultiKey": false, "multiKeyPaths": {"userId": [], "status": []}, "isUnique": false "isSparse": false, "isPartial": false, "indexVersion": 2, "direction": "forward", "indexBounds": {"userId": ["[\" 59e022d33f239800129c61c7\ ",\" 59e022d33f239800129c61c7\ "]"], "status": ["[\" repayed\ ",\" repayed\ "]]} "rejectedPlans": [{"stage": "FETCH", "filter": {"status": {"$eq": "repayed"}}, "inputStage": {"stage": "IXSCAN", "keyPattern": {"userId": 1}, "indexName": "userId_1", "isMultiKey": false "multiKeyPaths": {"userId": []}, "isUnique": false, "isSparse": false, "isPartial": false, "indexVersion": 2, "direction": "forward", "indexBounds": {"userId": ["59e022d33f239800129c61c7" "59e022d33f239800129c61c7"] "]}}]}," serverInfo ": {" host ":" RMBAP "," port ": 27017," version ":" 4.1.11 "," gitVersion ":" 1b8a9f5dc5c3314042b55e7415a2a25045b32a94 "}," ok ": 1}
Notice that DB detected that {userId:1, status:1} is a better solution.
Db.loans.find ({"userId": "59e022d33f239800129c61c7"}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false, "parsedQuery": {"userId": {"$eq": "59e022d33f239800129c61c7"}}, "queryHash": "B1777DBA", "planCacheKey": "1F09D68E", "winningPlan": {"stage": "FETCH" "inputStage": {"stage": "IXSCAN", "keyPattern": {"userId": 1}, "indexName": "userId_1", "isMultiKey": false, "multiKeyPaths": {"userId": []}, "isUnique": false, "isSparse": false, "isPartial": false, "indexVersion": 2 "direction": "forward", "indexBounds": {"userId": ["[" 59e022d33f239800129c61c7 "," 59e022d33f239800129c61c7 "]]}," rejectedPlans ": [{" stage ":" FETCH "," inputStage ": {" stage ":" IXSCAN "," keyPattern ": {" userId ": 1 "status": 1}, "indexName": "userId_1_status_1", "isMultiKey": false, "multiKeyPaths": {"userId": [], "status": []}, "isUnique": false, "isSparse": false, "isPartial": false, "indexVersion": 2 "direction": "forward", "indexBounds": {"userId": ["59e022d33f239800129c61c7", "59e022d33f239800129c61c7"] "]," status ": [" [MinKey, MaxKey] "]}}]}," serverInfo ": {" host ":" RMBAP "," port ": 27017 "version": "4.1.11", "gitVersion": "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"}, "ok": 1}
Notice that DB detected {userId:1} as a better implementation scheme, um, as we expected.
Db.loans.find ({"status": "repayed"}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false, "parsedQuery": {"status": {"$eq": "repayed"}}, "queryHash": "E6304EB6", "planCacheKey": "7A94191B", "winningPlan": {"stage": "COLLSCAN" "filter": {"status": {"$eq": "repayed"}, "direction": "forward"}, "rejectedPlans": []}, "serverInfo": {"host": "RMBAP", "port": 27017, "version": "4.1.11", "gitVersion": "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"}, "ok": 1}
Interesting part: status misses the index, full table scan
For the next step, add a sort:
Db.loans.find ({"userId": "59e022d33f239800129c61c7"}). Sort ({status:1}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false, "parsedQuery": {"userId": {"$eq": "59e022d33f239800129c61c7"}}, "queryHash": "F5ABB1AA", "planCacheKey": "764CBAA8" WinningPlan: {"stage": "FETCH", "inputStage": {"stage": "IXSCAN", "keyPattern": {"userId": 1, "status": 1}, "indexName": "userId_1_status_1", "isMultiKey": false, "multiKeyPaths": {"userId": [] "status": []}, "isUnique": false, "isSparse": false, "isPartial": false, "indexVersion": 2, "direction": "forward", "indexBounds": {"userId": ["[" 59e022d33f239800129c61c7 "," 59e022d33f239800129c61c7 "]"], "status": [[MinKey " MaxKey] "]}}," rejectedPlans ": [{" stage ":" SORT "," sortPattern ": {" status ": 1}," inputStage ": {" stage ":" SORT_KEY_GENERATOR "," inputStage ": {" stage ":" FETCH "," inputStage ": {" stage ":" IXSCAN " "keyPattern": {"userId": 1}, "indexName": "userId_1", "isMultiKey": false, "multiKeyPaths": {"userId": []}, "isUnique": false, "isSparse": false, "isPartial": false "indexVersion": 2, "direction": "forward", "indexBounds": {"userId": ["[" 59e022d33f239800129c61c7 "," 59e022d33f239800129c61c7 "]]}}]}}," serverInfo ": {" host ":" RMBAP "," port ": 27017 "version": "4.1.11", "gitVersion": "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"}, "ok": 1} (II) other attempts
Interesting part: status misses the index
Db.loans.find ({"status": "repayed", "userId": "59e022d33f239800129c61c7",}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false, "parsedQuery": {"$and": [{"status": {"$eq": "repayed"}} {"userId": {"$eq": "59e022d33f239800129c61c7"}]}, "queryHash": "15D5A9A1", "planCacheKey": "1B1A4861", "winningPlan": {"stage": "FETCH", "inputStage": {"stage": "IXSCAN", "keyPattern": {"userId": 1, "status": 1} "indexName": "userId_1_status_1", "isMultiKey": false, "multiKeyPaths": {"userId": [], "status": []}, "isUnique": false, "isSparse": false, "isPartial": false, "indexVersion": 2, "direction": "forward" "indexBounds": {"userId": ["[\" 59e022d33f239800129c61c7\ ",\" 59e022d33f239800129c61c7\ "]," status ": [" [\ "repayed\",\ "repayed\"]]}, "rejectedPlans": [{"stage": "FETCH" "filter": {"status": {"$eq": "repayed"}}, "inputStage": {"stage": "IXSCAN", "keyPattern": {"userId": 1}, "indexName": "userId_1", "isMultiKey": false "multiKeyPaths": {"userId": []}, "isUnique": false, "isSparse": false, "isPartial": false, "indexVersion": 2, "direction": "forward", "indexBounds": {"userId": ["59e022d33f239800129c61c7" "59e022d33f239800129c61c7"] "]}}]}," serverInfo ": {" host ":" RMBAP "," port ": 27017," version ":" 4.1.11 "," gitVersion ":" 1b8a9f5dc5c3314042b55e7415a2a25045b32a94 "}," ok ": 1}
Hitting the index has nothing to do with the order of the fields in query, as we guessed.
In the interesting part, we delete the index {userId:1}.
Db.loans.dropIndex ({"userId": 1}) {"nIndexesWas": 3, "ok": 1} db.loans.find ({"userId": "59e022d33f239800129c61c7",}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false, "parsedQuery": {"userId": {"$eq": "59e022d33f239800129c61c7"}, "queryHash": "B1777DBA" "planCacheKey": "5776AB9C", "winningPlan": {"stage": "FETCH", "inputStage": {"stage": "IXSCAN", "keyPattern": {"userId": 1, "status": 1}, "indexName": "userId_1_status_1", "isMultiKey": false, "multiKeyPaths": {"userId": [] "status": []}, "isUnique": false, "isSparse": false, "isPartial": false, "indexVersion": 2, "direction": "forward", "indexBounds": {"userId": ["[" 59e022d33f239800129c61c7 "," 59e022d33f239800129c61c7 "]"], "status": [[MinKey " MaxKey] "]}}," rejectedPlans ": []}," serverInfo ": {" host ":" RMBAP "," port ": 27017," version ":" 4.1.11 "," gitVersion ":" 1b8a9f5dc5c3314042b55e7415a2a25045b32a94 "}," ok ": 1}
The DB execution analyzer thinks that the index {userId:1, status:1} is better and misses the composite index because status is not leading field.
Db.loans.find ({"status": "repayed"}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false, "parsedQuery": {"status": {"$eq": "repayed"}}, "queryHash": "E6304EB6", "planCacheKey": "7A94191B", "winningPlan": {"stage": "COLLSCAN" "filter": {"status": {"$eq": "repayed"}, "direction": "forward"}, "rejectedPlans": []}, "serverInfo": {"host": "RMBAP", "port": 27017, "version": "4.1.11", "gitVersion": "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"}, "ok": 1}
Change the angle sort again, and swap with the previous query & sort, which used to be:
Db.loans.find ({userId:1}) .sort ({"status": "repayed"})
See what's the difference?
Db.loans.find ({"status": "repayed"}). Sort ({userId:1}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false, "parsedQuery": {"status": {"$eq": "repayed"}}, "queryHash": "56EA6313", "planCacheKey": "2CFCDA7F" "winningPlan": {"stage": "FETCH", "filter": {"status": {"$eq": "repayed"}, "inputStage": {"stage": "IXSCAN", "keyPattern": {"userId": 1, "status": 1}, "indexName": "userId_1_status_1", "isMultiKey": false "multiKeyPaths": {"userId": [], "status": []}, "isUnique": false, "isSparse": false, "isPartial": false, "indexVersion": 2, "direction": "forward", "indexBounds": {"userId": ["[MinKey, MaxKey]"] "status": ["[MinKey, MaxKey]"]}}, "rejectedPlans": []}, "serverInfo": {"host": "RMBAP", "port": 27017, "version": "4.1.11", "gitVersion": "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"}, "ok": 1}
Such as guessing, hit the index.
Let's play again and confirm the leading filed test:
Db.loans.dropIndex ("userId_1_status_1") {"nIndexesWas": 2, "ok": 1} db.loans.getIndexes () [{"v": 2, "key": {"id": 1}, "name": "id_", "ns": "cashLoan.loans"}] db.loans.createIndex ({status:1, userId:1}) {"createdCollectionAutomatically": false, "numIndexesBefore": 1, "numIndexesAfter": 2 "ok": 1} db.loans.getIndexes () [{"v": 2, "key": {"id": 1}, "name": "id_", "ns": "cashLoan.loans"}, {"v": 2, "key": {"status": 1, "userId": 1}, "name": "status_1_userId_1" "ns": "cashLoan.loans"}] db.loans.find ({"status": "repayed"}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false, "parsedQuery": {"status": {"$eq": "repayed"}}, "queryHash": "E6304EB6", "planCacheKey": "7A94191B" WinningPlan: {"stage": "FETCH", "inputStage": {"stage": "IXSCAN", "keyPattern": {"status": 1, "userId": 1}, "indexName": "status_1_userId_1", "isMultiKey": false, "multiKeyPaths": {"status": [] "userId": []}, "isUnique": false, "isSparse": false, "isPartial": false, "indexVersion": 2, "direction": "forward", "indexBounds": {"status": ["[" repayed "," repayed "]"], "userId": [[MinKey " MaxKey] "]}}," rejectedPlans ": []}," serverInfo ": {" host ":" RMBAP "," port ": 27017," version ":" 4.1.11 "," gitVersion ":" 1b8a9f5dc5c3314042b55e7415a2a25045b32a94 "}," ok ": 1} db.loans.getIndexes () [{" v ": 2," key ": {" id ": 1}," name ":" id_ " "ns": "cashLoan.loans"}, {"v": 2, "key": {"status": 1, "userId": 1}, "name": "status_1_userId_1", "ns": "cashLoan.loans"}] db.loans.find ({"userId": "59e022d33f239800129c61c7",}). Explain () {"queryPlanner": {"plannerVersion": 1, "namespace": "cashLoan.loans", "indexFilterSet": false "parsedQuery": {"userId": {"$eq": "59e022d33f239800129c61c7"}, "queryHash": "B1777DBA", "planCacheKey": "5776AB9C", "winningPlan": {"stage": "COLLSCAN", "filter": {"userId": {"$eq": "59e022d33f239800129c61c7"}}, "direction": "forward"}, "rejectedPlans": []} "serverInfo": {"host": "RMBAP", "port": 27017, "version": "4.1.11", "gitVersion": "1b8a9f5dc5c3314042b55e7415a2a25045b32a94"}, "ok": 1}
After watching this experiment, do you understand the difference between {userId:1, status:1} vs {status:1,userId:1}?
PS: actually, the differentiation of status in this case is not high. It is only shown here as an example.
III. Summary:
Pay attention to use, use frequency, distinguish between high / commonly used ones in front
If you need to reduce indexes to save memory/ to improve the performance of modified data, you can retain highly differentiated, commonly used, undifferentiated and uncommonly used indexes.
Learn to verify analysis performance with explain ():
DB generally has the analysis of actuator optimization, and MySQL & MongoDB use explain to do the analysis.
Syntactically MySQL:
Explain your_sql
MongoDB:
Yoursql.explain ()
Summarize the typical: the ideal query is to combine explain metrics, and they are usually a mixture of multiple:
IXSCAN: index hit
Limit: with limit
Projection: equivalent to non-select *
Docs Size less is better
Docs Examined less is better
NReturned=totalDocsExamined=totalKeysExamined
SORT in index: sort also hits the index. Otherwise, you need to sort the data again after getting the data.
Limit Array elements: limit the number of entries returned by the array, and the array should not have too much data, otherwise the schema design is unreasonable.
The above is all the contents of the article "sample Analysis of Indexes in MongoDB". Thank you for reading! Hope to share the content to help you, more related knowledge, welcome to follow the industry information channel!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.