2025-02-23 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
A strange MongoDB problem: sh.stopBalancer() hangs
Background
Part1: preface
In a MongoDB sharded cluster, we use the following commands to stop and start the balancer:
> sh.stopBalancer()     // stop the balancer
> sh.startBalancer()    // enable the balancer
Part2: background
With the balancer enabled, the customer reported that writes from the frontend application were slow and that queries timed out. We therefore tried to stop the balancer so that chunk migration would not affect cluster performance.
However, the call to sh.stopBalancer() did not stop the balancer. Instead, it hung and eventually failed:
mongos> sh.stopBalancer()
Waiting for active hosts...
Waiting for the balancer lock...
assert.soon failed, msg:Waited too long for lock balancer to unlock
doassert@src/mongo/shell/assert.js:18:14
assert.soon@src/mongo/shell/assert.js:202:13
sh.waitForDLock@src/mongo/shell/utils_sh.js:198:1
sh.waitForBalancerOff@src/mongo/shell/utils_sh.js:264:9
sh.waitForBalancer@src/mongo/shell/utils_sh.js:294:9
sh.stopBalancer@src/mongo/shell/utils_sh.js:161:5
@(shell):1:1
Balancer still may be active, you must manually verify this is not the case using the config.changelog collection.
2018-02-11T16:28:29.753+0800 E QUERY    [thread1] Error: Error: assert.soon failed, msg:Waited too long for lock balancer to unlock :
sh.waitForBalancerOff@src/mongo/shell/utils_sh.js:268:15
sh.waitForBalancer@src/mongo/shell/utils_sh.js:294:9
sh.stopBalancer@src/mongo/shell/utils_sh.js:161:5
@(shell):1:1
The error shows that the shell timed out while waiting for the balancer lock to be released, which suggests that the balancer is still running.
Note: as of version 3.4, the balancer runs on the primary of the config server replica set, whereas in earlier versions it ran on mongos. When the balancer process is active, the primary of the config server replica set acquires the balancer lock by modifying a document in the locks collection of the config database. Only the holder can release the balancer lock; no other process can release it.
Part3: troubleshooting method
When we run sh.status(), we see that the balancer is disabled but is still reported as running, which suggests that a migration is in progress:
balancer:
    Currently enabled: no
    Currently running: yes
We checked the migrations collection and found it empty, which means that no collection is being migrated:
mongos> db.migrations.find()
Next we checked the locks collection. A state of 2 means that the lock is being held:
mongos> db.locks.find()
{ "_id" : "balancer", "state" : 2, "ts" : ObjectId("5a324c42329457086086da07"), "who" : "ConfigServer:Balancer", "process" : "ConfigServer", "when" : ISODate("2018-01-31T08:33:43.346Z"), "why" : "CSRS Balancer" }
Note:
The why field in the locks collection explains why a lock is held. If a chunk were being migrated, there would be a lock with state 2 whose why field reads "Migrating chunk(s) in collection db.collectionname".
As of version 3.4, the state field of the balancer lock is always 2, which prevents older mongos instances from performing balancing operations. The when field records the time at which the config server member became primary.
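The rules in the note above can be sketched as a small classifier for documents read from the locks collection. This is only an illustration in plain JavaScript (runnable with Node.js): describeLock is a hypothetical helper, not part of the mongo shell, and its return strings are made up for this example.

```javascript
// Hypothetical helper: classify one document from the config.locks collection,
// following the rules described in the note above.
function describeLock(doc) {
  if (doc.state !== 2) {
    return "not held";
  }
  if (doc._id === "balancer") {
    // As of 3.4 the balancer lock always has state 2, so this alone
    // does not prove that a migration is running.
    return "balancer lock (always held as of 3.4)";
  }
  if (/^Migrating chunk\(s\) in collection /.test(doc.why || "")) {
    return "chunk migration in progress";
  }
  return "held: " + doc.why;
}

// The relevant fields of the balancer lock document shown in the output above.
var balancerLock = {
  _id: "balancer",
  state: 2,
  who: "ConfigServer:Balancer",
  process: "ConfigServer",
  why: "CSRS Balancer"
};

console.log(describeLock(balancerLock));
```

With the document from our cluster, the helper reports the balancer lock rather than a migration, which matches the empty migrations collection.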
Solution
Part1: preface
There are several common reasons why sh.stopBalancer() fails to stop the balancer:
A chunk migration is in progress. The balancer can only stop normally after the migration completes.
The clocks of the backend servers are out of sync.
The mongo client (shell) version is older than the server version. This is the case we hit: the mongo client was version 3.2, while the config servers and mongod were both version 3.4.
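The third cause can be checked by comparing the shell version with the server version. In the mongo shell, version() returns the shell's version and db.version() returns the server's. The sketch below is plain JavaScript (runnable with Node.js) and only compares the two strings; majorMinor and shellIsOlder are hypothetical helper names, and "3.2.0" is an assumed example value, since the exact 3.2 patch release is not known.

```javascript
// Extract [major, minor] from a version string such as "3.4.9-2.9".
// Only major.minor matter for this check.
function majorMinor(version) {
  var parts = String(version).split(".");
  return [parseInt(parts[0], 10), parseInt(parts[1], 10)];
}

// Return true when the shell's major.minor is lower than the server's.
function shellIsOlder(shellVersion, serverVersion) {
  var shell = majorMinor(shellVersion);
  var server = majorMinor(serverVersion);
  return shell[0] < server[0] ||
    (shell[0] === server[0] && shell[1] < server[1]);
}

// "3.2.0" is an assumed shell version; "3.4.9-2.9" is the server version
// reported by db.version() in this case.
console.log(shellIsOlder("3.2.0", "3.4.9-2.9")); // true
```

In a real session the two arguments would come from version() and db.version() respectively.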
Part2: the solution
Replace the old mongo client with the version 3.4 client. After that, the command returns immediately:
mongos> sh.stopBalancer()
{ "ok" : 1 }
config:PRIMARY> db.version()
3.4.9-2.9
Part3: cause analysis
The hang occurred because the mongo client was version 3.2 while the config servers were version 3.4. When the 3.2 client runs stopBalancer(), the code follows the legacy path that is used when the balancerStop command is not available: it waits for the balancer lock to be released. As of version 3.4, however, the balancer process has moved from mongos to the primary of the config server replica set, and the balancer lock always stays in state 2, so the wait can never succeed.
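The effect of the legacy wait can be illustrated with a simplified simulation in plain JavaScript (runnable with Node.js). This is not the shell's actual code, which polls through assert.soon and sh.waitForDLock as the stack trace shows; waitForLockRelease and the poll count of 5 are assumptions made for this sketch.

```javascript
// Simplified stand-in for the legacy wait: poll the lock state a fixed
// number of times and report whether it was ever seen released
// (any state other than 2).
function waitForLockRelease(getState, maxPolls) {
  for (var i = 0; i < maxPolls; i++) {
    if (getState() !== 2) {
      return true;
    }
  }
  return false;
}

// Against a 3.4 config server the balancer lock always reports state 2,
// so the wait can only end by giving up.
var released = waitForLockRelease(function () { return 2; }, 5);

console.log(released
  ? "lock released"
  : "Waited too long for lock balancer to unlock");
```

However many polls are allowed, a lock that never leaves state 2 makes the legacy path time out, which is the failure seen with the 3.2 client.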
Summary
Through this case, we saw how a mismatched mongo client version can cause problems, and we reviewed the common reasons why sh.stopBalancer() cannot stop the balancer. Because the author's knowledge is limited and the article was written in a short time, it may contain errors or inaccuracies, and corrections from readers are welcome. If you like the author's articles, click follow in the upper right corner. Thank you.