
How to Fail Over a MongoDB Replica Set

2025-04-07 Update, from SLTechnology News & Howtos (Shulou.com), 05/31 report

Many newcomers are unclear about how failover works in a MongoDB replica set. The walkthrough below demonstrates it step by step on a three-member set (one primary, one secondary, one arbiter), including how to use member priorities to switch the primary role back automatically after recovery.

By default, both the primary and the secondary have a priority of 1, while the arbiter has a priority of 0 because it can never be elected primary.
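The eligibility rule above can be sketched in a few lines of Python. This is an illustration of the rule, not MongoDB internals; the helper name is invented for this example.

```python
# Illustrative sketch (not MongoDB source code): members with priority 0,
# including arbiters, can never be elected primary.

def electable_members(members):
    """Return the members that are eligible to become primary."""
    return [m for m in members if m["priority"] > 0 and not m["arbiterOnly"]]

# The three members of the cjcmonset replica set used in this walkthrough.
members = [
    {"host": "192.168.2.222:27017", "priority": 1, "arbiterOnly": False},
    {"host": "192.168.2.187:27017", "priority": 1, "arbiterOnly": False},
    {"host": "192.168.2.188:27017", "priority": 0, "arbiterOnly": True},
]

print([m["host"] for m in electable_members(members)])
# the arbiter (192.168.2.188) is excluded
```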

View cluster configuration

cjcmonset:PRIMARY> rs.conf()
{
    "_id" : "cjcmonset",
    "version" : 1,
    "protocolVersion" : NumberLong(1),
    "writeConcernMajorityJournalDefault" : true,
    "members" : [
        {
            "_id" : 0,
            "host" : "192.168.2.222:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {},
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 1,
            "host" : "192.168.2.187:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {},
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 2,
            "host" : "192.168.2.188:27017",
            "arbiterOnly" : true,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 0,
            "tags" : {},
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        }
    ],
    "settings" : {
        "chainingAllowed" : true,
        "heartbeatIntervalMillis" : 2000,
        "heartbeatTimeoutSecs" : 10,
        "electionTimeoutMillis" : 10000,
        "catchUpTimeoutMillis" : -1,
        "catchUpTakeoverDelayMillis" : 30000,
        "getLastErrorModes" : {},
        "getLastErrorDefaults" : {
            "w" : 1,
            "wtimeout" : 0
        },
        "replicaSetId" : ObjectId("5e77148837ae69b4ab9b4870")
    }
}

I raised the priority of the current primary, 2.222, to 5 so that after the primary fails and then recovers, it automatically takes the primary role back.

cjcmonset:PRIMARY> var rscfg = rs.conf()
cjcmonset:PRIMARY> rscfg.members[0].priority = 5
5
cjcmonset:PRIMARY> rs.reconfig(rscfg)
{
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1584881617, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1584881617, 1)
}
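What rs.reconfig() does with the edited document can be mimicked in plain Python for illustration. The helper name is invented, and the automatic "version" bump is an assumption based on the output in this walkthrough, where the config version goes from 1 to 2 after the reconfig.

```python
import copy

# Hypothetical sketch of a reconfig: change one member's priority and bump
# the config version, as reflected in the rs.conf() output before and after.
def reconfig(current, member_index, new_priority):
    cfg = copy.deepcopy(current)  # leave the original config untouched
    cfg["members"][member_index]["priority"] = new_priority
    cfg["version"] += 1
    return cfg

cfg_v1 = {"_id": "cjcmonset", "version": 1,
          "members": [{"_id": 0, "priority": 1}, {"_id": 1, "priority": 1}]}
cfg_v2 = reconfig(cfg_v1, 0, 5)
print(cfg_v2["version"], cfg_v2["members"][0]["priority"])  # 2 5
```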

Check the configuration again: the primary's priority is now 5.

cjcmonset:PRIMARY> rs.conf()
{
    "_id" : "cjcmonset",
    "version" : 2,
    "protocolVersion" : NumberLong(1),
    "writeConcernMajorityJournalDefault" : true,
    "members" : [
        {
            "_id" : 0,
            "host" : "192.168.2.222:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 5,
            "tags" : {},
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 1,
            "host" : "192.168.2.187:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {},
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        },
        {
            "_id" : 2,
            "host" : "192.168.2.188:27017",
            "arbiterOnly" : true,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 0,
            "tags" : {},
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        }
    ],
    "settings" : {
        "chainingAllowed" : true,
        "heartbeatIntervalMillis" : 2000,
        "heartbeatTimeoutSecs" : 10,
        "electionTimeoutMillis" : 10000,
        "catchUpTimeoutMillis" : -1,
        "catchUpTakeoverDelayMillis" : 30000,
        "getLastErrorModes" : {},
        "getLastErrorDefaults" : {
            "w" : 1,
            "wtimeout" : 0
        },
        "replicaSetId" : ObjectId("5e77148837ae69b4ab9b4870")
    }
}

Manually shut down mongod on the primary node (2.222) to test failover:

cjcmonset:PRIMARY> use admin
switched to db admin
cjcmonset:PRIMARY> db.shutdownServer()
2020-03-22T20:59:39.419+0800 I NETWORK [js] DBClientConnection failed to receive message from 127.0.0.1:27017 - HostUnreachable: Connection closed by peer
server should be down...
2020-03-22T20:59:39.422+0800 I NETWORK [js] trying reconnect to 127.0.0.1:27017 failed
2020-03-22T20:59:39.423+0800 I NETWORK [js] reconnect 127.0.0.1:27017 failed failed

Check the cluster status from the 2.187 node: the original primary 2.222 now reports Connection refused, and the original secondary 2.187 has automatically been promoted to primary.
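The quorum arithmetic behind this failover is worth spelling out: with three voting members (two data nodes plus the arbiter), two votes still form a majority after the primary dies, so the surviving secondary can be elected. A minimal sketch, with an invented helper name:

```python
# Why the arbiter matters: smallest vote count that constitutes a majority.
def majority(voting_members: int) -> int:
    return voting_members // 2 + 1

print(majority(3))  # 2 -- matches "majorityVoteCount" : 2 in rs.status() below
print(majority(2))  # 2 -- a 2-member set needs both votes, so it cannot fail over
```

This is why a two-member set gains nothing over a standalone for availability, and why adding a cheap arbiter as a third voter makes automatic failover possible.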

View cluster status

cjcmonset:PRIMARY> rs.status()
{
    "set" : "cjcmonset",
    "date" : ISODate("2020-03-22T13:00:33.838Z"),
    "myState" : 1,
    "term" : NumberLong(2),
    "syncingTo" : "",
    "syncSourceHost" : "",
    "syncSourceId" : -1,
    "heartbeatIntervalMillis" : NumberLong(2000),
    "majorityVoteCount" : 2,
    "writeMajorityCount" : 2,
    "optimes" : {
        "lastCommittedOpTime" : {
            "ts" : Timestamp(1584881969, 1),
            "t" : NumberLong(1)
        },
        "lastCommittedWallTime" : ISODate("2020-03-22T12:59:29.481Z"),
        "readConcernMajorityOpTime" : {
            "ts" : Timestamp(1584881969, 1),
            "t" : NumberLong(1)
        },
        "readConcernMajorityWallTime" : ISODate("2020-03-22T12:59:29.481Z"),
        "appliedOpTime" : {
            "ts" : Timestamp(1584882028, 1),
            "t" : NumberLong(2)
        },
        "durableOpTime" : {
            "ts" : Timestamp(1584882028, 1),
            "t" : NumberLong(2)
        },
        "lastAppliedWallTime" : ISODate("2020-03-22T13:00:28.344Z"),
        "lastDurableWallTime" : ISODate("2020-03-22T13:00:28.344Z")
    },
    "lastStableRecoveryTimestamp" : Timestamp(1584881969, 1),
    "lastStableCheckpointTimestamp" : Timestamp(1584881969, 1),
    "electionCandidateMetrics" : {
        "lastElectionReason" : "stepUpRequestSkipDryRun",
        "lastElectionDate" : ISODate("2020-03-22T12:59:37.752Z"),
        "electionTerm" : NumberLong(2),
        "lastCommittedOpTimeAtElection" : {
            "ts" : Timestamp(1584881969, 1),
            "t" : NumberLong(1)
        },
        "lastSeenOpTimeAtElection" : {
            "ts" : Timestamp(1584881969, 1),
            "t" : NumberLong(1)
        },
        "numVotesNeeded" : 2,
        "priorityAtElection" : 1,
        "electionTimeoutMillis" : NumberLong(10000),
        "priorPrimaryMemberId" : 0,
        "numCatchUpOps" : NumberLong(0),
        "newTermStartDate" : ISODate("2020-03-22T12:59:38.313Z")
    },
    "electionParticipantMetrics" : {
        "votedForCandidate" : true,
        "electionTerm" : NumberLong(1),
        "lastVoteDate" : ISODate("2020-03-22T07:32:34.460Z"),
        "electionCandidateMemberId" : 0,
        "voteReason" : "",
        "lastAppliedOpTimeAtElection" : {
            "ts" : Timestamp(1584862345, 1),
            "t" : NumberLong(-1)
        },
        "maxAppliedOpTimeInSet" : {
            "ts" : Timestamp(1584862345, 1),
            "t" : NumberLong(-1)
        },
        "priorityAtElection" : 1
    },
    "members" : [
        {
            "_id" : 0,
            "name" : "192.168.2.222:27017",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : {
                "ts" : Timestamp(0, 0),
                "t" : NumberLong(-1)
            },
            "optimeDurable" : {
                "ts" : Timestamp(0, 0),
                "t" : NumberLong(-1)
            },
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2020-03-22T13:00:31.874Z"),
            "lastHeartbeatRecv" : ISODate("2020-03-22T12:59:36.547Z"),
            "pingMs" : NumberLong(0),
            "lastHeartbeatMessage" : "Error connecting to 192.168.2.222:27017 :: caused by :: Connection refused",
            "syncingTo" : "",
            "syncSourceHost" : "",
            "syncSourceId" : -1,
            "infoMessage" : "",
            "configVersion" : -1
        },
        {
            "_id" : 1,
            "name" : "192.168.2.187:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 19745,
            "optime" : {
                "ts" : Timestamp(1584882028, 1),
                "t" : NumberLong(2)
            },
            "optimeDate" : ISODate("2020-03-22T13:00:28Z"),
            "syncingTo" : "",
            "syncSourceHost" : "",
            "syncSourceId" : -1,
            "infoMessage" : "",
            "electionTime" : Timestamp(1584881977, 1),
            "electionDate" : ISODate("2020-03-22T12:59:37Z"),
            "configVersion" : 2,
            "self" : true,
            "lastHeartbeatMessage" : ""
        },
        {
            "_id" : 2,
            "name" : "192.168.2.188:27017",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 19689,
            "lastHeartbeat" : ISODate("2020-03-22T13:00:31.872Z"),
            "lastHeartbeatRecv" : ISODate("2020-03-22T13:00:32.657Z"),
            "pingMs" : NumberLong(0),
            "lastHeartbeatMessage" : "",
            "syncingTo" : "",
            "syncSourceHost" : "",
            "syncSourceId" : -1,
            "infoMessage" : "",
            "configVersion" : 2
        }
    ],
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1584882028, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1584882028, 1)
}

Manually start MongoDB on the 2.222 node:

[root@cjcos conf]# mongod --config /usr/local/mongodb/conf/mongodb.conf

cjcmonset:SECONDARY> rs.status()
{
    "set" : "cjcmonset",
    "date" : ISODate("2020-03-22T13:02:32.499Z"),
    "myState" : 2,
    "term" : NumberLong(2),
    "syncingTo" : "192.168.2.187:27017",
    "syncSourceHost" : "192.168.2.187:27017",
    "syncSourceId" : 1,
    "heartbeatIntervalMillis" : NumberLong(2000),
    "majorityVoteCount" : 2,
    "writeMajorityCount" : 2,
    "optimes" : {
        "lastCommittedOpTime" : {
            "ts" : Timestamp(1584882148, 1),
            "t" : NumberLong(2)
        },
        "lastCommittedWallTime" : ISODate("2020-03-22T13:02:28.367Z"),
        "readConcernMajorityOpTime" : {
            "ts" : Timestamp(1584882148, 1),
            "t" : NumberLong(2)
        },
        "readConcernMajorityWallTime" : ISODate("2020-03-22T13:02:28.367Z"),
        "appliedOpTime" : {
            "ts" : Timestamp(1584882148, 1),
            "t" : NumberLong(2)
        },
        "durableOpTime" : {
            "ts" : Timestamp(1584882148, 1),
            "t" : NumberLong(2)
        },
        "lastAppliedWallTime" : ISODate("2020-03-22T13:02:28.367Z"),
        "lastDurableWallTime" : ISODate("2020-03-22T13:02:28.367Z")
    },
    "lastStableRecoveryTimestamp" : Timestamp(1584881969, 1),
    "lastStableCheckpointTimestamp" : Timestamp(1584881969, 1),
    "members" : [
        {
            "_id" : 0,
            "name" : "192.168.2.222:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 13,
            "optime" : {
                "ts" : Timestamp(1584882148, 1),
                "t" : NumberLong(2)
            },
            "optimeDate" : ISODate("2020-03-22T13:02:28Z"),
            "syncingTo" : "192.168.2.187:27017",
            "syncSourceHost" : "192.168.2.187:27017",
            "syncSourceId" : 1,
            "infoMessage" : "",
            "configVersion" : 2,
            "self" : true,
            "lastHeartbeatMessage" : ""
        },
        {
            "_id" : 1,
            "name" : "192.168.2.187:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 10,
            "optime" : {
                "ts" : Timestamp(1584882148, 1),
                "t" : NumberLong(2)
            },
            "optimeDurable" : {
                "ts" : Timestamp(1584882148, 1),
                "t" : NumberLong(2)
            },
            "optimeDate" : ISODate("2020-03-22T13:02:28Z"),
            "optimeDurableDate" : ISODate("2020-03-22T13:02:28Z"),
            "lastHeartbeat" : ISODate("2020-03-22T13:02:31.498Z"),
            "lastHeartbeatRecv" : ISODate("2020-03-22T13:02:31.261Z"),
            "pingMs" : NumberLong(0),
            "lastHeartbeatMessage" : "",
            "syncingTo" : "",
            "syncSourceHost" : "",
            "syncSourceId" : -1,
            "infoMessage" : "",
            "electionTime" : Timestamp(1584881977, 1),
            "electionDate" : ISODate("2020-03-22T12:59:37Z"),
            "configVersion" : 2
        },
        {
            "_id" : 2,
            "name" : "192.168.2.188:27017",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 10,
            "lastHeartbeat" : ISODate("2020-03-22T13:02:31.496Z"),
            "lastHeartbeatRecv" : ISODate("2020-03-22T13:02:32.014Z"),
            "pingMs" : NumberLong(0),
            "lastHeartbeatMessage" : "",
            "syncingTo" : "",
            "syncSourceHost" : "",
            "syncSourceId" : -1,
            "infoMessage" : "",
            "configVersion" : 2
        }
    ],
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1584882148, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1584882148, 1)
}

Because of the raised priority, shortly after mongod starts on 2.222, node 2.222 is re-elected primary and node 2.187 steps down to secondary.
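This behavior is the "priorityTakeover" election reason that appears later in the rs.status() output: a secondary whose priority exceeds the current primary's calls an election once its oplog has caught up. A hedged Python sketch of that decision (field and function names are illustrative, not MongoDB internals):

```python
# Illustrative sketch of the priority-takeover condition: a caught-up
# secondary with strictly higher priority than the current primary
# calls an election.
def should_priority_takeover(member, primary, caught_up: bool) -> bool:
    return caught_up and member["priority"] > primary["priority"]

node_222 = {"host": "192.168.2.222:27017", "priority": 5}
node_187 = {"host": "192.168.2.187:27017", "priority": 1}

print(should_priority_takeover(node_222, node_187, caught_up=True))   # True
print(should_priority_takeover(node_222, node_187, caught_up=False))  # False
```

In the real server the takeover is additionally delayed by catchUpTakeoverDelayMillis (30000 ms in the configuration shown above), which is why the switchback happens "soon" rather than instantly.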

cjcmonset:SECONDARY>
cjcmonset:PRIMARY>

The mongod log on 2.222 shows the takeover:

2020-03-22T21:02:33.946+0800 I CONNPOOL [Replication] Connecting to 192.168.2.187:27017
2020-03-22T21:02:33.949+0800 I REPL [replexec-0] Member 192.168.2.187:27017 is now in state SECONDARY
2020-03-22T21:02:33.949+0800 I REPL [replexec-0] Caught up to the latest optime known via heartbeats after becoming primary. Target optime: { ts: Timestamp(1584882148, 1), t: 2 }. My Last Applied: { ts: Timestamp(1584882148, 1), t: 2 }
2020-03-22T21:02:33.949+0800 I REPL [replexec-0] Exited primary catch-up mode.
2020-03-22T21:02:33.949+0800 I REPL [replexec-0] Stopping replication producer
2020-03-22T21:02:33.949+0800 I REPL [rsBackgroundSync] Replication producer stopped after oplog fetcher finished returning a batch from our sync source. Abandoning this batch of oplog entries and re-evaluating our sync source.
2020-03-22T21:02:34.592+0800 I REPL [ReplBatcher] Oplog buffer has been drained in term 3
2020-03-22T21:02:34.592+0800 I REPL [RstlKillOpThread] Starting to kill user operations
2020-03-22T21:02:34.592+0800 I REPL [RstlKillOpThread] Stopped killing user operations
2020-03-22T21:02:34.592+0800 I REPL [RstlKillOpThread] State transition ops metrics: { lastStateTransition: "stepUp", userOpsKilled: 0, userOpsRunning: 0 }
2020-03-22T21:02:34.593+0800 I REPL [rsSync-0] transition to primary complete; database writes are now permitted
2020-03-22T21:02:34.712+0800 I REPL [SyncSourceFeedback] SyncSourceFeedback error sending update to 192.168.2.187:27017: InvalidSyncSource: Sync source was cleared. Was 192.168.2.187:27017
2020-03-22T21:02:35.459+0800 I NETWORK [listener] connection accepted from 192.168.2.187:41810 #13 (6 connections now open)
2020-03-22T21:02:35.460+0800 I NETWORK [conn13] received client metadata from 192.168.2.187:41810 conn13: { driver: { name: "NetworkInterfaceTL", version: "4.2.3" }, os: { type: "Linux", name: "CentOS Linux release 7.5.1804 (Core)", architecture: "x86_64", version: "Kernel 3.10.0-862.el7.x86_64" } }
2020-03-22T21:02:39.711+0800 I CONNPOOL [RS] Ending connection to 192.168.2.187:27017 due to bad connection status: CallbackCanceled: Callback was canceled; 1 connections to that host remain open
2020-03-22T21:02:43.944+0800 I CONNPOOL [Replication] Ending connection to 192.168.2.187:27017 due to bad connection status: CallbackCanceled: Callback was canceled; 1 connections to that host remain open

Check the cluster status: 2.222 is the primary again.

cjcmonset:PRIMARY> rs.status()
{
    "set" : "cjcmonset",
    "date" : ISODate("2020-03-22T13:04:24.678Z"),
    "myState" : 1,
    "term" : NumberLong(3),
    "syncingTo" : "",
    "syncSourceHost" : "",
    "syncSourceId" : -1,
    "heartbeatIntervalMillis" : NumberLong(2000),
    "majorityVoteCount" : 2,
    "writeMajorityCount" : 2,
    "optimes" : {
        "lastCommittedOpTime" : {
            "ts" : Timestamp(1584882264, 1),
            "t" : NumberLong(3)
        },
        "lastCommittedWallTime" : ISODate("2020-03-22T13:04:24.632Z"),
        "readConcernMajorityOpTime" : {
            "ts" : Timestamp(1584882264, 1),
            "t" : NumberLong(3)
        },
        "readConcernMajorityWallTime" : ISODate("2020-03-22T13:04:24.632Z"),
        "appliedOpTime" : {
            "ts" : Timestamp(1584882264, 1),
            "t" : NumberLong(3)
        },
        "durableOpTime" : {
            "ts" : Timestamp(1584882264, 1),
            "t" : NumberLong(3)
        },
        "lastAppliedWallTime" : ISODate("2020-03-22T13:04:24.632Z"),
        "lastDurableWallTime" : ISODate("2020-03-22T13:04:24.632Z")
    },
    "lastStableRecoveryTimestamp" : Timestamp(1584882254, 1),
    "lastStableCheckpointTimestamp" : Timestamp(1584882254, 1),
    "electionCandidateMetrics" : {
        "lastElectionReason" : "priorityTakeover",
        "lastElectionDate" : ISODate("2020-03-22T13:02:33.880Z"),
        "electionTerm" : NumberLong(3),
        "lastCommittedOpTimeAtElection" : {
            "ts" : Timestamp(1584882148, 1),
            "t" : NumberLong(2)
        },
        "lastSeenOpTimeAtElection" : {
            "ts" : Timestamp(1584882148, 1),
            "t" : NumberLong(2)
        },
        "numVotesNeeded" : 2,
        "priorityAtElection" : 5,
        "electionTimeoutMillis" : NumberLong(10000),
        "priorPrimaryMemberId" : 1,
        "numCatchUpOps" : NumberLong(0),
        "newTermStartDate" : ISODate("2020-03-22T13:02:34.593Z"),
        "wMajorityWriteAvailabilityDate" : ISODate("2020-03-22T13:02:35.462Z")
    },
    "members" : [
        {
            "_id" : 0,
            "name" : "192.168.2.222:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 125,
            "optime" : {
                "ts" : Timestamp(1584882264, 1),
                "t" : NumberLong(3)
            },
            "optimeDate" : ISODate("2020-03-22T13:04:24Z"),
            "syncingTo" : "",
            "syncSourceHost" : "",
            "syncSourceId" : -1,
            "infoMessage" : "",
            "electionTime" : Timestamp(1584882153, 1),
            "electionDate" : ISODate("2020-03-22T13:02:33Z"),
            "configVersion" : 2,
            "self" : true,
            "lastHeartbeatMessage" : ""
        },
        {
            "_id" : 1,
            "name" : "192.168.2.187:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 122,
            "optime" : {
                "ts" : Timestamp(1584882254, 1),
                "t" : NumberLong(3)
            },
            "optimeDurable" : {
                "ts" : Timestamp(1584882254, 1),
                "t" : NumberLong(3)
            },
            "optimeDate" : ISODate("2020-03-22T13:04:14Z"),
            "optimeDurableDate" : ISODate("2020-03-22T13:04:14Z"),
            "lastHeartbeat" : ISODate("2020-03-22T13:04:24.023Z"),
            "lastHeartbeatRecv" : ISODate("2020-03-22T13:04:23.967Z"),
            "pingMs" : NumberLong(0),
            "lastHeartbeatMessage" : "",
            "syncingTo" : "192.168.2.222:27017",
            "syncSourceHost" : "192.168.2.222:27017",
            "syncSourceId" : 0,
            "infoMessage" : "",
            "configVersion" : 2
        },
        {
            "_id" : 2,
            "name" : "192.168.2.188:27017",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 122,
            "lastHeartbeat" : ISODate("2020-03-22T13:04:24.019Z"),
            "lastHeartbeatRecv" : ISODate("2020-03-22T13:04:24.112Z"),
            "pingMs" : NumberLong(0),
            "lastHeartbeatMessage" : "",
            "syncingTo" : "",
            "syncSourceHost" : "",
            "syncSourceId" : -1,
            "infoMessage" : "",
            "configVersion" : 2
        }
    ],
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1584882264, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1584882264, 1)
}

On the 2.187 node, the shell prompt confirms it has automatically become a secondary:

cjcmonset:PRIMARY>
cjcmonset:SECONDARY>
