What is the impact on a MongoDB cluster of changing the Linux host time?

2025-07-15 Update From: SLTechnology News&Howtos

Shulou (Shulou.com), 05/31 report

This article looks at what happens to a MongoDB cluster when the time on a Linux host is changed. I hope you find it useful; let's work through it together.

The production environment is a 3-shard cluster in which each shard is a replica set with one primary, one secondary, and one arbiter. The clock on one of the nodes was found to be several days ahead of the actual time, so NTP was used to set the time back on that replica set member.

Before the adjustment the host clock read May 3; after the adjustment it reads April 26, the correct date.

[root@cwdtest1 bin]# date
Fri May 3 13:20:31 CST 2019
[root@cwdtest1 bin]# ntpdate -u 10.205.34.171
26 Apr 12:39:23 ntpdate[14568]: step time server 10.205.34.171 offset -607507.747595 sec
[root@cwdtest1 bin]# hwclock --systohc

Current time after adjustment:

[root@cwdtest1 bin]# date
Fri Apr 26 12:39:31 CST 2019
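The offset reported by ntpdate is in seconds, so converting it to days makes it easier to see how far the clock was stepped. The following is a minimal sketch in plain JavaScript; the helper name describeOffset is made up for illustration and is not part of ntpdate or MongoDB.

```javascript
// Convert the ntpdate step offset (in seconds) to days, hours, minutes, seconds.
// describeOffset is a hypothetical helper used only for this illustration.
function describeOffset(offsetSeconds) {
  const total = Math.floor(Math.abs(offsetSeconds));
  const days = Math.floor(total / 86400);
  const hours = Math.floor((total % 86400) / 3600);
  const minutes = Math.floor((total % 3600) / 60);
  const seconds = total % 60;
  // A negative offset means the clock was stepped backwards.
  const direction = offsetSeconds < 0 ? "back" : "forward";
  return { direction, days, hours, minutes, seconds };
}

// Offset copied from the ntpdate output above.
const step = describeOffset(-607507.747595);
console.log(step);
```

For the offset above this gives a backward step of 7 days, 0 hours, 45 minutes and 7 seconds, which is consistent with the move from May 3 to April 26.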

After the time adjustment, two problems appeared:

1. The secondary could not sync the new oplog entries, so it lagged behind the primary.

Shard2:PRIMARY> db.printSlaveReplicationInfo()
source: 10.3.252.231:27002
syncedTo: Fri May 03 2019 13:24:23 GMT+0800 (CST)
8 secs (0 hrs) behind the primary

2. The tLast time reported for the oplog is later than the current time.

Shard2:PRIMARY> db.getReplicationInfo()
{
    "logSizeMB" : 1383.3396482467651,
    "usedMB" : 154.49,
    "timeDiff" : 17015711,
    "timeDiffHours" : 4726.59,
    "tFirst" : "Thu Oct 18 2018 14:49:20 GMT+0800 (CST)",
    "tLast" : "Fri May 03 2019 13:24:31 GMT+0800 (CST)",
    "now" : "Fri Apr 26 2019 13:57:01 GMT+0800 (CST)"
}

Shard2:PRIMARY> db.printReplicationInfo()
configured oplog size:   1383.3396482467651MB
log length start to end: 17015711secs (4726.59hrs)
oplog first event time:  Thu Oct 18 2018 14:49:20 GMT+0800 (CST)
oplog last event time:   Fri May 03 2019 13:24:31 GMT+0800 (CST)
now:                     Fri Apr 26 2019 15:46:27 GMT+0800 (CST)
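To put a number on the second problem, the gap between tLast and now in the db.getReplicationInfo() output can be computed directly. This is a small sketch in plain JavaScript; the two timestamps are the ones shown in that output, rewritten by hand as ISO 8601 strings with a +08:00 offset (an assumption that CST here means China Standard Time, as the GMT+0800 in the output indicates).

```javascript
// Values copied from the db.getReplicationInfo() output, rewritten as ISO 8601
// strings with the +08:00 offset so that Date parsing is unambiguous.
const tLast = new Date("2019-05-03T13:24:31+08:00");
const now = new Date("2019-04-26T13:57:01+08:00");

// A positive result means the newest oplog entry is ahead of the wall clock.
const aheadSeconds = (tLast.getTime() - now.getTime()) / 1000;
const aheadDays = aheadSeconds / 86400;

console.log(aheadSeconds);          // 602850
console.log(aheadDays.toFixed(2));  // "6.98"
```

The last oplog entry is therefore almost seven days in the future relative to the corrected clock.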

Where do the tLast and now values come from? Printing the source of db.getReplicationInfo shows how they are obtained:

Shard2:PRIMARY> db.getReplicationInfo
function () {
    var localdb = this.getSiblingDB("local");
    var result = {};
    var oplog;
    var localCollections = localdb.getCollectionNames();
    if (localCollections.indexOf('oplog.rs') >= 0) {
        oplog = 'oplog.rs';
    } else if (localCollections.indexOf('oplog.$main') >= 0) {
        oplog = 'oplog.$main';
    } else {
        result.errmsg = "neither master/slave nor replica set replication detected";
        return result;
    }
    var ol = localdb.getCollection(oplog);
    var ol_stats = ol.stats();
    if (ol_stats && ol_stats.maxSize) {
        result.logSizeMB = ol_stats.maxSize / (1024 * 1024);
    } else {
        result.errmsg = "Could not get stats for local." + oplog + " collection. " + "collstats returned: " + tojson(ol_stats);
        return result;
    }
    result.usedMB = ol_stats.size / (1024 * 1024);
    result.usedMB = Math.ceil(result.usedMB * 1024) / 100;
    var firstc = ol.find().sort({$natural: 1}).limit(1);
    var lastc = ol.find().sort({$natural: -1}).limit(1);
    if (!firstc.hasNext() || !lastc.hasNext()) {
        result.errmsg = "objects not found in local.oplog.$main -- is this a new and empty db instance?";
        result.oplogMainRowCount = ol.count();
        return result;
    }
    var first = firstc.next();
    var last = lastc.next();
    var tfirst = first.ts;
    var tlast = last.ts;
    if (tfirst && tlast) {
        tfirst = DB.tsToSeconds(tfirst);
        tlast = DB.tsToSeconds(tlast);
        result.timeDiff = tlast - tfirst;
        result.timeDiffHours = Math.round(result.timeDiff / 36) / 100;
        result.tFirst = (new Date(tfirst * 1000)).toString();
        result.tLast = (new Date(tlast * 1000)).toString();
        result.now = Date();
    } else {
        result.errmsg = "ts element not found in oplog objects";
    }
    return result;
}

The relevant lines are:

var ol = localdb.getCollection(oplog);
var lastc = ol.find().sort({$natural: -1}).limit(1);
var last = lastc.next();
var tlast = last.ts;
result.tLast = (new Date(tlast * 1000)).toString();
result.now = Date();

tLast is the ts value of the last entry in the oplog.rs collection.

now is the current time, obtained by calling Date().
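The timeDiff and timeDiffHours fields in the earlier output can be reproduced from tFirst and tLast with the same arithmetic the function uses. The sketch below runs in plain JavaScript outside the mongo shell, so it replaces DB.tsToSeconds with Date.parse on hand-written ISO 8601 strings; that substitution is an assumption made for illustration.

```javascript
// tFirst and tLast from the output above, as ISO 8601 strings with +08:00.
// Date.parse returns milliseconds, so divide by 1000 to get seconds.
const tfirst = Date.parse("2018-10-18T14:49:20+08:00") / 1000;
const tlast = Date.parse("2019-05-03T13:24:31+08:00") / 1000;

const timeDiff = tlast - tfirst;
// Dividing by 36 and then by 100 is a division by 3600 rounded to two decimals.
const timeDiffHours = Math.round(timeDiff / 36) / 100;

console.log(timeDiff);       // 17015711
console.log(timeDiffHours);  // 4726.59
```

Both values match the shell output, which confirms that the reported oplog window is derived purely from the ts fields of the first and last entries and not from the wall clock.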

At this point I suspected that the replica set could not sync because the oplog already held entries with timestamps later than the current time. After the clock was set back, newly generated oplog entries would no longer carry the latest timestamp, so when the secondary compared timestamps it would see the latest entry as unchanged and find nothing to sync.

A brief outline of how MongoDB replication sync works (taken from online sources):

1. When a write statement is executed, the write completes on the primary.

2. The primary records an oplog entry containing a ts field whose value is the time the write was executed; call it t.

3. The secondary pulls the oplog from the primary and gets the entry for that write.

4. The secondary applies the corresponding write according to the entry it obtained.

5. After applying it, the secondary asks for new entries; its query condition against the primary's oplog is {ts: {$gt: t}}.

6. The primary receives this request, sees that the secondary is asking for entries with ts greater than t, and therefore knows that every entry up to t has been applied successfully.
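The pull condition in step 5 can be illustrated with a toy model. This is a sketch only: the oplog is a plain in-memory array, ts is a plain number, and pullNewEntries is a made-up helper, none of which reflect MongoDB's real timestamp type or replication code. The last case models the suspicion stated above, not confirmed server behavior; whether a real primary would ever stamp a new entry with an earlier ts has to be checked against an actual replica set.

```javascript
// Toy in-memory model of the pull loop described above. The oplog is a plain
// array and ts is a number; this is not MongoDB's real timestamp type.
function pullNewEntries(oplog, lastAppliedTs) {
  // Equivalent in spirit to the query condition {ts: {$gt: lastAppliedTs}}.
  return oplog.filter((entry) => entry.ts > lastAppliedTs);
}

const primaryOplog = [
  { ts: 100, op: "insert a" },
  { ts: 101, op: "insert b" },
];

// The secondary has applied everything up to ts 101, so nothing is pending.
let lastAppliedTs = 101;
const nothingNew = pullNewEntries(primaryOplog, lastAppliedTs);

// Normal case: the next write gets a larger ts and is picked up.
primaryOplog.push({ ts: 102, op: "insert c" });
const picked = pullNewEntries(primaryOplog, lastAppliedTs);

// Hypothesised case: a write stamped with an earlier ts (clock set back)
// would never satisfy ts > lastAppliedTs, so the secondary would not see it.
lastAppliedTs = 102;
primaryOplog.push({ ts: 50, op: "insert d" });
const missed = pullNewEntries(primaryOplog, lastAppliedTs);

console.log(nothingNew.length, picked.length, missed.length); // 0 1 0
```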

To verify this suspicion, I ran an insert test on the primary.

Shard2:PRIMARY> use shtest
switched to db shtest

Shard2:PRIMARY> db.coll.insert({x: 3339876})
WriteResult({ "nInserted" : 1 })

Query the last operation written on the primary:

rs.debug.getLastOpWritten()