Redis slave reports a "Connection with master lost" error

2025-02-24 Update From: SLTechnology News&Howtos

Background: a colleague renamed the monitor command when initializing the redis instance. Today this group of instances happened to need the monitor command to help analyze a problem, and we found that it could not be used. Because redis command renaming cannot be modified dynamically, I decided to remove the rename of the monitor command and restart the redis slave as soon as possible. After the restart, master-slave replication stayed in the down state, so we began to troubleshoot.

First, take a look at the log of the redis slave:

43037:S 20 Apr 06:12:38.047 MASTER SLAVE sync started

43037:S 20 Apr 06:12:38.047 Non blocking connect for SYNC fired the event.

43037:S 20 Apr 06:12:38.048 Master replied to PING, replication can continue...

43037:S 20 Apr 06:12:38.048 Partial resynchronization not possible (no cached master)

43037:S 20 Apr 06:12:39.112 Full resync from master: 96f2ae75d50e1f8b69737509d5b32b2da660e7c0:885061114038

43037:S 20 Apr 06:19:16.258 MASTER SLAVE sync: receiving 20819568576 bytes from master

43037:S 20 Apr 06:20:08.203 MASTER SLAVE sync: Flushing old data

43037:S 20 Apr 06:24:36.036 MASTER SLAVE sync: Loading DB in memory

43037:S 20 Apr 06:28:18.538 MASTER SLAVE sync: Finished with success

43037:S 20 Apr 06:28:19.782 Background append only file rewriting started by pid 173002

43037:S 20 Apr 06:28:19.982 # Connection with master lost.

43037:S 20 Apr 06:28:19.982 Caching the disconnected master state.

43037:S 20 Apr 06:28:20.984 Connecting to MASTER 10.93.157.52:6385

43037:S 20 Apr 06:28:20.985 MASTER SLAVE sync started

43037:S 20 Apr 06:28:20.985 Non blocking connect for SYNC fired the event.

43037:S 20 Apr 06:28:20.985 Master replied to PING, replication can continue...

43037:S 20 Apr 06:28:20.986 Trying a partial resynchronization (request 96f2ae75d50e1f8b69737509d5b32b2da660e7c0:885062375607)

43037:S 20 Apr 06:28:22.073 Full resync from master: 96f2ae75d50e1f8b69737509d5b32b2da660e7c0:885240485270

43037:S 20 Apr 06:28:22.073 Discarding previously cached master state.

43037:S 20 Apr 06:33:30.800 # Timeout receiving bulk data from MASTER... If the problem persists try to set the 'repl-timeout' parameter in redis.conf to a larger value.

43037:S 20 Apr 06:33:30.801 Connecting to MASTER 10.93.157.52:6385

43037:S 20 Apr 06:33:30.802 MASTER SLAVE sync started

43037:S 20 Apr 06:33:30.802 Non blocking connect for SYNC fired the event.

43037:S 20 Apr 06:33:30.802 Master replied to PING, replication can continue...

43037:S 20 Apr 06:33:30.803 Partial resynchronization not possible (no cached master)

43037:S 20 Apr 06:34:27.458 AOF rewrite child asks to stop sending diffs.

173002:C 20 Apr 06:34:27.470 Parent agreed to stop sending diffs. Finalizing AOF...

173002:C 20 Apr 06:34:27.470 Concatenating 1.19 MB of AOF diff received from parent.

173002:C 20 Apr 06:34:27.477 SYNC append only file rewrite performed

173002:C 20 Apr 06:34:28.757 AOF rewrite: 119 MB of memory used by copy-on-write

43037:S 20 Apr 06:34:29.961 Background AOF rewrite terminated with success

43037:S 20 Apr 06:34:29.961 [message text lost in source]

43037:S 20 Apr 06:34:29.961 Background AOF rewrite finished successfully

43037:S 20 Apr 06:35:08.471 Full resync from master: 96f2ae75d50e1f8b69737509d5b32b2da660e7c0:885393456775

43037:S 20 Apr 06:41:49.489 MASTER SLAVE sync: receiving 20821399746 bytes from master

43037:S 20 Apr 06:42:28.911 # I/O error trying to sync with MASTER: connection lost

43037:S 20 Apr 06:42:32.642 Connecting to MASTER 10.93.157.52:6385

43037:S 20 Apr 06:42:32.646 MASTER SLAVE sync started

43037:S 20 Apr 06:42:32.646 Non blocking connect for SYNC fired the event.

43037:S 20 Apr 06:42:32.647 Master replied to PING, replication can continue...

43037:S 20 Apr 06:42:32.647 Partial resynchronization not possible (no cached master)

43037:S 20 Apr 06:42:33.755 Full resync from master: 96f2ae75d50e1f8b69737509d5b32b2da660e7c0:885541422071

43037:S 20 Apr 06:49:15.956 MASTER SLAVE sync: receiving 20821403571 bytes from master

43037:S 20 Apr 06:50:16.781 MASTER SLAVE sync: Flushing old data

43037:S 20 Apr 06:54:23.078 MASTER SLAVE sync: Loading DB in memory

43037:S 20 Apr 06:58:10.123 MASTER SLAVE sync: Finished with success

43037:S 20 Apr 06:58:11.317 Background append only file rewriting started by pid 223387

43037:S 20 Apr 06:58:11.536 # Connection with master lost.

43037:S 20 Apr 06:58:11.536 [message text lost in source]
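The slave log above repeats the same cycle several times, which is easier to see once the events are counted. The following is a minimal sketch, assuming the log has been read into a list of strings. The event labels and the function name `summarize_slave_log` are made up for illustration; only the matched substrings come from the log itself.

```python
from collections import Counter

# Substrings copied from the slave log above, mapped to hypothetical labels.
EVENTS = {
    "Full resync from master": "full_resync_started",
    "MASTER SLAVE sync: Finished with success": "sync_finished",
    "Connection with master lost": "connection_lost",
    "Timeout receiving bulk data": "bulk_timeout",
    "I/O error trying to sync": "io_error",
}


def summarize_slave_log(lines):
    """Count how often each known replication event appears in the log lines."""
    counts = Counter()
    for line in lines:
        for needle, label in EVENTS.items():
            if needle in line:
                counts[label] += 1
    return counts


# A few lines taken from the slave log above.
sample_log = [
    "43037:S 20 Apr 06:28:18.538 MASTER SLAVE sync: Finished with success",
    "43037:S 20 Apr 06:28:19.982 # Connection with master lost.",
    "43037:S 20 Apr 06:28:22.073 Full resync from master: 96f2ae75d50e1f8b69737509d5b32b2da660e7c0:885240485270",
    "43037:S 20 Apr 06:33:30.800 # Timeout receiving bulk data from MASTER... If the problem persists try to set the 'repl-timeout' parameter in redis.conf to a larger value.",
    "43037:S 20 Apr 06:35:08.471 Full resync from master: 96f2ae75d50e1f8b69737509d5b32b2da660e7c0:885393456775",
    "43037:S 20 Apr 06:42:28.911 # I/O error trying to sync with MASTER: connection lost",
]

print(dict(summarize_slave_log(sample_log)))
```

If full resyncs keep starting and each one ends in a lost connection, timeout, or I/O error, the slave is stuck in a resync loop, and the cause usually has to be found on the master side.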

Next, take a look at the log of the redis master:

304369:M 20 Apr 05:13:00.197 Slave 10.93.157.16:6383 asks for synchronization

304369:M 20 Apr 05:13:00.197 Full resync requested by slave 10.93.157.16:6383

304369:M 20 Apr 05:13:00.197 Starting BGSAVE for SYNC with target: disk

304369:M 20 Apr 05:13:00.902 Background saving started by pid 366254

366254:C 20 Apr 05:19:14.460 DB saved on disk

366254:C 20 Apr 05:19:14.961 RDB: 4613 MB of memory used by copy-on-write

304369:M 20 Apr 05:19:15.579 Background saving terminated with success

304369:M 20 Apr 05:20:18.303 Synchronization with slave 10.93.157.16:6383 succeeded

304369:M 20 Apr 05:22:32.216 # Client id=1768461 addr=10.93.157.16:26912 fd=10 name= age=572 idle=0 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=11450 omem=278945640 events=rw cmd=psync scheduled to be closed ASAP for overcoming of output buffer limits.

304369:M 20 Apr 05:22:32.216 # Connection with slave 10.93.157.16:6383 lost.

304369:M 20 Apr 05:28:27.651 Slave 10.93.157.16:6383 asks for synchronization

304369:M 20 Apr 05:28:27.651 Unable to partial resync with slave 10.93.157.16:6383 for lack of backlog (Slave request was: 884116305579).

304369:M 20 Apr 05:28:27.651 Starting BGSAVE for SYNC with target: disk

304369:M 20 Apr 05:28:28.356 Background saving started by pid 396084

304369:M 20 Apr 05:32:37.471 # Client id=1768945 addr=10.93.157.16:21854 fd=22 name= age=250 idle=250 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=16366 oll=11126 omem=271085384 events=r cmd=psync scheduled to be closed ASAP for overcoming of output buffer limits.

304369:M 20 Apr 05:32:37.538 # Connection with slave 10.93.157.16:6383 lost.

From the log, we can see that the redis master receives the slave's request to resynchronize data (psync). A partial resynchronization is not possible, so the master starts a BGSAVE for a full resync. It then closes the slave connection because the slave client's output buffer exceeds the limit (omem=278945640 and omem=271085384 in the two "Client id=..." lines). The problem is that the client-output-buffer-limit value is set too small, which causes the data transfer to fail. After each failure the slave requests synchronization again, and it fails every time. The redis master therefore generates rdb files repeatedly. Although this is done in a forked child process, it still has some impact on the throughput of the redis master.
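To make the buffer overflow concrete, the sketch below pulls the key=value fields out of one of the master's "Client id=..." log lines and compares omem with a hard limit. It is only an illustration: the function names are hypothetical, and the 268435456-byte limit is the slave hard limit that config get reports further down in this article.

```python
def parse_client_line(line):
    """Extract key=value fields from a 'Client id=...' master log line."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # keep only tokens that actually contain '='
            fields[key] = value
    return fields


def bytes_over_limit(line, hard_limit_bytes):
    """Return how many bytes omem exceeds the hard limit by (negative if under)."""
    fields = parse_client_line(line)
    return int(fields["omem"]) - hard_limit_bytes


# Line copied from the master log above (rejoined onto one line).
client_line = (
    "304369:M 20 Apr 05:22:32.216 # Client id=1768461 addr=10.93.157.16:26912 "
    "fd=10 name= age=572 idle=0 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 "
    "qbuf-free=32768 obl=0 oll=11450 omem=278945640 events=rw cmd=psync "
    "scheduled to be closed ASAP for overcoming of output buffer limits."
)

SLAVE_HARD_LIMIT = 268435456  # 256 MB, as reported by config get in this article

print(bytes_over_limit(client_line, SLAVE_HARD_LIMIT))
```

A positive result means the buffered replication data for that slave went past the hard limit, which matches the master's decision to close the connection.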

Now let's solve this problem. First, take a look at the current value of client-output-buffer-limit:

127.0.0.1:6385> config get client-output-buffer-limit

1) "client-output-buffer-limit"

2) "normal 0 0 0 slave 268435456 67108864 60 pubsub 33554432 8388608 60"

As you can see, the current hard limit for slaves is 256 MB, and the soft limit is 64 MB sustained for 60 seconds. From the log above, we can see that the output buffer of the psync client (omem) is clearly larger than 256 MB. We use the following setting to enlarge the limit:
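The value returned by config get is a flat string of class, hard limit, soft limit, and soft seconds, repeated for each client class. The sketch below splits it into a dictionary so the byte values are easier to read. The function name and dictionary keys are assumptions chosen for this example.

```python
def parse_output_buffer_limits(value):
    """Split a client-output-buffer-limit string into per-class limits."""
    tokens = value.split()
    if len(tokens) % 4 != 0:
        raise ValueError("expected groups of: class hard soft seconds")
    limits = {}
    for i in range(0, len(tokens), 4):
        client_class, hard, soft, seconds = tokens[i:i + 4]
        limits[client_class] = {
            "hard_bytes": int(hard),
            "soft_bytes": int(soft),
            "soft_seconds": int(seconds),
        }
    return limits


MB = 1024 * 1024

# Value as returned by config get in this article.
current = "normal 0 0 0 slave 268435456 67108864 60 pubsub 33554432 8388608 60"
limits = parse_output_buffer_limits(current)

slave = limits["slave"]
print(slave["hard_bytes"] // MB, slave["soft_bytes"] // MB, slave["soft_seconds"])
```

A limit of 0 means no limit for that class, which is why the normal class shows three zeros.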

config set client-output-buffer-limit 'slave 1073741824 268435456 60'

After the adjustment, we kept observing and found that the replication status of the slave soon became up.
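One way to confirm the replication status is to look at the master_link_status field in the slave's INFO replication output. The sketch below parses that text. It assumes the output has already been captured as a string, for example from redis-cli, and the sample text is illustrative rather than taken from the real instance.

```python
def parse_info(text):
    """Parse 'key:value' lines from INFO output, skipping section headers."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key] = value
    return info


def replication_is_up(info_text):
    """Return True when a slave reports that its link to the master is up."""
    info = parse_info(info_text)
    return info.get("role") == "slave" and info.get("master_link_status") == "up"


# Illustrative INFO replication output; the values are assumed for this example.
sample_info = """# Replication
role:slave
master_host:10.93.157.52
master_port:6385
master_link_status:up
"""

print(replication_is_up(sample_info))
```

While the slave is stuck in the resync loop described above, the same field reports down instead of up.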

The adjustment above enlarges the replication output buffer. Another option is to turn off the limit on the replication output buffer entirely:

config set client-output-buffer-limit 'slave 0 0 0'
