2025-03-29 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
While tracing the checkpointer process with gdb, a deadlock occurred. This note records it.
Trace the checkpointer process and inspect the requests held in shared memory (CheckpointerShmem->requests):
(gdb) p CheckpointerShmem->requests[1663]
...
$16 = {rnode = {spcNode = 1663, dbNode = 16402, relNode = 26185}, forknum = MAIN_FORKNUM, segno = 0}
(gdb) p CheckpointerShmem->requests[200]
Cannot access memory at address 0xf9fb18
Then the session that requested the checkpoint reported an error:
testdb=# update t_wal_ckpt set c2 = 'C2#' || substr(c2, 4, 40);
UPDATE 8192
testdb=# checkpoint;
2019-01-07 12:30:32.114 CST [1418] PANIC:  stuck spinlock detected at RequestCheckpoint, checkpointer.c:1050
2019-01-07 12:30:32.114 CST [1418] STATEMENT:  checkpoint;
2019-01-07 12:30:37.081 CST [1390] PANIC:  stuck spinlock detected at FirstCallSinceLastCheckpoint, checkpointer.c:1376
2019-01-07 12:30:38.610 CST [1370] LOG:  background writer process (PID 1390) was terminated by signal 6: Aborted
2019-01-07 12:30:38.611 CST [1392] WARNING:  terminating connection because of crash of another server process
2019-01-07 12:30:38.611 CST [1392] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2019-01-07 12:30:38.611 CST [1392] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2019-01-07 12:30:38.613 CST [1558] WARNING:  terminating connection because of crash of another server process
2019-01-07 12:30:38.613 CST [1558] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2019-01-07 12:30:38.613 CST [1558] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
PANIC:  stuck spinlock detected at RequestCheckpoint, checkpointer.c:1050
server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
The connection to the server was lost. Attempting reset: 2019-01-07 12:... FATAL:  the database system is in recovery mode
Failed.
!>
Trying to reconnect shows that the database has crashed and is in recovery mode:
[xdb@localhost ~]$ psql -d testdb
2019-01-07 14:... CST [1629] FATAL:  the database system is in recovery mode
psql: FATAL:  the database system is in recovery mode
Perform recovery:
[xdb@localhost ~]$ pg_ctl start
pg_ctl: another server might be running; trying to start server anyway
waiting for server to start....2019-01-07 14:11:50.821 CST [1632] FATAL:  lock file "postmaster.pid" already exists
2019-01-07 14:11:50.821 CST [1632] HINT:  Is another postmaster (PID 1370) running in data directory "/data/xdb/pg111db"?
 stopped waiting
pg_ctl: could not start server
Examine the log output.
[xdb@localhost ~]$ find /data/xdb -name postmaster.pid
/data/xdb/pg111db/postmaster.pid
[xdb@localhost ~]$ rm -rf /data/xdb/pg111db/postmaster.pid
[xdb@localhost ~]$ pg_ctl start
waiting for server to start....2019-01-07 14:12:44.578 CST [1639] LOG:  could not bind IPv6 address "::1": Address already in use
[xdb@localhost ~]$ ps -ef | grep postgres
xdb       1370     1  0 12:01 pts/0    00:00:02 /appdb/atlasdb/pg11.1/bin/postgres
xdb       1389  1370  0 12:01 ?        00:00:00 [postgres]
xdb       1641  1332  0 14:12 pts/0    00:00:00 grep --color=auto postgres
[xdb@localhost ~]$ kill -9 1370
[xdb@localhost ~]$ pg_ctl start
waiting for server to start....2019-01-07 14:13:33.125 CST [1648] LOG:  listening on IPv6 address "::1", port 5432
2019-01-07 14:13:33.125 CST [1648] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2019-01-07 14:13:33.142 CST [1648] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2019-01-07 14:13:...361 CST [1649] LOG:  database system was interrupted; last known up at 2019-01-07 12:26:22 CST
2019-01-07 14:13:34.818 CST [1649] LOG:  database system was not properly shut down; automatic recovery in progress
2019-01-07 14:13:34.863 CST [1649] LOG:  redo starts at 1/48F9ED08
2019-01-07 14:13:35.467 CST [1649] LOG:  invalid record length at 1/4914FF58: wanted 24, got 0
2019-01-07 14:13:35.467 CST [1649] LOG:  redo done at 1/4914FF30
2019-01-07 14:13:35.467 CST [1649] LOG:  last completed transaction was at log time 2019-01-07 12:28:37.521542+08
2019-01-07 14:13:35.977 CST [1648] LOG:  database system is ready to accept connections
 done
server started
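Before deleting a leftover postmaster.pid, it can help to check whether the PID recorded on its first line still belongs to a running process. The following is only a minimal sketch of such a check, not part of the original procedure; the function names (read_postmaster_pid, pid_is_alive, lock_file_is_stale) are hypothetical helpers introduced for illustration, and the only assumption about PostgreSQL is that the first line of postmaster.pid holds the postmaster PID.

```python
import os


def read_postmaster_pid(pid_file):
    """Return the PID from the first line of the lock file, or None if missing or malformed."""
    try:
        with open(pid_file) as fh:
            first_line = fh.readline().strip()
    except FileNotFoundError:
        return None
    return int(first_line) if first_line.isdigit() else None


def pid_is_alive(pid):
    """Probe the PID with signal 0, which tests for existence without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def lock_file_is_stale(pid_file):
    """True only when the lock file exists and names a PID that is no longer running."""
    pid = read_postmaster_pid(pid_file)
    return pid is not None and not pid_is_alive(pid)
```

In the session above, the ps output shows PID 1370 still running after postmaster.pid had been removed, so a check like lock_file_is_stale("/data/xdb/pg111db/postmaster.pid") would have returned False at that point, indicating that the old postmaster needed to be stopped first.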
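Analysis shows the problem is caused by the spinlock CheckpointerShmem->ckpt_lck in the shared memory structure. Presumably the checkpointer was stopped in gdb while still holding it, so the other processes spun on the lock until stuck-spinlock detection raised a PANIC.
While tracing the checkpointer process, once the following call has been executed
SpinLockRelease(&CheckpointerShmem->ckpt_lck)
and the lock released, the problem above no longer occurs.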
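The behaviour can be illustrated with a toy model: a waiter that spins on a lock for a bounded number of attempts and gives up with an error if the holder never releases it. This is a simplified sketch for illustration only, not PostgreSQL's actual spinlock code; the class SpinLock, the exception StuckSpinlockError, and the max_spins and delay parameters are all hypothetical.

```python
import threading
import time


class StuckSpinlockError(RuntimeError):
    """Raised when a waiter gives up on a lock that is never released."""


class SpinLock:
    """Toy spinlock with a bounded wait (illustrative only)."""

    def __init__(self):
        # A non-reentrant lock stands in for the atomic test-and-set flag.
        self._flag = threading.Lock()

    def acquire(self, max_spins=1000, delay=0.001):
        # Retry until the flag is free, giving up after max_spins attempts.
        for _ in range(max_spins):
            if self._flag.acquire(blocking=False):
                return
            time.sleep(delay)
        raise StuckSpinlockError("stuck spinlock detected")

    def release(self):
        self._flag.release()


def demo():
    lock = SpinLock()
    lock.acquire()  # the "checkpointer" takes the lock, then is paused in the debugger
    try:
        lock.acquire(max_spins=50)  # a "backend" asking for a checkpoint spins on it
        outcome = "acquired"
    except StuckSpinlockError as exc:
        outcome = str(exc)
    lock.release()  # the paused holder finally runs its release call
    lock.acquire(max_spins=50)  # the same request now succeeds
    lock.release()
    return outcome
```

Calling demo() returns the error text from the first, blocked attempt; the second attempt succeeds only because the holder released the lock in between, which mirrors the observation that the problem disappears once the release call has run.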