This article looks at the lmd process in Oracle RAC: what it does (the Global Enqueue Service), which process monitors it, and what happens when it is killed or stops responding.
Conclusion
1. The test environment is Oracle 10.2.0.1 RAC.
2. If a core RAC background process is terminated abnormally, the instance is restarted, and a SYSTEMSTATE DUMP file is generated before the instance goes down.
3. The lmd process is monitored by the lmon process; if lmd dies abnormally, it is brought back automatically (in this test, by way of a full instance restart).
4. The LMD process is responsible for the Global Enqueue Service (GES); put plainly, it manages resource (enqueue) requests across the RAC instances. This shows how important LMD is: if LMD stops responding, DML operations on the database hang, which in turn delays IPC communication between the RAC nodes.
5. Those IPC communication delays produce corresponding LMD trace files.
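Before going into the test, one quick way to see whether GES is under pressure on a running RAC is to look at sessions currently waiting on global enqueue or global cache events. The query below is only an illustrative sketch (it is not part of the test that follows) and assumes SYSDBA access on either node:
SQL> select inst_id, sid, event, seconds_in_wait
       from gv$session_wait
      where event like 'ges%' or event like 'gc%'
      order by seconds_in_wait desc;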
Test
-- What the lmd process is
The lmd process is the background process responsible for the Global Enqueue Service (GES).
On each RAC instance it services resource (enqueue) requests coming from the remote RAC nodes. It is also a DAEMON process, i.e. it runs under the protection of a monitoring process, and if it disappears that monitoring process arranges for it to be restarted.
It follows that if the lmd process is terminated abnormally, the RAC instance is forcibly shut down, and a systemstate dump is generated for analysis before the instance goes down.
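Before killing anything, the LMD0 and LMON background processes and their OS pids can also be identified from inside the database. This is a minimal sketch for illustration (not taken from the original test), joining the background-process catalog to v$process:
SQL> select b.name, b.description, p.spid
       from v$bgprocess b, v$process p
      where b.paddr = p.addr
        and b.name in ('LMON', 'LMD0');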
[oracle@jingfa1 ~]$ ps -ef | grep lmd
oracle    4774      1  0 Nov09 ?        00:00:31 asm_lmd0_+ASM1
oracle   11220      1  0 02:13 ?        00:00:15 ora_lmd0_jingfa1
oracle   30706  30376  0 05:19 pts/3    00:00:00 grep lmd
[oracle@jingfa1 ~]$ kill -9 11220
-- node1 alert log
Tue Nov 10 05:20:03 2015
Errors in file /u01/app/oracle/admin/jingfa/bdump/jingfa1_pmon_11212.trc:
ORA-00482: LMD* process terminated with error
Tue Nov 10 05:20:03 2015
PMON: terminating instance due to error 482
Tue Nov 10 05:20:03 2015
Errors in file /u01/app/oracle/admin/jingfa/bdump/jingfa1_lms0_11222.trc:
ORA-00482: LMD* process terminated with error
Tue Nov 10 05:20:03 2015
System state dump is made for local instance
System State dumped to trace file /u01/app/oracle/admin/jingfa/bdump/jingfa1_diag_11214.trc
Tue Nov 10 05:20:03 2015
Trace dumping is performing id=[cdmp_20151110052003]
Tue Nov 10 05:20:08 2015
Instance terminated by PMON, pid = 11212
-- immediately afterwards, the instance restarts automatically
Tue Nov 10 05:21:05 2015
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
It can be seen that the lmd process will restart automatically.
[oracle@jingfa1 ~]$ ps -ef | grep lmd
oracle    3474  30376  0 05:23 pts/3    00:00:00 grep lmd
oracle    4774      1  0 Nov09 ?        00:00:31 asm_lmd0_+ASM1
oracle   32703      1  0 05:21 ?        00:00:00 ora_lmd0_jingfa1
So which monitoring process looks after the health of lmd? According to the official documentation it is the lmon process. LMON is responsible for global enqueue and resource management across the RAC instances, as well as for recovery of global enqueue locks.
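A rough indicator of how much GES work LMON and LMD are tracking on an instance is the GES-related rows in v$resource_limit. This is only an illustrative sketch, not part of the original test:
SQL> select resource_name, current_utilization, max_utilization, limit_value
       from v$resource_limit
      where resource_name like 'ges%';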
[oracle@jingfa1 bdump]$ ps -ef | grep lmon
oracle    4772      1  0 Nov09 ?        00:00:29 asm_lmon_+ASM1
oracle   19857  30376  0 05:34 pts/3    00:00:00 grep lmon
oracle   32701      1  0 05:21 ?        00:00:02 ora_lmon_jingfa1
[oracle@jingfa1 bdump]$ kill -9 32701
It can be seen that when LMON is terminated abnormally, the LMD process it monitors is also forced to shut down:
[oracle@jingfa1 bdump]$ ps -ef | grep lmd
oracle    4774      1  0 Nov09 ?        00:00:32 asm_lmd0_+ASM1
oracle   21171  30376  0 05:34 pts/3    00:00:00 grep lmd
It can be seen that whenever the lmon process is interrupted abnormally, the database instance will be forced to restart.
-- node1 alert log
Tue Nov 10 05:34:18 2015
Errors in file /u01/app/oracle/admin/jingfa/bdump/jingfa1_pmon_32695.trc:
ORA-00481: LMON process terminated with error
Tue Nov 10 05:34:18 2015
PMON: terminating instance due to error 481
Tue Nov 10 05:34:18 2015
System state dump is made for local instance
System State dumped to trace file /u01/app/oracle/admin/jingfa/bdump/jingfa1_diag_32697.trc
Tue Nov 10 05:34:18 2015
Trace dumping is performing id=[cdmp_20151110053418]
Tue Nov 10 05:34:23 2015
Instance terminated by PMON, pid = 32695
Tue Nov 10 05:35:19 2015
Starting ORACLE instance (normal)
It can be seen that lmon and lmd will restart automatically.
[oracle@jingfa1 bdump]$ ps -ef | grep lmon
oracle    4772      1  0 Nov09 ?        00:00:30 asm_lmon_+ASM1
oracle   21820      1  0 05:35 ?        00:00:01 ora_lmon_jingfa1
oracle   27926  30376  0 05:39 pts/3    00:00:00 grep lmon
[oracle@jingfa1 bdump]$ ps -ef | grep lmd
oracle    4774      1  0 Nov09 ?        00:00:33 asm_lmd0_+ASM1
oracle   21822      1  0 05:35 ?        00:00:00 ora_lmd0_jingfa1
oracle   28028  30376  0 05:39 pts/3    00:00:00 grep lmd
Taking this one step further: there must be some mechanism that guarantees the lmon and lmd processes are restarted after an abnormal termination, so what is it?
Checking at the operating system level, mainly under /etc/init.d, shows that lmon and lmd belong to the ORACLE instance rather than to the cluster layer, and there is no OS-level script or process dedicated to controlling them.
Let's look at it another way. What are the parameters related to the lmd process and what do they mean?
NAME_1               VALUE_1   DESC1
-------------------- --------- -----------------------------------------
_lm_lmd_waittime     8         default waittime for lmd in centiseconds
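The output above comes from a hidden-parameter lookup; the original query is not shown, but the usual form (run as SYS against x$ksppi / x$ksppcv) would look roughly like this:
SQL> select i.ksppinm name_1, cv.ksppstvl value_1, i.ksppdesc desc1
       from x$ksppi i, x$ksppcv cv
      where i.indx = cv.indx
        and i.ksppinm like '%lm_lmd%';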
-- node1
SQL> select addr,program,username,pid,spid from v$process where username='oracle' and spid=21822;

ADDR             PROGRAM                    USERNAME   PID  SPID
---------------- -------------------------- ---------- ---- ------------
0000000083A585C8 oracle@jingfa1 (LMD0)      oracle     6    21822

-- node2
SQL> select addr,program,username,pid,spid from v$process where username='oracle' and spid=668;

ADDR             PROGRAM                    USERNAME   PID  SPID
---------------- -------------------------- ---------- ---- ------------
0000000083A585C8 oracle@jingfa2 (LMD0)      oracle     6    668
-- node2
SQL> conn tbs_zxy/system
Connected.
SQL> update t_lock set a=11 where a=1;
1 row updated.
-- node1
SQL> update t_lock set a=1111 where a=1;
-- the session hangs
It can be seen that the parameter above is not directly related to lock detection; the relevant point is simply that lmd is involved in handling global locks (enqueues).
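While the second update is hanging, the cross-instance row-lock conflict can be confirmed from gv$lock; this query is a sketch for illustration and does not appear in the original test output (the blocked session shows a pending request on the TX enqueue held on the other instance):
SQL> select inst_id, sid, type, id1, id2, lmode, request, block
       from gv$lock
      where type = 'TX' and (lmode > 0 or request > 0);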
Looking at it from another angle: what happens if we use oradebug to suspend lmd and then create a global lock conflict?
-- node1: suspend lmd
SQL> oradebug setospid 21822
Oracle pid: 6, Unix process pid: 21822, image: oracle@jingfa1 (LMD0)
SQL> oradebug suspend
Statement processed.
Tue Nov 10 06:03:44 2015
Unix process pid: 21822, image: oracle@jingfa1 (LMD0) flash frozen
-- node2: suspend lmd
SQL> oradebug setospid 668
Oracle pid: 6, Unix process pid: 668, image: oracle@jingfa2 (LMD0)
SQL> oradebug suspend
Statement processed.
Tue Nov 10 06:06:08 2015
Unix process pid: 668, image: oracle@jingfa2 (LMD0) flash frozen
-- node2
SQL> update t_lock set a=11 where a=1;
1 row updated.
-- node1
SQL> update t_lock set a=1111 where a=1;
-- the session hangs
Now observe the alert logs of node 1 and node 2.
-- node2 alert log
Tue Nov 10 06:09:42 2015
IPC Send timeout detected. Sender: ospid 682            -- the sender is the SMON process
Receiver: inst 1 binc 432326879 ospid 21822             -- the receiver is the LMD process of node 1
Tue Nov 10 06:09:45 2015
IPC Send timeout to 0.0 inc 20 for msg type 12 from opid 12    -- as above, the sender (opid 12) is SMON
Tue Nov 10 06:09:45 2015
Communications reconfiguration: instance_number 1
Tue Nov 10 06:09:45 2015
IPC Send timeout detected. Sender: ospid 696            -- the sender is the MMON process
Receiver: inst 1 binc 432326879 ospid 21822             -- the receiver is the lmd process of node 1
Tue Nov 10 06:09:48 2015
IPC Send timeout to 0.0 inc 20 for msg type 12 from opid 15    -- as above, the sender (opid 15) is MMON
-- node1 alert log
Tue Nov 10 06:09:23 2015
IPC Send timeout detected. Receiver ospid 21822         -- the receiver is the LMD process
Tue Nov 10 06:09:23 2015
Errors in file /u01/app/oracle/admin/jingfa/bdump/jingfa1_lmd0_21822.trc:    -- a trace file is generated for LMD
IPC Send timeout detected. Receiver ospid 21822         -- same as above
Tue Nov 10 06:09:27 2015
Errors in file /u01/app/oracle/admin/jingfa/bdump/jingfa1_lmd0_21822.trc:
From the above it can be seen that lmd is indeed involved in acquiring global locks, and that when the LMD process stops responding, communication between the two RAC nodes breaks down.
[oracle@jingfa2 bdump]$ ps -ef | grep 682
oracle     682      1  0 02:14 ?        00:00:01 ora_smon_jingfa2
oracle    7157  13004  0 06:15 pts/1    00:00:00 grep 682
SQL> select spid,pid,program from v$process where spid=696;

SPID         PID  PROGRAM
------------ ---- --------------------------
696          15   oracle@jingfa2 (MMON)
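To finish the test cleanly, the frozen LMD processes would need to be resumed on both nodes. This step is not shown in the original output, but with oradebug it would look like the following on node 1 (and likewise with ospid 668 on node 2):
-- node1: resume the suspended lmd
SQL> oradebug setospid 21822
SQL> oradebug resume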