Oracle 12c: an example of a database instance that cannot register with the listener

Updated 2025-03-04. Source: SLTechnology News&Howtos (shulou)

After a database restart, the database instance's services no longer registered with the listener. Only the services of the ASM instance were registered:

lsnrctl status

LSNRCTL for Linux: Version 12.2.0.1.0 - Production on 17-JAN-2020 19:43:44
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=127.0.0.1)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=xxxx)(PORT=1521)))
Services Summary...
Service "+ASM" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "+ASM_DATA" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "+ASM_MGMT" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "+ASM_OCR" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
The command completed successfully
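The check made by eye above (is the database instance listed in the Services Summary?) can be scripted. The following is a minimal sketch, not part of the original troubleshooting: the function name `registered_instances` and the regular expressions are illustrative assumptions based only on the output format shown here, and the sample text is shortened from the output above.

```python
import re

# Matches lines such as: Service "+ASM" has 1 instance(s).
SERVICE_RE = re.compile(r'^Service "([^"]+)" has \d+ instance\(s\)\.')
# Matches lines such as: Instance "+ASM1", status READY, has 1 handler(s) ...
INSTANCE_RE = re.compile(r'^\s*Instance "([^"]+)", status (\w+),')


def registered_instances(lsnrctl_output):
    """Map each instance name to the set of services it has registered."""
    instances = {}
    current_service = None
    for line in lsnrctl_output.splitlines():
        service = SERVICE_RE.match(line)
        if service:
            current_service = service.group(1)
            continue
        instance = INSTANCE_RE.match(line)
        if instance and current_service is not None:
            instances.setdefault(instance.group(1), set()).add(current_service)
    return instances


# Shortened sample in the format of the `lsnrctl status` output above.
SAMPLE = '''Services Summary...
Service "+ASM" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "+ASM_DATA" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
The command completed successfully
'''

found = registered_instances(SAMPLE)
print(sorted(found))        # ['+ASM1']
print("O12DB1" in found)    # False: the database instance is not registered
```

In practice the text would come from running `lsnrctl status` and capturing its standard output; the sketch leaves that step out so it can run without an Oracle installation.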

In Oracle 12c, the LREG process is responsible for registering the instance with the listener. I used strace to check the LREG process for anomalies and found that its poll calls kept returning with a timeout:

epoll_wait(9, [], 1024, 3000) = 0
poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=6, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=12, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=7, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 0) = 0 (Timeout)
getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 66310}, ru_stime={0, 31995}, ...}) = 0
poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=6, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=12, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=7, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 0) = 0 (Timeout)
epoll_wait(9, [], 1024, 3000) = 0
poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=6, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=12, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=7, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 0) = 0 (Timeout)
getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 66310}, ru_stime={0, 32157}, ...}) = 0
poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=6, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=12, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=7, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 0) = 0 (Timeout)
epoll_wait(9, [], 1024, 3000) = 0
poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=6, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=12, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=7, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 0) = 0 (Timeout)
getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 66310}, ru_stime={0, 32271}, ...}) = 0
open("/…", O_RDONLY) = 13
fstat(13, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f1a71503000
read(13, "0.16 0.20 0.33 4/1395 210929\n", 1024) = 29
close(13) = 0
munmap(0x7f1a71503000, 4096) = 0
poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=6, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=12, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=7, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 0) = 0 (Timeout)
epoll_wait(9, [], 1024, 3000) = 0
poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=6, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=12, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=7, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 0) = 0 (Timeout)
getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 66310}, ru_stime={0, 32503}, ...}) = 0
poll([{fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=6, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=12, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}, {fd=7, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 4, 0) = 0 (Timeout)

For reference, poll works as follows.

poll is a function for querying the state of files. Its prototype is: int poll(struct pollfd *fds, nfds_t nfds, int timeout); fds is the array of device files (file descriptors) to query; nfds is the number of entries in fds; timeout is how long, in milliseconds, the process sleeps when none of the files is in the desired state. The return value is the number of device files that are in the desired state.

When an application calls poll, the kernel walks through each device file in fds, calls its driver's poll function, and checks whether the desired state is present. If the resulting count is greater than 0, poll returns that count. If the count is 0, poll puts the process to sleep for the time given by the timeout argument. If one of the files in fds reaches the desired state while the process is asleep, poll returns immediately; otherwise it sleeps until the timeout expires and then returns 0.

poll is comparable to a blocking read on a device file opened with open("/dev/xxx", O_RDWR). The difference is that when the device file has no data to read, poll makes the program sleep only for a fixed time, whereas the blocking read makes the program sleep until data is available.
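The behaviour described above can be tried out in a few lines. This is a minimal sketch, assuming a Unix-like system (Python's `select.poll` is not available on Windows). It uses a pipe as a stand-in for the descriptors LREG watches, which is purely an illustrative assumption. The timeout is 0, the same value seen in the strace output, so each call returns at once. strace prints "(Timeout)" whenever poll returns 0, which with a zero timeout means only that no descriptor was ready at that moment.

```python
import os
import select

# A pipe provides a descriptor that stays idle until something is written.
read_fd, write_fd = os.pipe()

poller = select.poll()
poller.register(read_fd, select.POLLIN | select.POLLPRI)

# Timeout is in milliseconds. With 0, poll does not sleep at all:
# an empty result means "nothing ready right now".
idle_result = poller.poll(0)

# Make the read end readable, then poll again with the same zero timeout.
os.write(write_fd, b"x")
ready_result = poller.poll(0)

print(idle_result)                                   # []
print(ready_result[0][0] == read_fd)                 # True
print(bool(ready_result[0][1] & select.POLLIN))      # True

os.close(read_fd)
os.close(write_fd)
```

Seen this way, a stream of zero-timeout polls that return 0 may be ordinary idle checking rather than a fault in itself.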

At this point I suspected that the process itself was in an abnormal state, so I restarted the database through sqlplus and traced the LREG process again. The poll timeouts no longer appeared:

epoll_wait(9, [], 1024, 3000) = 0
getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 11203}, ru_stime={0, 21388}, ...}) = 0
epoll_wait(9, [], 1024, 3000) = 0
getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 11234}, ru_stime={0, 21447}, ...}) = 0
epoll_wait(9, [], 1024, 3000) = 0
getrusage(0x1 /* RUSAGE_??? */, {ru_utime={0, 11264}, ru_stime={0, 21505}, ...}) = 0

However, the database instance still could not register with the listener.

At this point I was confused. The listener accepts the ASM instance's service registration, so the listener itself should be fine. The database's LREG process keeps running its registration loop, so the registration process should be fine too. The problem had to lie somewhere between the two.

So I used oradebug with event 10257 to trace the LREG process:

*** 2020-01-17T20:17:21.365862+08:00 (CDB$ROOT(1))
kmlwait: status: succ=0, wait=0, fail=0
kmmlrl: update for process drop delta: 357 357 149 150 5999
kmmlrl: 149 processes
kmmlrl: instance load 2
kmmgdnu: O12DB goodness=0, delta=1, pdb=1, flags=0x104:unblocked/not overloaded, update=0x2:G/-/-
kmmgdnu: O12DBXDB goodness=0, delta=1, pdb=1, flags=0x105:unblocked/not overloaded, update=0x2:G/-/-
kmmlrl_network_hdlr_state: update
kmmlrl_network_hdlr_state: update for network '-oracledefault-'
kmmlrl_network_hdlr_state: beq handler: load=149, max=5999, flag=0x2002, upd=0x2
--Start Registration Information--
Last update: 53704792 (3 seconds ago)
Flag: 0x4, 0x0
State: succ=0, wait=0, fail=0
CDB: root pdb 1 last pdb 4098 open max pdb 2
Dispatcher configuration index: cur 1 max 1
Network '-oracledefault-' pdb 1:
  Local listeners:
  Remote listeners:
  Handlers:
    Dedicated
      flg=0x2002, upd=0x2, srvl=1
      services=O12DB
      hdlr load=149 max=5999
      nam=DEDICATED
      adr=(ADDRESS=(PROTOCOL=BEQ)(PROGRAM=/app/oracle/product/12.2.0/dbhome_1/bin/oracle)(ARGV0='oracle./O12DB1')(ARGS='(LOCAL=NO)'))
      inf=LOCAL SERVER
      pri=0x7fea7aa8a208

The trace shows that the Local listeners: and Remote listeners: entries are both empty. Checking the database's local_listener parameter revealed an abnormal value, -oraagent-dummy-:

SQL> show parameter local

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
local_listener                       string      -oraagent-dummy-
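A check like this one can be folded into a health script. The sketch below is an illustration only. It parses SQL*Plus `show parameter` output in the column layout shown above and flags the `-oraagent-dummy-` value. The helper names `parse_show_parameter` and `is_placeholder` are hypothetical, and treating that value as "no usable listener address" is an assumption drawn from this incident, not from Oracle documentation.

```python
# Value observed in this incident. Treating it as "not a usable listener
# address" is this sketch's assumption.
PLACEHOLDER = "-oraagent-dummy-"


def parse_show_parameter(output):
    """Return {name: value} from SQL*Plus `show parameter` output.

    Assumes a one-word TYPE column (for example "string") and values that
    fit on one line; wrapped values are not handled.
    """
    params = {}
    for line in output.splitlines():
        parts = line.split(None, 2)
        # Skip blank lines, the prompt, the header row and the dashed row.
        if len(parts) < 2 or parts[0] in ("NAME", "SQL>") or set(parts[0]) == {"-"}:
            continue
        params[parts[0]] = parts[2].strip() if len(parts) == 3 else ""
    return params


def is_placeholder(value):
    """True when local_listener still holds the placeholder value."""
    return value == PLACEHOLDER


SAMPLE = """SQL> show parameter local

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
local_listener                       string      -oraagent-dummy-
"""

params = parse_show_parameter(SAMPLE)
print(params)                                    # {'local_listener': '-oraagent-dummy-'}
print(is_placeholder(params["local_listener"]))  # True
```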

I then checked the status of the CRS resources again and found that instance 1 was OFFLINE. The reason: I had started the database directly from sqlplus rather than through the cluster resource.

ora.o12db.db
      1        ONLINE  OFFLINE                               STABLE

After I started the database with srvctl instead, the cluster resource returned to normal and the database instance registered with the listener correctly:

Services Summary...
Service "+ASM" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "+ASM_DATA" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "+ASM_MGMT" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "+ASM_OCR" has 1 instance(s).
  Instance "+ASM1", status READY, has 1 handler(s) for this service...
Service "O12DB" has 1 instance(s).
  Instance "O12DB1", status READY, has 1 handler(s) for this service...
Service "O12DBXDB" has 1 instance(s).
  Instance "O12DB1", status READY, has 1 handler(s) for this service...
The command completed successfully
