Insufficient processes caused by limits.conf and an Oracle bug

2025-04-11 Update From: SLTechnology News&Howtos


While checking SMIDB today, we found many errors in the CRS alert log, such as:

2015-08-19 17:14:21.745:
[/oracle/app/11.2.0/grid_1/bin/oraagent.bin(6227)]CRS-5013:Agent "/oracle/app/11.2.0/grid_1/bin/oraagent.bin" failed to start process "/oracle/app/11.2.0/grid_1/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/11.2.0/grid_1/log/smidb11/agent/crsd/oraagent_grid/oraagent_grid.log"
2015-08-19 17:14:21.[...]:
[/oracle/app/11.2.0/grid_1/bin/oraagent.bin(6227)]CRS-5013:Agent "/oracle/app/11.2.0/grid_1/bin/oraagent.bin" failed to start process "/oracle/app/11.2.0/grid_1/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/11.2.0/grid_1/log/smidb11/agent/crsd/oraagent_grid/oraagent_grid.log"
2015-08-19 17:14:21.758:
[/oracle/app/11.2.0/grid_1/bin/oraagent.bin(6227)]CRS-5013:Agent "/oracle/app/11.2.0/grid_1/bin/oraagent.bin" failed to start process "/oracle/app/11.2.0/grid_1/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/11.2.0/grid_1/log/smidb11/agent/crsd/oraagent_grid/oraagent_grid.log"

Tracing further into the agent log (oraagent_grid.log) shows:

2015-08-19 17:14:[...] clsn_agent::check: Exception SclsProcessSpawnException
2015-08-19 17:14:21.744: [ora.asm]{0:21:2} [check] CrsCmd::ClscrsCmdData::stat entity 1 statflag 33 useFilter 0
2015-08-19 17:14:21.744: [ora.asm][1342174976]{0:21:2} [check] AsmProxyAgent::check clsagfw_res_status 0
2015-08-19 17:14:21.761: [ora.LISTENER_SCAN1.lsnr][1339545344]{0:21:2} [check] Utils:execCmd action = 3 flags = 38 ohome = (null) cmdname = lsnrctl.
2015-08-19 17:14:21.761: [ora.LISTENER_SCAN1.lsnr][1339545344]{0:21:2} [check] (:CLSN00008:)Utils:execCmd scls_process_spawn() failed 1
2015-08-19 17:14:21.761: [ora.LISTENER_SCAN1.lsnr][1339545344]{0:21:2} [check] (:CLSN00008:) category: -2, operation: fork, loc: spawnproc28, OS error: 11, other: forked failed [-1]
2015-08-19 17:14:21.761: [ora.LISTENER_SCAN1.lsnr][1339545344]{0:21:2} [check] clsnUtils::error Exception type=2 string=CRS-5013:Agent "/oracle/app/11.2.0/grid_1/bin/oraagent.bin" failed to start process "/oracle/app/11.2.0/grid_1/bin/lsnrctl" for action "check": details at "(:CLSN00008:)" in "/oracle/app/11.2.0/grid_1/log/smidb11/agent/crsd/oraagent_grid/oraagent_grid.log"

The ONS log:

[grid@smidb11 logs]$ tail ons.out
pthread_create() Resource temporarily unavailable
pthread_create() Resource temporarily unavailable
[2015-05-07T03:09:22+08:00] [ons] [TRACE:2] [] [internal] ONS worker process stopped (0)

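OS error 11 on Linux is EAGAIN ("Resource temporarily unavailable"). When fork and pthread_create fail with it, a new process or thread could not be created because a system resource limit was reached, typically the per-user process limit. Check the ulimit settings: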

[grid@smidb11 logs]$ ulimit -u
10240

/etc/security/limits.conf:

# End of file
grid soft nproc 10240
grid hard nofile 65536
oracle soft nproc 10240
oracle hard nofile 65536

The limits.conf configuration is incomplete: hard nproc and soft nofile are not set for either user. This will be fixed before the restart next Monday.
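A gap like this can be spotted with a small script. The following is a minimal sketch, not part of the original troubleshooting: the function name missing_limits is made up for illustration, and it only matches entries written for an exact user name (wildcard "*" and "@group" entries are ignored, which is a simplifying assumption).

```shell
#!/usr/bin/env bash
# Sketch: list the soft/hard nproc/nofile entries that are missing for a user
# in a limits.conf-style file. missing_limits is a hypothetical helper name.
# Assumption: only exact user names are matched; "*" and "@group" are ignored.
missing_limits() {
    local file=$1 user=$2 type item
    for type in soft hard; do
        for item in nproc nofile; do
            # Drop comment lines, then look for "<user> <type> <item> <value>".
            # A "-" in the type column sets both soft and hard limits.
            if ! grep -v '^[[:space:]]*#' "$file" |
                 grep -Eq "^[[:space:]]*${user}[[:space:]]+(${type}|-)[[:space:]]+${item}[[:space:]]+[^[:space:]]+"; then
                echo "$user $type $item"
            fi
        done
    done
}
```

Called as `missing_limits /etc/security/limits.conf grid` against the file shown above, it would report "grid soft nofile" and "grid hard nproc", the same two gaps noted in the text.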

[grid@smidb11 pam.d]$ cat login
#%PAM-1.0
auth [user_unknown=ignore success=ok ignore=ignore default=bad] pam_securetty.so
auth       include      system-auth
account    required     pam_nologin.so
account    include      system-auth
password   include      system-auth
# pam_selinux.so close should be the first session rule
session    required     pam_selinux.so close
session    required     pam_loginuid.so
session    optional     pam_console.so
# pam_selinux.so open should only be followed by sessions to be executed in the user context
session    required     pam_selinux.so open
session    required     pam_namespace.so
session    optional     pam_keyinit.so force revoke
session    include      system-auth
-session   optional     pam_ck_connector.so
[grid@smidb11 pam.d]$

The /etc/pam.d/login file does not load the resource-limits module (pam_limits), so the following line should be added to it:

session required /lib64/security/pam_limits.so
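Whether a PAM service file already loads the module can be checked before editing it. This is a minimal sketch under stated assumptions: has_pam_limits is a hypothetical helper name, and the check does not follow include lines such as "session include system-auth", so a pam_limits.so entry pulled in from an included file would not be detected.

```shell
#!/usr/bin/env bash
# Sketch: succeed (exit status 0) if a PAM service file contains an active,
# uncommented session line that loads pam_limits.so.
# has_pam_limits is a hypothetical helper name, not an existing tool.
# Limitation: "include" lines are not followed.
has_pam_limits() {
    grep -Eq '^[[:space:]]*session[[:space:]]+.*pam_limits\.so' "$1"
}
```

For example, `has_pam_limits /etc/pam.d/login || echo "pam_limits.so not loaded directly"` prints the message when no such line exists in the file itself.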

Searching online turned up a note on My Oracle Support (MOS) that matches our situation exactly:

The processes and resources started by CRS (Grid Infrastructure) do not inherit the ulimit setting for "max user processes" from /etc/security/limits.conf setting (Doc ID 1594606.1)

Verification confirmed this: although ulimit -u for our grid user is set to 10240, the limit actually in effect for the running processes is still 1024.
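The limit a running process actually has can be read from its /proc/<pid>/limits file rather than from the login shell. The following is a minimal sketch: max_processes is a hypothetical helper name, and it assumes the usual Linux layout of that file, where the "Max processes" row carries the soft limit followed by the hard limit.

```shell
#!/usr/bin/env bash
# Sketch: print the soft and hard "Max processes" limits from a limits file
# such as /proc/<pid>/limits. max_processes is a hypothetical helper name.
# Assumed row layout: "Max processes  <soft>  <hard>  processes".
max_processes() {
    awk '/^Max processes/ { print $3, $4 }' "$1"
}

# Example (replace <pid> with the pid of a process started by CRS):
#   max_processes "/proc/<pid>/limits"
```

Comparing this output with `ulimit -u` in a fresh grid login shell shows whether the processes started by CRS inherited the limits.conf value.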

This is Oracle Bug 17301761. Our database version is 11.2.0.4, which falls within the range of versions affected by this bug.

There are two solutions:

1. Apply the patch.

2. Work around it using the method given by MOS, as follows:

The ohasd script needs to be modified to set the ulimit explicitly for all grid and database resources that are started by the Grid Infrastructure (GI).

1) go to GI_HOME/bin

2) make a backup of ohasd script file

3) in the ohasd script file, locate the following code:

Linux)
    # MEMLOCK limit is for Bug 9136459
    ulimit -l unlimited
    if [ "$?" != "0" ]
    then
        $CLSECHO -phas -f crs -l -m 6021 "l" "unlimited"
    fi
    ulimit -c unlimited
    if [ "$?" != "0" ]
    then
        $CLSECHO -phas -f crs -l -m 6021 "c" "unlimited"
    fi
    ulimit -n 65536

In the above code, insert the following line just before the line with "ulimit -n 65536":

ulimit -u 16384

4) Recycle CRS manually so that ohasd picks up the new ulimit setting.

After the database is started, issue "ps -ef | grep pmon" and get the pid of the pmon process.

Then issue "cat /proc/<pid>/limits | grep process" and check whether Max processes is set to 16384.

Setting the number of processes to 16384 should be enough for most servers, since having 16384 processes normally means the server is very heavily loaded. A smaller number such as 4096 or 8192 should also suffice for most users.
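Steps 2) and 3) of the ohasd edit above could be scripted roughly as follows. This is only a sketch under stated assumptions: add_nproc_ulimit is a hypothetical helper name, the backup is written next to the script with a ".bak" suffix, and the script is assumed to contain a line that reads exactly "ulimit -n 65536" apart from indentation, as in the MOS excerpt. Try it on a copy of the file first.

```shell
#!/usr/bin/env bash
# Sketch: back up a script, then insert "ulimit -u <n>" just before each line
# that reads "ulimit -n 65536". add_nproc_ulimit is a hypothetical helper name.
# Assumptions: the backup goes to "<script>.bak"; <n> defaults to 16384.
add_nproc_ulimit() {
    local script=$1 nproc=${2:-16384}
    # Do nothing if an "ulimit -u" line is already present, so reruns are safe.
    if grep -Eq '^[[:space:]]*ulimit[[:space:]]+-u[[:space:]]' "$script"; then
        return 0
    fi
    cp -p "$script" "$script.bak" || return 1
    # Rewrite the script from the backup; writing into the existing file
    # keeps its owner and permissions.
    awk -v n="$nproc" '
        /^[ \t]*ulimit[ \t]+-n[ \t]+65536[ \t]*$/ { print "ulimit -u " n }
        { print }
    ' "$script.bak" > "$script"
}

# Example (GI_HOME is assumed to point at the Grid Infrastructure home):
#   add_nproc_ulimit "$GI_HOME/bin/ohasd" 16384
```

The early return makes the function idempotent, which matters because the same file may be touched again after patching.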

In addition to the above, the ohasd template needs to be modified to ensure that the new ulimit setting persists even after a patch is applied.

1) go to GI_HOME/crs/sbs

2) make a backup of crswrap.sh.sbs

3) in crswrap.sh.sbs, insert the following line just before the line "# MEMLOCK limit is for Bug 9136459"

ulimit -u 16384

Finally, although the above setting has been used successfully to increase the number-of-processes limit, please test it on a test server first before setting the ulimit in production.

Reference: http://blog.csdn.net/weiwangsisoftstone/article/details/42460585
