2025-03-26 Update From: SLTechnology News&Howtos
[Fault handling] A RAC fault-handling process
1.1 Introduction to the fault environment
Item                             Source DB
DB type                          2-node RAC
DB version                       11.2.0.1.0
DB storage                       ASM
OS version and kernel version    RHEL 6.6
1.2 Fault handling process
After 10:00 p.m., a fellow netizen asked me to help with a RAC that would not start after an outage, and mentioned multipath and storage. I do not know much about storage, have had little contact with multipathing, and had never studied it myself. But since he had come to me, I could not ignore it, so I logged in to have a look. The result was miserable: I worked on it for N hours and asked N people for help, and only got it fixed at noon the next day. Fortunately the next day was the weekend, so I did not have to go to work. I am recording the process here in the hope that it helps others.
At first, CSS on node 1 would not start and reported a lot of errors, and the HA service on node 2 would not start properly either. I forgot to record the errors. In any case, I studied all kinds of logs, searched MOS, Baidu, and Google, and even tried an OCR restore. In the end nothing worked, so I fell back on my usual trick: re-executing the root.sh script.
I have mentioned running this script many times on my blog, but it still takes practice, because there are several points to watch. First, to keep the disk group from being deleted, you can add the -keepdg option to the deconfig command ($ORACLE_HOME/crs/install/rootcrs.pl -deconfig -force -verbose), but 11.2.0.1 does not have this option. Second, when deconfiguring the second node, leaving out -lastnode retains as much information as possible.
Fortunately, after I ran it the first time, the cluster started normally and everything looked fine. That took from 10:00 p.m. to 1:00 a.m. But then, to import the OCR backup, CRS had to be started in exclusive mode, and sadly that broke the cluster again. With no alternative I rebooted, which made things even worse: the OCR disks could no longer be found. I wanted to give up. If the disks cannot be found there is nothing I can do, and we needed someone who understands storage. It was almost 2:00 a.m., so it was time to get some rest.
After 8:00 in the morning I logged in through TeamViewer and carried on. First we fiddled with multipath for half the morning. It turned out that the multipath software on node 2 had a problem, so I reinstalled it myself. I expected to see the disks after the reinstall, but they still did not show up. Out of options, I looked for a storage expert in Leshami's group. Boss Xiao helped me look at the storage and found the disks. Many thanks.
Then I continued with the recovery: deconfig again, then root.sh. After root.sh finished, the cluster was normal, and it stayed normal after I rebooted the hosts, so it seems the storage had been the problem all along. Next came restoring the database, which is the key point. Throughout the whole operation I was careful not to touch any non-OCR disk for fear of losing data, because there was no backup of the 10 TB of data, which left me speechless. I used kfod to look at the disks, everything was fine, so I mounted the disk groups directly. After re-executing root.sh, as long as the disks of a disk group are not corrupted, the disk group can be mounted directly. This is also a way to recover the OCR without a backup.
After that everything went smoothly: configuring the listener, registering the database with srvctl, and so on. Truly blessed by Buddha. Many of the logs from the process were not recorded, so only a few scripts can be given here.
1.2.1 Some scripts used in the process
Before re-executing the root.sh script, pay special attention to whether any database data is stored in the OCR disk group. If it is, do not run these scripts casually, because the OCR disks are overwritten with dd in step 2.
1. Run deconfig on node 1 and node 2 in turn:
export ORACLE_HOME=/u01/app/11.2.0/grid
export PATH=$PATH:$ORACLE_HOME/bin
$ORACLE_HOME/crs/install/rootcrs.pl -deconfig -force -verbose
2. After deconfig has run (on both nodes), the OCR disks need to be overwritten with dd:
dd if=/dev/zero of=/dev/oracleasm/disks/OCR_VOL2 bs=1024k count=1024
dd if=/dev/zero of=/dev/oracleasm/disks/OCR_VOL1 bs=1024k count=1024
3. Run root.sh on node 1 and, once it has finished, on node 2:
export ORACLE_HOME=/u01/app/11.2.0/grid
$ORACLE_HOME/root.sh
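Because the dd in step 2 is destructive, it can help to confirm which disk group a device belongs to before wiping it. The following is only a minimal sketch: disk_group_of and confirm_wipe_target are hypothetical helper names, and the script assumes the column layout of the kfod listing shown in section 1.2.3 (number, size, "Mb", header status, path, disk group).

```shell
#!/bin/sh
# Hypothetical helper: read a kfod listing on stdin and print the disk group
# of the given device path. Assumed columns: "N:" size "Mb" header path group.
disk_group_of() {
    awk -v p="$1" '$1 ~ /^[0-9]+:$/ && $5 == p { print $6; exit }'
}

# Hypothetical guard: succeed only when the device is in the expected disk group.
#   $1 = file holding saved kfod output, $2 = device path, $3 = expected group
confirm_wipe_target() {
    listing=$1
    path=$2
    expected=$3
    group=$(disk_group_of "$path" < "$listing")
    if [ "$group" = "$expected" ]; then
        echo "ok: $path is in disk group $expected"
    else
        echo "refusing: $path is in disk group '${group:-unknown}', not $expected" >&2
        return 1
    fi
}

# Example use (not executed here):
#   $ORACLE_HOME/bin/kfod disk=all s=true ds=true c=true > /tmp/kfod.txt
#   confirm_wipe_target /tmp/kfod.txt /dev/oracleasm/disks/OCR_VOL1 OCR && \
#       dd if=/dev/zero of=/dev/oracleasm/disks/OCR_VOL1 bs=1024k count=1024
```

The guard only checks disk group membership. It does not tell you whether database files were placed inside the OCR disk group, so that still has to be checked separately.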
In addition, running root.sh on version 11.2.0.1 commonly hits the following bug:
CRS-4124: Oracle High Availability Services startup failed.
CRS-4000: Command Start failed, or completed with errors.
ohasd failed to start: Inappropriate ioctl for device
ohasd failed to start: Inappropriate ioctl for device at /u01/app/11.2.0/grid/crs/install/roothas.pl line 296.
The workaround is to run the following command before root.sh reaches the point where it fails:
/bin/dd if=/var/tmp/.oracle/npohasd of=/dev/null bs=1024 count=1
If you get
/bin/dd: opening `/var/tmp/.oracle/npohasd': No such file or directory
the file has not been created yet. Keep re-running the command until it succeeds. In general, the right moment to run the dd command is when root.sh prints the message "Adding daemon to inittab".
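Retrying the dd by hand is easy to mistime, so the wait can be scripted. This is a minimal sketch under stated assumptions: wait_and_dd is a hypothetical helper name (not an Oracle tool), and the retry count and pause are arbitrary defaults chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: poll until the given file exists, then read one
# 1024-byte block from it with dd, mirroring the manual workaround.
#   $1 = path to wait for
#   $2 = maximum number of attempts (default 600)
#   $3 = seconds to sleep between attempts (default 1)
wait_and_dd() {
    pipe_path=$1
    max_tries=${2:-600}
    pause=${3:-1}
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        if [ -e "$pipe_path" ]; then
            dd if="$pipe_path" of=/dev/null bs=1024 count=1 2>/dev/null
            return $?
        fi
        tries=$((tries + 1))
        sleep "$pause"
    done
    echo "gave up waiting for $pipe_path" >&2
    return 1
}

# Example use, as root in a second session while root.sh is running
# (not executed here):
#   wait_and_dd /var/tmp/.oracle/npohasd
```

Polling for the file avoids having to watch for the "Adding daemon to inittab" message by eye.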
1.2.2 The configuration script used by root.sh
Some of the settings used by root.sh are kept in the following script, including the name of the OCR disk to be created, the disk paths, and so on:
$ORACLE_HOME/crs/config/config.sh
1.2.3 The kfod command
This command displays information about all disks:
data01-> export ORACLE_HOME=/u01/app/11.2.0/grid
data01-> $ORACLE_HOME/bin/kfod disk=all s=true ds=true c=true
Disk Size Header Path Disk Group User Group
================================================================================
1: 476837 Mb MEMBER /dev/oracleasm/disks/DATA_VOL1 DATA grid asmadmin
2: 953674 Mb MEMBER /dev/oracleasm/disks/DATA_VOL10 DATA grid asmadmin
3: 953674 Mb MEMBER /dev/oracleasm/disks/DATA_VOL11 DATA grid asmadmin
4: 953675 Mb MEMBER /dev/oracleasm/disks/DATA_VOL12 DATA grid asmadmin
5: 953674 Mb MEMBER /dev/oracleasm/disks/DATA_VOL13 DATA grid asmadmin
6: 953674 Mb MEMBER /dev/oracleasm/disks/DATA_VOL14 DATA grid asmadmin
7: 953674 Mb MEMBER /dev/oracleasm/disks/DATA_VOL15 DATA grid asmadmin
8: 953674 Mb MEMBER /dev/oracleasm/disks/DATA_VOL16 DATA grid asmadmin
9: 953675 Mb MEMBER /dev/oracleasm/disks/DATA_VOL18 DATA grid asmadmin
10: 953675 Mb MEMBER /dev/oracleasm/disks/DATA_VOL2 DATA grid asmadmin
11: 953674 Mb MEMBER /dev/oracleasm/disks/DATA_VOL3 DATA grid asmadmin
12: 953674 Mb MEMBER /dev/oracleasm/disks/DATA_VOL4 DATA grid asmadmin
13: 953675 Mb MEMBER /dev/oracleasm/disks/DATA_VOL5 DATA grid asmadmin
14: 953674 Mb MEMBER /dev/oracleasm/disks/DATA_VOL6 DATA grid asmadmin
15: 953674 Mb MEMBER /dev/oracleasm/disks/DATA_VOL7 DATA grid asmadmin
16: 953674 Mb MEMBER /dev/oracleasm/disks/DATA_VOL8 DATA grid asmadmin
17: 953675 Mb MEMBER /dev/oracleasm/disks/DATA_VOL9 DATA grid asmadmin
18: 476837 Mb MEMBER /dev/oracleasm/disks/FLASH_VOL1 FLASH grid asmadmin
19: 286103 Mb MEMBER /dev/oracleasm/disks/FLASH_VOL2 FLASH grid asmadmin
20: 286057 Mb MEMBER /dev/oracleasm/disks/OCR_VOL1 OCR grid asmadmin
21: 286102 Mb CANDIDATE /dev/oracleasm/disks/OCR_VOL2 # grid asmadmin
22: 476837 Mb MEMBER ORCL:DATA_VOL1 DATA
23: 953674 Mb MEMBER ORCL:DATA_VOL10 DATA
24: 953674 Mb MEMBER ORCL:DATA_VOL11 DATA
25: 953675 Mb MEMBER ORCL:DATA_VOL12 DATA
26: 953674 Mb MEMBER ORCL:DATA_VOL13 DATA
27: 953674 Mb MEMBER ORCL:DATA_VOL14 DATA
28: 953674 Mb MEMBER ORCL:DATA_VOL15 DATA
29: 953674 Mb MEMBER ORCL:DATA_VOL16 DATA
30: 953675 Mb MEMBER ORCL:DATA_VOL18 DATA
31: 953675 Mb MEMBER ORCL:DATA_VOL2 DATA
32: 953674 Mb MEMBER ORCL:DATA_VOL3 DATA
33: 953674 Mb MEMBER ORCL:DATA_VOL4 DATA
34: 953675 Mb MEMBER ORCL:DATA_VOL5 DATA
35: 953674 Mb MEMBER ORCL:DATA_VOL6 DATA
36: 953674 Mb MEMBER ORCL:DATA_VOL7 DATA
37: 953674 Mb MEMBER ORCL:DATA_VOL8 DATA
38: 953675 Mb MEMBER ORCL:DATA_VOL9 DATA
39: 476837 Mb MEMBER ORCL:FLASH_VOL1 FLASH
40: 286103 Mb MEMBER ORCL:FLASH_VOL2 FLASH
41: 286057 Mb MEMBER ORCL:OCR_VOL1 OCR
42: 286102 Mb CANDIDATE ORCL:OCR_VOL2 #
ORACLE_SID ORACLE_HOME HOST_NAME
================================================================================
+ASM1 /u01/app/11.2.0/grid data01
+ASM2 /u01/app/11.2.0/grid data02
data01->
data01-> sqlplus / as sysasm
SQL*Plus: Release 11.2.0.1.0 Production on Sat Dec 10 12:27:25 2016
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options
SQL>
SQL> alter diskgroup OCR add disk '/dev/oracleasm/disks/OCR_VOL2';
Diskgroup altered.
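In the listing above, OCR_VOL2 has header status CANDIDATE, which is why it could be added back to the OCR disk group. With dozens of disks it is easy to overlook such a line, so the filter can be scripted. This is a minimal sketch: list_candidates is a hypothetical helper name, and it assumes the same column layout as the kfod output shown here.

```shell
#!/bin/sh
# Hypothetical helper: read a kfod listing on stdin and print the path of
# every disk whose header status is CANDIDATE (not part of any disk group).
# Assumed columns: "N:" size "Mb" header path group ...
list_candidates() {
    awk '$1 ~ /^[0-9]+:$/ && $4 == "CANDIDATE" { print $5 }'
}

# Example use (not executed here):
#   $ORACLE_HOME/bin/kfod disk=all s=true ds=true c=true | list_candidates
```

On the listing in this section, the filter would print both the /dev/oracleasm/disks/OCR_VOL2 path and its ORCL:OCR_VOL2 alias, since kfod reports the disk under both names.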
1.2.4 Adding the DB to srvctl management
11.2.0.1 has no -c parameter, so remove it on that version. Use -h to see the exact usage:
srvctl add database -d DGPHY -c RAC -o /oracle/app/oracle/product/11.2.0/db -p '+DATA/TESTDGPHY/PARAMETERFILE/spfiledgphy.ora' -r primary -n TESTDG
srvctl add instance -d DGPHY -i DGPHY1 -n ZFZHLHRDB1
srvctl add instance -d DGPHY -i DGPHY2 -n ZFZHLHRDB2
srvctl status database -d DGPHY
srvctl start database -d TESTDG
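Since the -c option has to be dropped on 11.2.0.1, one way to avoid pasting the wrong form is to build the command as text and review it before running it. This is a minimal sketch: srvctl_add_database_cmd is a hypothetical helper name, the version is passed in by hand, and the function only prints the command without calling srvctl.

```shell
#!/bin/sh
# Hypothetical helper: print the "srvctl add database" command from this
# section for review, leaving out "-c RAC" when the version is 11.2.0.1.
#   $1 = database software version, e.g. 11.2.0.1.0
srvctl_add_database_cmd() {
    version=$1
    cluster_opt="-c RAC"
    case "$version" in
        11.2.0.1*) cluster_opt="" ;;
    esac
    printf '%s\n' "srvctl add database -d DGPHY ${cluster_opt:+$cluster_opt }-o /oracle/app/oracle/product/11.2.0/db -p '+DATA/TESTDGPHY/PARAMETERFILE/spfiledgphy.ora' -r primary -n TESTDG"
}

# Example use (prints the command only):
#   srvctl_add_database_cmd 11.2.0.1.0
```

The database name, Oracle home and spfile path are the ones used in the commands above and need to be replaced with your own values.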
About Me
● Author: Xiaomaimiao ("wheat seedlings"), focused solely on database technology, with an emphasis on applying it in practice
● This article is published simultaneously on itpub (http://blog.itpub.net/26736162), cnblogs (http://www.cnblogs.com/lhrbest) and my personal WeChat official account (xiaomaimiaolhr).
● itpub address of this article: http://blog.itpub.net/26736162/viewspace-2130218/
● cnblogs address of this article: http://www.cnblogs.com/lhrbest/p/6157931.html
● PDF version of this article and my cloud disk address: http://blog.itpub.net/26736162/viewspace-1624453/
● QQ group: 230161599. WeChat group: message me privately
● To contact me, add me as a QQ friend (642808185) and state the reason for adding
● This article was completed at Taixing Apartment between 22:00 on 2016-12-09 and 16:00 on 2016-12-10.
● The content comes from my study notes, and some of it was compiled from the Internet. Please forgive any infringement or inaccuracy.
● All rights reserved. You are welcome to share this article; please cite the source when reprinting.