How to recover database data after two hard drives in a RAID5 array go offline


2026-02-07 Update. From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

[RAID data recovery: failure description]

The device is a Huawei S5300 storage system with 16 FC hard drives. All of the storage space comes from 450GB FC drives configured as one RAID5 disk array (including a hot spare). Disk 3 in the RAID5 array went offline for unknown reasons, and the hot spare came online and started synchronizing. When synchronization was about 50% complete, disk 8 also went offline for unknown reasons. Synchronization failed, the RAID array went down, and the upper-layer LUNs became unavailable. The data on the RAID array needed to be recovered urgently.

[RAID data recovery process 1: inspect all disks in the RAID array]

First, all disks in the RAID array (both the normal disks and the offline disks) were physically inspected to determine whether the offline disks had physical faults. The result: disk 3 had a physical fault, while all other disks, including disk 8, had none.

[RAID data recovery process 2: back up all disks in the RAID array]

After the physical inspection, every disk was imaged to a backup file with the dd command or a data recovery tool. All later recovery operations were performed on these images, which protects the user's source data.
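The imaging step can be sketched in Python as follows. This is a minimal illustration, not the tool the engineers used: the function name `image_disk`, its parameters, and the policy of zero-filling unreadable blocks (similar in spirit to `dd conv=noerror,sync`) are assumptions made for the example.

```python
import os


def image_disk(src_path, dst_path, block_size=64 * 1024):
    """Copy src_path to dst_path block by block.

    Blocks that cannot be read in full are padded with zeros so that every
    later offset in the image still matches the offset on the source disk.
    Returns a list of (offset, length) ranges that were zero-filled.
    """
    bad_ranges = []
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        # Seeking to the end gives the size for regular files and,
        # on Linux, for block devices as well.
        size = src.seek(0, os.SEEK_END)
        offset = 0
        while offset < size:
            length = min(block_size, size - offset)
            try:
                src.seek(offset)
                data = src.read(length)
            except OSError:
                data = b""  # treat an I/O error as an unreadable block
            if len(data) < length:
                # Simplification: a short read is also treated as unreadable.
                missing = length - len(data)
                bad_ranges.append((offset + len(data), missing))
                data += b"\x00" * missing
            dst.write(data)
            offset += length
    return bad_ranges
```

Keeping the list of zero-filled ranges matters later: any stripe that touches one of those ranges has to be treated as missing on that disk rather than trusted.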

[RAID data recovery process 3: analyze the RAID structure of the array]

The engineers analyzed all disks in the RAID array and identified the hot spare (in theory, the contents of a hot spare differ clearly from those of the data disks, so it can be distinguished directly). Because the RAID is striped, the data in the array is stored across the disks according to fixed rules. The engineers therefore analyzed how the database pages were distributed across the physical disks and derived the basic parameters of the RAID group, such as disk order, data direction, and stripe size.
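To make the role of disk order, data direction and stripe size concrete, here is a small sketch of how those parameters map an offset in the virtual array to a physical disk. The article does not state which layout this array used; the code assumes the common left-symmetric RAID5 layout, and the function name `locate` and its parameters are hypothetical.

```python
def locate(logical_offset, n_disks, stripe_unit):
    """Map a byte offset in the virtual array to a physical location.

    Assumes a left-symmetric RAID5 layout: parity starts on the last disk
    and moves one disk to the left on each row, and data units start on the
    disk right after the parity disk, wrapping around.

    Returns (data_disk, disk_offset, parity_disk), with disks numbered from 0.
    """
    data_per_row = n_disks - 1  # one unit per row holds parity
    unit_index, within_unit = divmod(logical_offset, stripe_unit)
    row, k = divmod(unit_index, data_per_row)
    parity_disk = (n_disks - 1) - (row % n_disks)
    data_disk = (parity_disk + 1 + k) % n_disks
    disk_offset = row * stripe_unit + within_unit
    return data_disk, disk_offset, parity_disk
```

With candidate parameters in hand, one way to validate them is to check that known structures, such as consecutive database pages, appear in the expected order when read through this mapping. A wrong disk order or stripe size breaks that sequence.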

[RAID data recovery process 4: identify the disk damaged by synchronization]

Using the RAID parameters obtained from the analysis, the engineers tried to rebuild the original RAID group virtually with a RAID virtualization program. However, two disks had dropped out of the RAID group, and the data on one disk had been damaged by the synchronization. Careful analysis of the data on each disk showed that, within the same stripe, the data on one disk was clearly different from the data on the other disks, so this disk was provisionally judged to have been damaged by the synchronization. Checking the stripes with a RAID check program then confirmed that it had indeed been damaged by the synchronization.
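The stripe check can be illustrated with the XOR parity rule that RAID5 redundancy is built on: across one stripe row, the XOR of all member blocks (data plus parity) is zero. The sketch below assumes the check program works on this principle; the function names are hypothetical. Note that a failed check only shows that a row is inconsistent. It does not say which member is wrong, which is why the comparison of disk contents described above is needed as well.

```python
def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    length = len(blocks[0])
    acc = 0
    for block in blocks:
        if len(block) != length:
            raise ValueError("all blocks in a stripe row must have equal length")
        acc ^= int.from_bytes(block, "big")
    return acc.to_bytes(length, "big")


def row_is_consistent(blocks):
    """True if the data and parity blocks of one stripe row XOR to zero."""
    return xor_blocks(blocks) == bytes(len(blocks[0]))


def rebuild_missing(surviving_blocks):
    """Recompute the one missing block of a row from all the others."""
    return xor_blocks(surviving_blocks)


# Example: three data blocks plus parity, then one member replaced by stale data.
d0, d1, d2 = b"\x11" * 8, b"\x22" * 8, b"\x44" * 8
parity = xor_blocks([d0, d1, d2])
assert row_is_consistent([d0, d1, d2, parity])
assert not row_is_consistent([d0, b"\x00" * 8, d2, parity])
assert rebuild_missing([d0, d2, parity]) == d1
```

Running the check row by row over a candidate set of disk images shows where that set is consistent and where it stops being consistent, which helps confirm a suspected disk once it is excluded and its blocks are rebuilt from the rest.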

[RAID data recovery process 5: analyze the LUN information of the RAID array]

To analyze the LUN information, the engineers first simulated the state of the RAID array, then analyzed how the LUNs were allocated within the array and which data blocks each LUN occupied. The LUN data was then exported according to the resulting data map.
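Once a data map exists, the export itself is a sequence of copies. The sketch below assumes the map has already been reduced to an ordered list of `(array_offset, length)` extents; that representation, the function name `export_lun`, and its parameters are all hypothetical, since the article does not describe how the storage system records LUN allocation.

```python
def export_lun(array_image_path, extents, lun_path, chunk_size=1024 * 1024):
    """Write a LUN image by copying extents from a virtual array image.

    extents is an ordered list of (array_offset, length) pairs, given in the
    order the blocks appear inside the LUN. Returns the number of bytes written.
    """
    written = 0
    with open(array_image_path, "rb") as src, open(lun_path, "wb") as dst:
        for array_offset, length in extents:
            src.seek(array_offset)
            remaining = length
            while remaining > 0:
                chunk = src.read(min(chunk_size, remaining))
                if not chunk:
                    raise ValueError(
                        f"extent at offset {array_offset} runs past the end of the array image"
                    )
                dst.write(chunk)
                remaining -= len(chunk)
                written += len(chunk)
    return written
```

Copying in bounded chunks keeps memory use flat even when a single extent is many gigabytes long.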

[file system data recovery process: parse the EXT3 file system]

Because of the hot spare, the EXT3 file system on the virtual RAID structure could not be mounted normally, so the Oracle database files had to be extracted directly. The engineers parsed the file system with a self-developed file system parser, exported the Oracle database files, and handed them to the database engineers for checksum verification.

[database repair process 1: check data file integrity]

An Oracle database file detection tool was used to check whether each database file was complete and to find errors. A stricter Oracle database detection tool was then run, and it found errors in some database files and log files: the system and sysaux tablespaces each had more than 100 bad blocks; all three control files had many bad blocks and were damaged; three files in the eschoolspace tablespace had more bad blocks, up to 1,000; and undotbs02 was missing. The database engineers then repaired these files.

Figure 1:

Figure 2:

[database repair process 2: repair the database]

The engineers recreated the control file, created the undo tablespace, and started the database to the mount state. Bad blocks in the system data file prevented the database from opening, and none of the various hidden parameters could bypass them. A new database environment was therefore built, and the database was restored from the dmp files. Imports from the dmp files dated after March 9 all reported errors, and only about 10 GB of data could be imported.

Figure 3:

[data verification: data recovery successful]

With the user's cooperation, the Oracle database was started and the OA client was installed in a local virtual machine. The data records were verified through the OA client, and the user arranged for staff from different departments to verify the data remotely. The verification passed and the data was recovered successfully.

