Server storage data recovery: a case of recovering data after RAID hard disks went offline

2025-01-17 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

[fault description]

An HP-P4500 storage system at a court is built on twelve 1TB hard disks, divided into two groups of six. In the first group, the first part of the disks forms a RAID0+1 that holds the HP-P4500 embedded system, and the remaining space forms a RAID5 that stores data. The second group is a single RAID5. The storage system presents two volumes at the upper layer, one of 3TB and one of 5TB. After a disk failure the storage became unavailable. The customer first asked HP engineers to replace the disk and force it online, but the storage was still unavailable. The customer then contacted us for data recovery.

[hardware detection]

Our hardware engineer first ran a hardware test on the customer's 12 hard drives and found that all of them were healthy, which ruled out physical disk faults. Since everything was normal, we made a full image of each of the 12 hard drives so that all further work could be done on the copies.
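As a rough illustration of the imaging step, the sketch below copies a source to an image file in chunks and returns a SHA-256 digest that can later be used to confirm the image has not changed. It is a minimal sketch, not the tool used in this case: the function name `image_disk` and its parameters are assumptions made for the example, and real recovery work normally uses dedicated imaging tools that also handle unreadable sectors.

```python
import hashlib


def image_disk(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src_path to dst_path in chunks; return the SHA-256 of the copied bytes.

    Hypothetical helper for illustration only. It does no bad-sector
    handling, so it suits sources that read cleanly from start to end.
    """
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break  # end of source reached
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()
```

Recording the digest for each of the 12 images makes it easy to verify later that the analysis only ever read the copies and never modified them.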

[fault analysis]

We used professional tools to analyze the disk images in detail and found that the underlying RAID layout is an HP double-loop RAID5. The first RAID group was intact, so the upper volumes were unavailable because the second group, also a RAID5, was damaged. A RAID5 tolerates the loss of one disk, so a single dropped disk should not have made the storage unavailable. We therefore concluded that at least two disks had dropped out of the second group, and that one of them had gone offline long ago, so all of the data on it was stale. We needed to find that long-offline disk. However, the hardware test showed no hardware failure on any disk, so how could we tell which one had gone offline early?
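The RAID5 reasoning above can be illustrated with a single stripe. The sketch below uses XOR parity over five data blocks plus one parity block, which is the general RAID5 principle, and shows that one missing block is recovered exactly, while a rebuild that trusts a stale block produces wrong data. The block contents and helper names are made up for the example, and the HP double-loop layout is not modeled.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(stripe, missing):
    """Recompute the block at index `missing` from all the other blocks."""
    return xor_blocks([b for i, b in enumerate(stripe) if i != missing])


# One stripe of a hypothetical 6-disk RAID5: five data blocks plus parity.
data = [bytes([i] * 4) for i in range(1, 6)]
stripe = data + [xor_blocks(data)]

# Case 1: a single disk is lost. Its block is recovered exactly.
print(rebuild_missing(stripe, 2) == stripe[2])

# Case 2: disk 0 dropped out early and still holds its old block. The array
# kept running degraded, block 0 was rewritten logically, and parity was
# updated to match the new contents.
old0 = stripe[0]
new0 = bytes([9] * 4)
new_parity = xor_blocks([new0] + data[1:])
on_disk = [old0] + data[1:] + [new_parity]

# If disk 3 now fails and the rebuild trusts stale disk 0, the result is wrong.
wrong = rebuild_missing(on_disk, 3)
print(wrong == data[3])

# Excluding the stale disk instead recovers the current contents of block 0.
print(rebuild_missing(on_disk, 0) == new0)
```

The three lines printed are `True`, `False`, `True`: the stale disk has to be identified and excluded before the group can be reassembled correctly.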

[solution]

Because we did not know which disk in the RAID had gone offline early, we could not yet reassemble the RAID. After careful consideration, we settled on two feasible schemes.

Solution 1: exhaustive method. Assume that one particular disk went offline long ago, exclude it, reassemble the RAID, generate all of the data, and then mount the data on the HP-P4500 to see whether it is correct. If it is not, assume a different disk is the stale one and repeat. This scheme is feasible, but generating the full data set for every reassembly takes too long, and its accuracy is low.

Solution 2: exhaustive check. As in the exhaustive method, assume that one disk is the stale one, exclude it and reassemble the RAID, but generate only the first 5GB of data instead of all of it. From our earlier study of the internal storage principle of the HP-P4500, we knew that the index bitmap for the stored data lies within the first few gigabytes of the RAID. We only need to check whether the bitmap information of the index table is correct to determine whether the RAID has been reassembled correctly. Once the check passes, generating the full data for that configuration completes the reassembly.
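The exhaustive check can be sketched on a toy array. The code below tries each disk in turn as the stale one, rebuilds only a short prefix with that disk's blocks recomputed from parity, and keeps the candidates whose prefix passes a consistency check. Everything specific here is an assumption made for illustration: the simple rotating-parity layout (not HP's double-loop RAID5), the 16-byte `IDX1` metadata entries, and the rule that all entries must carry the same generation number stand in for the real HP-P4500 index bitmap check, which the article does not describe.

```python
import struct

BLOCK = 16        # bytes per block; tiny on purpose
MAGIC = b"IDX1"   # hypothetical tag, not the real HP-P4500 format


def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            out[i] ^= value
    return bytes(out)


def parity_disk(stripe, n):
    """Simplified rotating parity position for a stripe (assumed layout)."""
    return (n - 1 - stripe) % n


def write_raid5(data, n):
    """Lay data out over n disks; returns one list of blocks per disk."""
    per_stripe = (n - 1) * BLOCK
    disks = [[] for _ in range(n)]
    for s in range(len(data) // per_stripe):
        chunk = data[s * per_stripe:(s + 1) * per_stripe]
        blocks = [chunk[i * BLOCK:(i + 1) * BLOCK] for i in range(n - 1)]
        remaining = iter(blocks)
        for d in range(n):
            is_parity = d == parity_disk(s, n)
            disks[d].append(xor_blocks(blocks) if is_parity else next(remaining))
    return disks


def read_prefix(disks, excluded, stripes):
    """Assemble the first `stripes` stripes, recomputing the excluded disk."""
    n = len(disks)
    out = bytearray()
    for s in range(stripes):
        row = [disks[d][s] for d in range(n)]
        row[excluded] = xor_blocks([row[d] for d in range(n) if d != excluded])
        out += b"".join(row[d] for d in range(n) if d != parity_disk(s, n))
    return bytes(out)


def make_metadata(generation, count):
    return b"".join(struct.pack(">4sIQ", MAGIC, generation, i) for i in range(count))


def looks_valid(prefix):
    """Stand-in check: every entry has the tag and the same generation."""
    generations = set()
    for off in range(0, len(prefix), BLOCK):
        magic, generation, _ = struct.unpack(">4sIQ", prefix[off:off + BLOCK])
        if magic != MAGIC:
            return False
        generations.add(generation)
    return len(generations) == 1


def find_stale_candidates(disks, stripes):
    return [d for d in range(len(disks)) if looks_valid(read_prefix(disks, d, stripes))]


# Simulate six disk images where disk 2 still holds an older generation.
old = write_raid5(make_metadata(1, 20), 6)
images = write_raid5(make_metadata(2, 20), 6)
images[2] = old[2]
print(find_stale_candidates(images, stripes=4))  # [2]
```

Because only the prefix is rebuilt for each candidate, a wrong guess is rejected after a small amount of work, and the full data set is generated only once, for the configuration that passes the check.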

[implementation plan]

Using the second solution, we identified the correct RAID configuration after several tests and generated the full data for it overnight. Once the data was generated, we mounted it on the HP-P4500 together with the first, intact RAID group and started the storage. The upper volumes changed from unavailable to available. We checked the most recent files and found that everything was fine.

[data recovery succeeded]

Because the upper volumes could be used directly, the data was immediately visible. For safety, however, we still copied all the files out of the volumes and handed them over to the customer. After a long period of low-level analysis and repeated testing, the data was restored within the time the customer required. The whole recovery process took two days. We were able to recover the data this quickly because we had studied the storage principle of the HP-P4500 beforehand. Once you understand how the HP-P4500 stores data, data disasters involving it become recoverable.
