Handling the RMAN-06054 Error During an RMAN DUPLICATE Database Restore

2025-02-23 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/01

Recently, to prepare for a major change in production, I needed to restore the production database backup to another machine every day for testing. I decided to use DUPLICATE, which is simple and convenient: set up the directories in advance, and a single command restores the database. I wrote a restore script, tested it, and it ran normally in production. However, when the database had to be restored in the test environment, the same script failed with RMAN-06054. Oddly enough, the same backup had already been restored successfully in another production environment. Here is how the problem was tracked down.

First, the error report:

According to the error, recovery needs the archived log for thread 1, sequence 36615, but the database's current archive sequence is already above 300,000, so it is clearly asking for an archived log from long ago. I searched MOS for articles on DUPLICATE and RMAN-06054. There are not many, and they attribute the error to a bug. Had I really run into a bug? Yet this same backup had been restored successfully in other environments, so why would it fail here? A quick comparison of the logs from the two runs showed one difference: the failed log contains "creating datafile" entries that the successful log does not. That looked odd, but I did not know why it happened. It turned out to be the key clue. Had I investigated from that point, I would probably have found the cause quickly. Instead it took two days. The following figure shows the difference:
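The log comparison described above can be sketched in a few lines of Python. This is only an illustration: the function name `lines_only_in` is hypothetical, and the two log excerpts are made up for the example (real RMAN output is much longer and formatted differently). Only the phrase "creating datafile" comes from the actual logs.

```python
def lines_only_in(failed_log: str, good_log: str,
                  keyword: str = "creating datafile") -> list[str]:
    """Return lines containing `keyword` that appear in the failed log
    but not in the successful one (case-insensitive comparison)."""
    good = {line.strip().lower() for line in good_log.splitlines()}
    return [
        line.strip()
        for line in failed_log.splitlines()
        if keyword in line.lower() and line.strip().lower() not in good
    ]


# Made-up excerpts for illustration only; not real RMAN output.
failed = """restoring datafile 00011
creating datafile file number=12
restoring datafile 00013"""

good = """restoring datafile 00011
restoring datafile 00012
restoring datafile 00013"""

suspects = lines_only_in(failed, good)
print(suspects)
```

Filtering on a keyword keeps the noise down, since timestamps and channel names usually differ between two runs even when nothing is wrong.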

Let's continue troubleshooting. Since DUPLICATE could not complete the recovery automatically, what happens with a manual RECOVER? See the following figure:

The manual RECOVER failed as well, still asking for the archived log with sequence 36615. Since RECOVER did not work, I tried OPEN RESETLOGS. A word of advice: this is a test environment, where you can experiment freely. In a production environment, treat the system with respect so that you do not make things worse. OPEN RESETLOGS also failed:

I checked the archived log backups in the backup set. Their sequences are all above 300,000, so sequence 36615 is indeed not among them. That looked like a dead end. How could the database be recovered?

What now? The testers were still waiting for the database. Was it a DUPLICATE bug, or was I doing something wrong? I ran it again, only to get the same error after a two-hour wait.

If DUPLICATE would not work, could I run RESTORE and RECOVER manually instead? The idea was appealing, but reality disagreed: it still failed. So what was the problem?

I calmed down and thought it over. RECOVER DATABASE wanted an archived log from long ago. Was the backup itself damaged, so that the restored files were bad? I checked the backup files again with VALIDATE and found no problems. Had something gone wrong during the transfer? I compared md5 checksums of the files on both machines, and they matched. So what exactly was wrong?
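For reference, the checksum comparison can be sketched in Python as below. This is a minimal sketch, not the procedure actually used: the helper names `md5_of` and `mismatched` and the demo file name are hypothetical, and in practice the two checksum maps would be collected separately on each machine.

```python
import hashlib
import tempfile
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the md5 of a file in chunks, so large backup pieces
    are not loaded into memory at once."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched(local: dict[str, str], remote: dict[str, str]) -> list[str]:
    """Names present on both sides whose checksums differ."""
    return sorted(n for n in local.keys() & remote.keys()
                  if local[n] != remote[n])


# Tiny demo with a throwaway file (hypothetical name and content).
with tempfile.TemporaryDirectory() as tmp:
    piece = Path(tmp, "piece_01.bkp")
    piece.write_bytes(b"abc")
    demo_sum = md5_of(piece)

diff = mismatched({"a": "1", "b": "2"}, {"a": "1", "b": "3", "c": "4"})
print(demo_sum, diff)
```

Note that `mismatched` only looks at names that exist in both maps, so a file that is absent on one side is not reported by this kind of check.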

Then it occurred to me that the SCN of each file can be looked up in the data dictionary. Could that answer the question? Here are the query results:

The SCNs reported by v$datafile are several orders of magnitude larger than the SCN in the error message. So was that not the problem? Then I realized that v$datafile shows what is recorded in the control file, and since the control file was restored from the backup, it holds relatively recent SCNs. To see the SCN actually stored in each file, I turned to v$datafile_header, and there I finally found a clue:

The figure above shows that the SCNs of some files are far smaller than the rest, so those must be the problem data files. For file number 12 the SCN is 22575491, which matches the SCN in the RMAN error. That explains it: some data files were not restored properly, so recovery needs much earlier archived logs. Those logs were deleted long ago, so RECOVER cannot proceed.
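Spotting the lagging files by eye works for a handful of data files. For a larger database the same check could be scripted roughly as follows. This is a sketch under assumptions: the input is a map of file number to header SCN (as one might export from v$datafile_header), the function name `lagging_files` and the 0.5 threshold are arbitrary choices, and only the SCN for file 12 comes from the article. The other values are placeholders.

```python
from statistics import median


def lagging_files(header_scns: dict[int, int], ratio: float = 0.5) -> list[int]:
    """Return file numbers whose header SCN is below ratio * median SCN."""
    cutoff = median(header_scns.values()) * ratio
    return sorted(f for f, scn in header_scns.items() if scn < cutoff)


# file# -> SCN read from the file header.
# Only the value for file 12 is from the article; the rest are placeholders.
sample = {
    1: 9_000_000_000,
    2: 9_000_000_000,
    11: 9_000_000_000,
    12: 22_575_491,
}

print(lagging_files(sample))
```

Using the median as the baseline means a few stale files do not drag the cutoff down, as long as most files were restored correctly.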

With the cause located, re-restoring the bad files should have fixed it. Again, reality disagreed. RESTORE DATAFILE printed a "creating datafile" message, just like the anomaly noticed at the beginning, and querying v$datafile_header afterward showed the same problem. Having come this far, how could it be solved?

Listing the backups showed that a backup containing datafile 12 does exist.

However, restoring with the FULLBACKUP tag raised a new error: "no backup or copy of datafile found to restore". This was strange. The backup showed up when listed, but not at restore time. Was it a ghost?

Out of ideas, I went back to the log of the successful run and made a major discovery: the backup of datafile 12 was in a backup file dated 20181218.

Now it made sense. When the backup files were transferred, a colleague assumed that the files dated 20181219 were the complete backup and left out the 10 backup files dated 20181218. I had then tried many times to restore the database from a backup that was missing 10 files. It is a little funny in hindsight. As I said at the beginning, had I compared the restore logs carefully when I first noticed the anomaly, I would not have spent so much time trying to restore from an incomplete backup.
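A completeness check on the transferred files would have caught this early. The sketch below compares file names between the source and destination backup directories. The function name `missing_pieces`, the directory layout, and the file names are all hypothetical and only loosely modeled on the dates mentioned above.

```python
import tempfile
from pathlib import Path


def missing_pieces(source_dir: Path, dest_dir: Path) -> list[str]:
    """Return names of files present in source_dir but absent from dest_dir."""
    src = {p.name for p in source_dir.iterdir() if p.is_file()}
    dst = {p.name for p in dest_dir.iterdir() if p.is_file()}
    return sorted(src - dst)


# Demo with throwaway directories; the file names are invented.
with tempfile.TemporaryDirectory() as tmp:
    src, dst = Path(tmp, "src"), Path(tmp, "dst")
    src.mkdir()
    dst.mkdir()
    for name in ["full_20181218_01.bkp", "full_20181219_01.bkp"]:
        (src / name).touch()
    (dst / "full_20181219_01.bkp").touch()  # only the later file was copied
    missing = missing_pieces(src, dst)

print(missing)
```

A name-level comparison like this complements a checksum comparison: checksums confirm that copied files are intact, while the set difference confirms that nothing was left behind.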

After the missing backup files were transferred, DUPLICATE completed the restore successfully.

The problem is solved. As a final reminder to myself: be more careful, and above all, treat production with respect.
