Example Analysis of an NBU (NetBackup) Backup Error

2025-01-18 Update From: SLTechnology News&Howtos (Shulou.com)

This article walks through the analysis of a NetBackup (NBU) backup failure on an Oracle database, from the first RMAN error stack to the underlying cause.

A routine system check found that the daily backup had failed.

The error message is:

RMAN> backup incremental level 0 database;

Starting backup at 10-MAR-08

using target database controlfile instead of recovery catalog

allocated channel: ORA_SBT_TAPE_1

channel ORA_SBT_TAPE_1: sid=120 devtype=SBT_TAPE

channel ORA_SBT_TAPE_1: VERITAS NetBackup for Oracle - Release 5.0GA (2003103006)

channel ORA_SBT_TAPE_1: starting incremental level 0 datafile backupset

channel ORA_SBT_TAPE_1: specifying datafile(s) in backupset

input datafile fno=00001 name=/dev/vx/rdsk/maindbdg/lv_main00

input datafile fno=00008 name=/opt/oracle/oradata/oradata/bjdb01/users01.dbf

input datafile fno=00039 name=/opt/oracle/oradata/oradata/bjdb01/xdb02.dbf

input datafile fno=00009 name=/opt/oracle/oradata/oradata/bjdb01/xdb01.dbf

input datafile fno=00003 name=/opt/oracle/oradata/oradata/bjdb01/cwmlite01.dbf

input datafile fno=00004 name=/opt/oracle/oradata/oradata/bjdb01/drsys01.dbf

input datafile fno=00006 name=/opt/oracle/oradata/oradata/bjdb01/odm01.dbf

input datafile fno=00007 name=/opt/oracle/oradata/oradata/bjdb01/tools01.dbf

channel ORA_SBT_TAPE_1: starting piece 1 at 10-MAR-08

RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================

RMAN-03009: failure of backup command on ORA_SBT_TAPE_1 channel at 03/10/2008 11:31:12

ORA-19506: failed to create sequential file, name="tpjatl1b_1_1", parms=""

ORA-27028: skgfqcre: sbtbackup returned error

ORA-19511: Error received from media manager layer, error text:

VxBSACreateObject: Failed with error:

Server Status: unable to allocate new media for backup, storage unit has none available
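An error stack like this mixes Oracle-side codes with text passed up from the media manager, and it helps to separate the two when deciding which layer to investigate. The following is a minimal Python sketch of that split, not part of the original analysis. The function name `parse_error_stack` is hypothetical, and the rule that uncoded lines belong to the media manager is an assumption based on the layout of this one stack.

```python
import re

# Matches coded stack entries such as "ORA-19506: failed to create sequential file".
CODE_LINE = re.compile(r"^(RMAN|ORA)-(\d{5}):\s*(.*)$")


def parse_error_stack(text):
    """Split an RMAN error stack into coded entries and media manager text.

    Assumption: lines without an RMAN-/ORA- prefix that follow the coded
    entries were passed up from the media manager layer.
    """
    codes, media_manager = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = CODE_LINE.match(line)
        if match:
            prefix, number, message = match.groups()
            # Skip the banner lines (rows of '=' and the stack header).
            if message.strip("= ") and "ERROR MESSAGE STACK" not in message:
                codes.append((f"{prefix}-{number}", message))
        elif codes:
            media_manager.append(line)
    return codes, media_manager


sample = """\
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_SBT_TAPE_1 channel at 03/10/2008 11:31:12
ORA-19506: failed to create sequential file, name="tpjatl1b_1_1", parms=""
ORA-27028: skgfqcre: sbtbackup returned error
ORA-19511: Error received from media manager layer, error text:
VxBSACreateObject: Failed with error:
Server Status: unable to allocate new media for backup, storage unit has none available
"""

codes, media_manager = parse_error_stack(sample)
for code, message in codes:
    print(code, "->", message)
print("media manager:", " | ".join(media_manager))
```

In this stack the Oracle codes only report that the SBT layer failed; the actionable detail is the final `Server Status` line from NetBackup.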

Judging from this error message, the failure appears to be caused by a lack of available space on the backup media. However, another backup failed with a different error message:

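RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================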

RMAN-03009: failure of backup command on ch00 channel at 03/10/2008 05:14:15

ORA-19502: write error on file "bk_26552_1_648968690", blockno 664577 (blocksize=512)

ORA-27030: skgfwrt: sbtwrite2 returned error

ORA-19511: Error received from media manager layer, error text:

VxBSASendData: Failed with error:

Server Status: Communication with the server has not been iniatated or the server status has not been retrieved from the server.

Judging from this error, the problem is not just a matter of space.

In the jnbSA graphical interface, many management options responded very slowly after being clicked, and most never returned a result. bpadm was therefore used from the command line instead, and the Problems report showed the following:

03/11/2008 01:45:04 backupcenter240 bpexpdate Could not build host list: client hostname could not be found

03/11/2008 02:13:34 backupcenter240 bjdb01 cannot write image to media id 000013, drive index 0, I/O error

03/11/2008 02:13:48 backupcenter240 bjdb01 backup by oracle on client bjdb01 using policy oracle: media write error

03/11/2008 02:14:04 backupcenter240 bjdb01 backup of client bjdb01 exited with status 6 (the backup failed to backup the requested files)

03/11/2008 02:22:58 backupcenter240 bjdb01 cannot write image to media id 000013, drive index 0, I/O error

03/11/2008 02:23:12 backupcenter240 bjdb01 backup by oracle on client bjdb01 using policy oracle: media write error

03/11/2008 02:23:19 backupcenter240 bjdb01 suspending further backup attempts for client bjdb01, policy oracle, schedule Cumulative-Inc because it has exceeded the configured number of tries

03/11/2008 02:23:19 backupcenter240 bjdb01 backup of client bjdb01 exited with status 6 (the backup failed to backup the requested files)

03/11/2008 02:23:20 backupcenter240 - scheduler exiting - the backup failed to backup the requested files (6)

03/11/2008 09:32:42 backupcenter240 data03 cannot write image to media id 000016, drive index 0, I/O error

03/11/2008 09:32:53 backupcenter240 data03 DOWN'ing drive index 0, it has had at least 3 errors in last 12 hour(s)

03/11/2008 09:32:55 backupcenter240 data03 backup by oracle on client data03 using policy bjdb03-ora: media write error

03/11/2008 09:33:02 backupcenter240 data03 backup of client data03 exited with status 6 (the backup failed to backup the requested files)

03/11/2008 10:48:34 backupcenter240 data03 media manager terminated during mount of media id 000016, possible media mount timeout

03/11/2008 10:48:36 backupcenter240 data03 media manager terminated by parent process

03/11/2008 10:48:37 backupcenter240 data03 backup by oracle on client data03 using policy bjdb03-ora: the backup failed to backup the requested files

03/11/2008 10:48:38 backupcenter240 data03 suspending further backup attempts for client data03, policy bjdb03-ora, schedule diff because it has exceeded the configured number of tries

03/11/2008 10:48:38 backupcenter240 data03 backup of client data03 exited with status 6 (the backup failed to backup the requested files)

03/11/2008 13:55:03 backupcenter240 bpexpdate Could not build host list: client hostname could not be found
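A report like this is easier to read once the entries are tallied by client and by kind of failure. The sketch below shows one way to do that in Python. It assumes each entry has the layout `MM/DD/YYYY HH:MM:SS server client message`, and the labels in `PATTERNS` are hypothetical names chosen for this illustration, not NetBackup terminology.

```python
import re
from collections import Counter

# Assumed entry layout: "MM/DD/YYYY HH:MM:SS <server> <client> <message>".
ENTRY = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) (\S+) (\S+) (.*)$")

# Hypothetical labels mapped to message fragments that appear in the report.
PATTERNS = {
    "media_write_error": "media write error",
    "mount_timeout": "possible media mount timeout",
    "retries_exceeded": "exceeded the configured number of tries",
}


def summarize(lines):
    """Count labelled problems per client; unparsable lines are ignored."""
    counts = Counter()
    for line in lines:
        match = ENTRY.match(line.strip())
        if not match:
            continue
        _timestamp, _server, client, message = match.groups()
        for label, fragment in PATTERNS.items():
            if fragment in message:
                counts[(client, label)] += 1
    return counts


report = [
    "03/11/2008 02:13:48 backupcenter240 bjdb01 backup by oracle on client bjdb01 using policy oracle: media write error",
    "03/11/2008 02:23:12 backupcenter240 bjdb01 backup by oracle on client bjdb01 using policy oracle: media write error",
    "03/11/2008 02:23:19 backupcenter240 bjdb01 suspending further backup attempts for client bjdb01, policy oracle, schedule Cumulative-Inc because it has exceeded the configured number of tries",
    "03/11/2008 09:32:55 backupcenter240 data03 backup by oracle on client data03 using policy bjdb03-ora: media write error",
    "03/11/2008 10:48:34 backupcenter240 data03 media manager terminated during mount of media id 000016, possible media mount timeout",
    "03/11/2008 10:48:38 backupcenter240 data03 suspending further backup attempts for client data03, policy bjdb03-ora, schedule diff because it has exceeded the configured number of tries",
]

counts = summarize(report)
for (client, label), n in sorted(counts.items()):
    print(client, label, n)
```

Both clients show write errors followed by exhausted retries, which suggests the fault lies on the backup server side rather than with a single database host.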

Querying the detailed log further revealed a large number of errors:

03/11/2008 18:23:59 backupcenter240 - cleaning job DB

03/11/2008 18:23:59 backupcenter240 - all drives are down for the specified robot number = 0, robot type = TLD and density = hcart

03/11/2008 18:23:59 backupcenter240 - no drives up on storage unit

03/11/2008 18:24:00 bjdb01 - all drives are down for the specified robot number = 0, robot type = TLD and density = hcart

03/11/2008 18:24:00 backupcenter240 - no drives up on storage unit

03/11/2008 18:24:31 backupcenter240 - all drives are down for the specified robot number = 0, robot type = TLD and density = hcart

03/11/2008 18:24:31 backupcenter240 - no drives up on storage unit

03/11/2008 18:24:32 backupcenter240 - all drives are down for the specified robot number = 0, robot type = TLD and density = hcart

03/11/2008 18:24:32 backupcenter240 - no drives up on storage unit

03/11/2008 18:24:32 backupcenter240 data03 skipping backup of client data03, policy bjdb03-ora, schedule diff because it has exceeded the configured number of tries

From this information, it appears that something is wrong with the tape library robot. If the robot really is at fault, that also explains why the two backups reported different errors. When a tape filled up, the robot tried to load a new tape and failed; the backup running at that moment could no longer write and reported insufficient space. Subsequent backups failed because, with the robot out of action, there was no tape available to write to, so they reported that communication with NetBackup had not been initiated.

Reviewing the media report next, the summary shows:

Number of ACTIVE media that, as of now:

There are no ACTIVE media present in the media database

This further supports the diagnosis: because the robot failed, available tapes could not be loaded into the drive, so the system has no usable media.

Check the device status with tpconfig:

Index DriveName              DrivePath                Type    Shared    Status

***** *********              **********               ****    ******    ******

  0   IBMULTRIUM-TD10        /dev/rmt/1cbn            hcart    Yes       DOWN

        TLD(0) Definition       DRIVE=1

Currently defined robotics are:

  TLD(0)     robotic path = /dev/sg/c2t4l1

Volume database host = backupcenter240

The reported status is DOWN, so the cause seems to be basically identified.
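When this check has to be repeated, for example while waiting for a device to recover, the Status column can be read programmatically. The following Python sketch parses captured tpconfig output. It assumes drive rows are the only lines with six whitespace-separated fields and a numeric first field, which holds for the listing in this article but is not guaranteed for other configurations; `parse_drives` is a hypothetical helper name.

```python
def parse_drives(output):
    """Return one dict per drive row found in a tpconfig-style listing.

    Assumption: a drive row has exactly six whitespace-separated fields
    and starts with a numeric drive index.
    """
    drives = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 6 and fields[0].isdigit():
            index, name, path, drive_type, shared, status = fields
            drives.append(
                {
                    "index": int(index),
                    "name": name,
                    "path": path,
                    "type": drive_type,
                    "shared": shared,
                    "status": status,
                }
            )
    return drives


captured = """\
Index DriveName              DrivePath                Type    Shared    Status
  0   IBMULTRIUM-TD10        /dev/rmt/1cbn            hcart    Yes       DOWN
        TLD(0) Definition       DRIVE=1
Currently defined robotics are:
  TLD(0)     robotic path = /dev/sg/c2t4l1
Volume database host = backupcenter240
"""

drives = parse_drives(captured)
down = [d for d in drives if d["status"] != "UP"]
for d in down:
    print(f"drive {d['index']} ({d['name']}, {d['path']}) is {d['status']}")
```

Header and robot definition lines are skipped because they either have a different field count or do not start with a number.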

Try to check the robot using robtest:

bash-2.03# robtest

Configured robots with local control supporting test utilities:

TLD(0) robotic path = /dev/sg/c2t4l1

Robot Selection

---------------

1) TLD 0

2) none/quit

Enter choice: 1

Robot selected: TLD(0) robotic path = /dev/sg/c2t4l1

Invoking robotic test utility:

/usr/openv/volmgr/bin/tldtest -r /dev/sg/c2t4l1 -d1 /dev/rmt/1cbn

Opening /dev/sg/c2t4l1

MODE_SENSE complete

Enter tld commands (? returns help information)

?

To exit the utility, type q or Q.

init - Initialize element status

initrange [#] - Init element status range

allow - Allow media removal

prevent - Prevent media removal

extend - Extend media access port

retract - Retract media access port

mode - Mode sense

m - Move medium

pos - Position to drive or slot

s [d|p|t|s [n]] [raw] - Read element status

inquiry - Display vendor and product ID

rezero - Rezero unit

inport - Ready inport (media access port)

debug - Toggle debug mode for this utility

test_ready - Send a TEST UNIT READY to the device

specifies drive (d#), slot (s#), media access port (p#) or transport (t#)

is drive #, slot #, media access port #, or transport #

[#] is number of elements for d, s, p, or t

NOTE - drive # is 1 - Number of drives

slot # is 1 - Number of slots

media access port # is 1 - Number of media access port elements

transport # is 1 - Number of transports

= (d)rive, (s)lot, media access (p)ort, or (t)ransport

unload - Issue SCSI unload

= d1 or 1, d2 or 2, d3 or 3 ... d648 or 648

inquiry

Inquiry_data: STK L40 0213

test_ready

Unit is ready

q

Robot Selection

---------------

1) TLD 0

2) none/quit

Enter choice:

After issuing the test_ready command and waiting for a while, the robot returned to normal:

Index DriveName              DrivePath                Type    Shared    Status

***** *********              **********               ****    ******    ******

  0   IBMULTRIUM-TD10        /dev/rmt/1cbn            hcart    Yes       UP

        TLD(0) Definition       DRIVE=1

Currently defined robotics are:

  TLD(0)     robotic path = /dev/sg/c2t4l1

Volume database host = backupcenter240

Next, try a backup:

$ rman target /

Recovery Manager: Release 9.2.0.4.0 - 64bit Production

Copyright (c) 1995, 2002, Oracle Corporation. All rights reserved.

connected to target database: BJDB01 (DBID=3255963758)

RMAN> backup current controlfile;

Starting backup at 11-MAR-08

using target database controlfile instead of recovery catalog

allocated channel: ORA_SBT_TAPE_1

channel ORA_SBT_TAPE_1: sid=19 devtype=SBT_TAPE

channel ORA_SBT_TAPE_1: VERITAS NetBackup for Oracle - Release 5.0GA (2003103006)

channel ORA_SBT_TAPE_1: starting full datafile backupset

channel ORA_SBT_TAPE_1: specifying datafile(s) in backupset

including current controlfile in backupset

channel ORA_SBT_TAPE_1: starting piece 1 at 11-MAR-08

channel ORA_SBT_TAPE_1: finished piece 1 at 11-MAR-08

piece handle=ttjb17ur_1_1 comment=API Version 2.0,MMS Version 5.0.0.0

channel ORA_SBT_TAPE_1: backup set complete, elapsed time: 00:04:56

Finished backup at 11-MAR-08

Starting Control File Autobackup at 11-MAR-08

piece handle=c-3255963758-20080311-00 comment=API Version 2.0,MMS Version 5.0.0.0

Finished Control File Autobackup at 11-MAR-08

The backup attempt was finally successful.

Unfortunately, only small backups succeed. As soon as the backup is large, the earlier error message appears again:

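RMAN-00571: ===========================================================

RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============

RMAN-00571: ===========================================================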

RMAN-03009: failure of backup command on ch00 channel at 03/10/2008 05:14:15

ORA-19502: write error on file "bk_26552_1_648968690", blockno 664577 (blocksize=512)

ORA-27030: skgfwrt: sbtwrite2 returned error

ORA-19511: Error received from media manager layer, error text:

VxBSASendData: Failed with error:

Server Status: Communication with the server has not been iniatated or the server status has not been retrieved from the server.

A large number of I/O error messages also appear in the backend log:

03/12/2008 09:42:51 backupcenter240 bjdb01 cannot write image to media id 000016, drive index 0, I/O error

03/12/2008 09:42:51 backupcenter240 bjdb01 FREEZING media id 000016, it has had at least 3 errors in the last 12 hour(s)

03/12/2008 09:43:08 backupcenter240 bjdb01 CLIENT bjdb01 POLICY oracle SCHED Default-Application-Backup EXIT STATUS 84 (media write error)

03/12/2008 09:43:08 backupcenter240 bjdb01 backup by oracle on client bjdb01: media write error
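One way to narrow down whether a single tape or the drive itself is the common factor is to group the write errors by drive index and list the media IDs involved. The Python sketch below does this for write-error entries from the two days of logs. The message wording in the regular expression is an assumption about how these lines read in the raw report, and the reading of the result is a heuristic, not a NetBackup rule.

```python
import re
from collections import defaultdict

# Assumed wording of a write-error entry in the NetBackup problems report.
WRITE_ERROR = re.compile(
    r"cannot write image to media id (\w+), drive index (\d+), I/O error"
)


def media_by_drive(lines):
    """Map each drive index to the sorted media IDs that failed on it."""
    seen = defaultdict(set)
    for line in lines:
        match = WRITE_ERROR.search(line)
        if match:
            media_id, drive_index = match.group(1), int(match.group(2))
            seen[drive_index].add(media_id)
    return {drive: sorted(ids) for drive, ids in seen.items()}


entries = [
    "03/11/2008 02:13:34 backupcenter240 bjdb01 cannot write image to media id 000013, drive index 0, I/O error",
    "03/11/2008 09:32:42 backupcenter240 data03 cannot write image to media id 000016, drive index 0, I/O error",
    "03/12/2008 09:42:51 backupcenter240 bjdb01 cannot write image to media id 000016, drive index 0, I/O error",
]

result = media_by_drive(entries)
for drive, media in result.items():
    print(f"drive index {drive}: write errors on media {', '.join(media)}")
```

Here two different tapes fail on the same drive index, from two different clients, which points away from a single bad tape and toward the drive or library hardware.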

At this point it is clearly not just a software problem. The vendor finally confirmed that the read/write head in the tape library was faulty, and replacing the part resolved the problem.
