Understanding of rollforward and rollback of oracle instance recovery 07/01 Update SLTechnology News&Howtos

Understanding of rollforward and rollback of oracle instance recovery

2025-07-01 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

There have always been some misunderstandings about the recovery of oracle instances. Today, by viewing the relevant materials and discussing with students, I found my own mistakes. The results are as follows:

Instance recovery: when the database is shut down abnormally (power outage or shu abort and other inconsistent shutdown), when you restart the database, the database-related processes automatically restore the instance without human intervention.

When do you need instance recovery?

Under shutdown normal or shutdown immediate, the so-called clean shutdown,checkpoint is also automatically triggered and the SCN record is written back. When checkpoint occurs, the SCN is written to four places:

There are three places in control file:

SYSTEM CHECKPOINT SCN

Datafile checkpoint SCN

Stop SCN: when instance consistency is turned off, update

One is in datafile header:

Start SCN

For a database that is consistent in the normal open state, the three SCN of SYSTEM CHECKPOINT SCN,Datafile checkpoint SCN and data file header Start SCN are consistent, and the stop scn stored in control file will be restored to null.

When Clean shutdown

When clean shutdown, checkpoint will proceed, and at this time, the stop scn of datafile and the start scn in the control file will be the same. When waiting for the open database, Oracle checks whether the start scn in datafile header and the scn of datafile stored in control file are the same. If so, then check whether start scn and stop scn are the same. If it is still the same, the database will open normally, otherwise recovery will be required.

When the database is opened, the stop scn stored in control file will return to null, which means that datafile is open in normal mode.

Abnormal shutdown

If the SHUTDOWN is not normal (shutdown abort), then after the mount database, you will find that stop scn is not equal to scn in other locations, but equal to NULL, which means that Oracle does not perform checkpoint during shutdown, and crash recovery (instance recovery) must be performed the next time you boot.

One thing to note:

When starting the database, if STOP SCN = NULL is found, it means that crash recovery is required

When starting the database, if you find that the START SCN of datafile header is not equal to the DATAFILE SCN stored in CONTROLFILE, it means that Media recovery is required.

Specific process of instance recovery

When the database crashes suddenly, and there is no time to flush the dirty data blocks in the buffer cache into the data file, and the running transaction is suddenly interrupted when the instance crashes, the transaction is in an intermediate state, that is, it is neither committed nor rolled back. At this point, the contents of the data file do not reflect the state of the instance when it crashes. The database closed in this way is inconsistent.

The next time the instance is started, Oracle will be automatically restored by the SMON process. When the instance starts, the SMON process checks the END SCN number of each online, readable and writable data file recorded in the control file.

During the normal operation of the database, the END SCN number is always NULL, and when the database is shut down normally, a full checkpoint will be carried out and the checkpoint SCN number will be updated to this field, so you can determine whether the instance recovery is needed by whether the END SCN number is null.

If the Oracle does not have time to update the field when it crashes, the field is still NULL. When the SMON process finds that the field is empty, it knows that the instance did not shut down properly last time, so the SMON process starts the instance recovery.

When the SMON process performs instance recovery, it obtains the checkpoint location from the control file. Therefore, the SMON process goes to the online log file, finds the checkpoint location, and then applies all the redo entries from that checkpoint location, thus restoring the state of the instance crash point in buffer cache. This process is called rollforward. After rolling forward, there are dirty blocks in buffer cache that were committed but not written to the data file at the time of the crash, and transactions are abruptly terminated, resulting in dirty blocks in transactions that are neither committed nor rolled back.

As soon as the roll forward is complete, the SMON process immediately opens the database. However, at this time, the database also contains those intermediate dirty blocks that are neither committed nor rolled back, which cannot exist in the database because they are not committed and must be rolled back. After opening the database, the SMON process rolls back in the background.

Sometimes, after the database is opened, a user process sends a request to read these blocks before the SMON process has time to roll back these intermediate blocks. At this time, the server process is responsible for rolling back these blocks before returning them to the user. After the rollback is complete, the contents of the data blocks are returned to the user.

Why is the instance recovery of the database rolled back before?

Rollback segments actually exist in the form of rollback tablespaces, and since they are tablespaces, there must be corresponding data files and image blocks in buffer cache, which is the same as data files in other tablespaces.

When a DML operation occurs, both REDO (REDO Entry for the DML operation itself) and UNDO (used to roll back the DML operation, recorded in the UNDO tablespace) are generated, but since the UNDO information is also stored in the rollback tablespace, the UNDO information corresponding to the DML operation (UNDO BLOCK in the BUFFER CACHE generation correspondence) will first generate its corresponding REDO information (UNDO BLOCK's REDO Entry) and write it to Log Buffer.

The reason for this is that the UNDO tablespace blocks in Buffer Cache may also be lost due to database failures. In order to ensure a smooth rollback at the next startup, we must first use the REDO log to recover the UNDO segment (in fact, it is to reply to the dirty data blocks in Buffer Cache first, and then write it to the UNDO segment by Checkpoint), and then use UNDO information to roll back after the database OPEN, to achieve consistency.

After the UNDO BLOCK's REDO Entry is generated, it is the REDO Entry corresponding to the DML statement. Finally, modify the Block in the Buffer Cache, and the Block becomes a dirty data block at the same time.

In fact, to put it simply, the purpose of REDO is to record all database changes, including UNDO tablespaces.

Total knot

The most important thing I know today is that rollforward is the application of redo to recover buffer cache data and restore buffer cache to its pre-crash state, so there are not only dirty blocks in buffer cache that were committed but not written to the data file at the time of crash, but also transactions that were abruptly terminated, resulting in dirty blocks of data that were neither committed nor rolled back (that is, no commit) But dbwr has flushed the changes to the underlying disk), another point is that there is an end scn in the control file, which is used to record the scn of the database file header when the database is normally shut down, and you can determine whether the scn is null or not.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.