How to analyze the main events in AWR 04/19 Update SLTechnology News&Howtos

How to analyze the main events in AWR

2025-04-19 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Servers >

Shulou(Shulou.com)06/01 Report--

This article shows you how to analyze the major events in AWR, which is concise and easy to understand, which will definitely brighten your eyes. I hope you can get something through the detailed introduction of this article.

Many DBA now read and analyze the STATSPACK/AWR report, but this report is not easy to read. The average DBA can at most look at the front TOP EVENTS, the hit rate, and some of the ADVISORY behind, but in fact, more secrets of the system exist in the InstanceActivity Stats that looks very raw. But for these STATS, most DBA feel very helpless, there is no official information to disclose the meaning of these STATS, some DBA after years of accumulation, can literally interpret some indicators. Over the years, Lao Bai has read thousands of STATSPACK/AWR reports. After a long period of accumulation, Lao Bai has summed up some experience in reading STATS, which will be shared with you in this section.

It is not normal for DBA to ask Lao Bai how big the index is. In fact, because the hardware configuration, application and periodic rhythm of each system are different, most of the database indicators do not have a "normal value". Of course, for those old DBAs who have come into contact with a large number of systems, they can see that some indicators are not normal at first glance, but for most DBA, seeing these indicators is like reading a book of heaven. Even if they know the meaning of these indicators, they cannot use these indicators to analyze database problems. Here, Lao Bai can tell you a very useful method.

This method is actually very simple, which is the method of baseline comparison that Lao Bai often says. For a system, the longer you take, the more you understand the temper of the system, the better you will be able to deal with problems with the system. And a new system, many DBA may feel as if there is no way to start, it is difficult to figure it out. What is the reason for this? In fact, a system that you have a lot of contact with, you have established a lot of baseline indicators for the system in your mind, so that if something goes wrong with the system, you can easily find the problem by comparing it with the usual. In fact, you have used the baseline comparison method mentioned by Lao Bai in your actual work. The basis of the baseline comparison method is to collect the information of a system for a period of time, establish a baseline of its important system indicators, and then compare the system state data with the baseline data when something goes wrong, so as to find the problem. When DBA analyzes AWR reports, the best way is to compare the average, reasonably high and low values of these indicators in the system, rather than looking for problems in a single report in isolation.

Another point to note is that the indicators in the AWR report are interrelated. When analyzing these indicators, you need to make a comprehensive analysis and compare these indicators with other data in order to get more accurate analysis results. For example, if you see from some indicators that there may be problems with DB CACHE, then you need to compare the DB CACHE hit rate in the report head. As well as the average waiting time in the indicators such as db file sequential read and db file scattered read in the event details, if the hit rate of DB CACHE is high, but the average waiting time of db file sequential read is large, then it cannot be said that there is no problem with DB CACHE. We can continue to analyze whether the configuration of BUFFER CACHE is unreasonable and how to optimize it through the following information such as Buffer Pool Statistics.

The attached table is a description of the main indicators in the AWR/STATSPACK report, which can be used as a reference for DBA. It is not necessary to study every index carefully or recite the contents of this table. Most of the contents in this table come from Lao Bai's study of the STATSPACK report over the years, and it does not come from the official statement, so part of the description may not be very accurate. If you have any questions about Lao Bai's explanation of these indicators, welcome to www.oraclefans.cn and discuss with Lao Bai.

Name

Annotation

CPU used by this session

Count the number of CPU time slices from the beginning to the end of the CALL. Each count represents a CPU cycle, that is, 10 milliseconds. However, if a CALL is executed in less than one CPU cycle, then the statistical value of START TIME and END TIME are the same, which will be calculated as 0. 5%. This count basically represents the CPU resources consumed by the ORACLE database, but note that the unit of calculation is cs, multiplied by 10 can be converted into milliseconds. For example, the average metric value per second is 782.1, which means that ORACLE consumed 7821 milliseconds of CPU time. If the system is a system with 16 CPU, then this indicator can indicate that ORACLE consumes more than 50% of CPU resources. However, since some small CALL may not be counted because the time consumed is less than 10 milliseconds, the actual utilization may be slightly higher than that calculated by this value. Generally speaking, most CALL consume more than 10 milliseconds of CPU, so this value can basically reflect the cost of ORACLE to CPU resources.

CR blocks created

CURRENT blocks are cloned and used to create CR (consistent read) blocks. The main reason for being cloned is that BUFFER is occupied by incompatible schemas. If the CR BLOCKS CREATED value per unit time is relatively high, it means that some data blocks are modified and accessed frequently in the database. If these visits are concentrated on certain hot blocks, it may form a serious BUFFER BUSY WAITS, and in a RAC environment, it may also lead to global hot block conflicts. If this indicator is high, you should pay attention to BUFFER BUSY WAITS, CACHE BUFFER CHAINS latches, etc.

Current blocks converted for CR

A BUFFER of CURRENT generates CR before it is used

DBWR buffers scanned

When certain trigger conditions occur, DBWR will start scanning dirty blocks at the cold end of the LRU chain to form DBWR BATCH. This indicator counts the total number of BUFFER scanned by DBWR on LRU, including dirty blocks and clean blocks. Dividing this indicator by DBWR LRU SCANS is the number of data blocks searched per scan.

DBWR checkpoint buffers written

The number of dirty blocks written by dbwr during CHECKPOINTS. If this index is relatively high per unit time, it means that data blocks change frequently in the system.

DBWR free buffers found

The number of BUFFER found by FREE when DBWR scans BUFFER from the LRU chain, divided by DBWR MAKE FREE REQUESTS, is the average number of FREE BUFFER found by scanning the LRU chain each time DBWR receives a DBWR MAKE FREE message, which is generally less.

DBWR make free requests

The number of MAKE FREE messages received by DBWR, if a foreground process cannot find it while looking for an idle BUFFER or some other trigger condition, it will send a MAKE FREE message to DBWR. If this value is high in unit time, it indicates that DB CACHE may be insufficient.

DBWR summed scan depth

When DBWR scans the LRU chain for dirty blocks, the number of BUFFER it looks for. The higher this number, the less the number of stolen blocks at the end of the LRU chain. Starting from Oracle 8i, because the algorithm of LRU chain has changed, if there are more hot blocks at the end of LRU chain, this index may also be larger.

DBWR timeout

If the DBWR IDLE exceeds a specific value, the indicator will be incremented by 1. If the value is high, it means that the data in BUFFER CACHE changes little and the number of dirty blocks that need to be written to disk is very small.

DBWR transaction table writes

The number of rollback segment headers written by DBWR, a high indicator indicating that more hot blocks are being written, while a large number of user processes are waiting for these blocks to be written

DBWR undo block writes

Number of blocks written by DBWR to the rollback segment

DDL statements parallelized

Count of concurrent execution of DDL operations

DML statements parallelized

Count of concurrent execution of DML operations

Background checkpoints completed

The number of CHECKPOINTS completed by the background process.

Background checkpoints started

The number of CHECKPOINTS started by the background process may be higher than the value of the previous state. If a new CHECKPOINT overrides an unfinished CHECKPOINT or CHECKPOINT is still being executed. This state contains only the CHECKPOINT of REDO, not other types of CHECKPOINT, such as OFFLINE files or BEGIN BACKUP or ALTER SYSTEM CHECKPOINT LOCAL commands.

Branch node splits

Due to the number of index branch node splits caused by inserting data, the higher index indicates that there are more branch node splits in the index at present, and it is possible that a certain index field on a table changes very frequently. This frequent change may have a greater impact on the performance of an application.

BUFFER DEADLOCK

Count the number of DB CACHE deadlocks. If the index is high per unit time, there may be performance problems in DB CACHE or some BUG.

Buffer is not pinned count

When accessing a BUFFER, the number of BUFFER that has been released is only used for Oracle internal debugging and does not indicate a performance problem.

Buffer is pinned count

When visiting a BUFFER, the BUFFER is already occupied by the PIN. If this index is relatively high per unit time, it indicates that there may be hot blocks.

Change write time

The time at which the current block change is written to the REDO (in cs,10 milliseconds). This indicator is generally not too large. If it is too large, it needs to be analyzed.

Commit cleanout failures: block lost

Count the number of times that the correct block cannot be found when you are ready to do a block cleanup operation in commit.

Commit cleanout failures: buffer being written

When Oracle is in COMMIT, when it clears the BUFFER, it is found that the BUFFER is being written by other sessions. If the index is high, it may indicate the existence of hot blocks.

Commit cleanout failures: callback failure

When Oracle is in COMMIT, call the CALLBACK function to return the count of FALSE when doing the clear operation.

Commit cleanout failures: cannot pin

When Oracle is in COMMIT, it is impossible to PIN the count of this BUFFER when performing the cleanup operation. It is possible that the BUFFER was held by other session PIN when preparing to clean up. If this index is high, you can check the DB CACHE-related situation, including DB CACHE size, hit rate, related latch hit rate, and hot block contention.

Commit cleanout failures: hot backup in progress

When Oracle is in COMMIT, he discovers that he is doing a hot backup when he is doing a clean operation. At this time, before modifying the BUFFER, the BUFFER must be written to the LOG BUFFER before it is modified, to ensure that the block fracture will not occur when the database is restored.

Commit cleanout failures: write disabled

When Oracle was in COMMIT, he found that the write operation of the database was temporarily turned off when he was doing the purge operation. This situation is rare.

Commit cleanouts

The count of block cleanup when doing COMMIT, whether it is successful or not, the counter will be incremented by 1.

Commit cleanouts successfully completed

The count of CLEANOUTS completed successfully when COMMIT. According to the reference of this index and the previous index, the two indicators should be relatively close (this index is slightly lower). If the difference between the two indicators is too large, it is necessary to analyze whether there is a problem that DB CACHE is too small, or whether large tables are often modified in application.

Consistent changes

Data blocks submit the count that UNDO information becomes CR blocks. This indicator indicates the number of CR blocks generated in the system. The larger this indicator, the more attention should be paid to latches such as CACHE BUFFER CHAINS, and the impact of hot blocks on system performance.

Consistent gets

The count of consistent reads, a request made by a session for a consistent read to a block of data. This indicator should not be confused with consistent changes. After a CR block is generated, it may be used by consistent gets multiple times, so this indicator is much larger than the previous one.

Data blocks consistent reads-undo records applied

Read data from UNDO to form CR READ. This counter records the number of UNDO records obtained from UNDO. If the value of this index is large, it means that queries and other operations for some tables that are modified frequently are also frequent, and there may be hot tables and indexes.

Deferred (CURRENT) block cleanout applications

Do a count of delayed block cleanup operations. When the data block is submitted, due to some reasons, some data blocks cannot do the block cleanup work immediately. In this case, the data block will do delayed block cleanup, which may be done the next time the data block is queried. This situation also causes us to sometimes see that we also generate a lot of REDO when we do SELECT operations.

Dirty buffers inspected

When a session starts to look for free data blocks at the cold end of the LRU chain and finds a dirty block, this indicator will be increased. If the index is larger per unit time, it means that there are more dirty blocks at the cold end of the LRU chain. There are several possibilities for this to happen:

L the number of stolen blocks in the system is very large, and the write speed of DBWR is insufficient, so that DBWR cannot write these dirty blocks to the hard disk as soon as possible.

L part BUFFER is very hot and is changed very frequently, resulting in a large number of such blocks at the end of the LRU chain.

L this system is a DML-based system, and the data blocks change frequently.

In this case, you can pay attention to the performance of dbwr, as well as the hit rate of DB CACHE and latches such as cache buffer chains.

Enqueue waits

Wait for the count of various locks

Exchange deadlocks

The count of internal deadlocks that occur when two BUFFER exchanges are made. Index scanning is the only factor leading to this exchange. If the index is high, you can check whether there is a very hot index (which can be located by BUFFER BUSY WAITS analysis).

Free buffer inspected

The number of BUFFER skipped when scanning reusable BUFFER from the tail of the LRU queue.

Global cache freelist waits

The wait caused by all lock element being used when ping a buffer.

Global lock convert time

Synchronous global lock conversion time (in 10ms). A higher indicator indicates that the global lock conflict is more serious, and the performance of cluster interconnect needs to be checked.

Hot buffers moved to head of LRU

When a hot block reaches the end of the LRU queue, ORACLE automatically moves the block to the head of the LRU queue so that it can continue to be used. Every time such an operation occurs, the count is incremented by one, and it is worth noting that LRU's algorithm has changed since 8i, introducing the TCH count to determine the hot block, rather than by moving the hot block on the LRU chain to ensure that the hot block is not swapped out prematurely. If the hot block exists at the tail of the LRU chain, if the hot block is found during the scan, it will be skipped actively, thus ensuring that the hot block will not be reused prematurely.

Immediate (CURRENT) block cleanout applications

As soon as the BUFFER (GET BUFFER) is obtained, the count of the record cleanup operation is performed.

Leaf node splits

When INSERT occurs, the number of times that the index leaf node splits, generally speaking, for systems with more frequent insertions, this index is generally higher, unless there is a serious phenomenon of hot blocks of leaf nodes, it is generally not necessary to pay special attention to this index.

Logons current

Count of the current login database

Opened cursors current

Current number of Open CURSORS

Opens of replaced files

The count of files reopened because they are not in the open file buffer. Each Oracle session has a file open buffer to keep a handle to a partially open data file to avoid repeatedly opening the file

Parse count (hard)

The statistical value of hard analysis, which needs to be compared with parse count (total) to see the proportion of hard analysis.

Parse count (soft)

Statistics of soft analysis

Parse count (total)

Total number of parse call that occurred

Parse time cpu

Statistics of CPU consumed by PARSE (in 10 milliseconds)

Parse time elapsed

The duration of the PASE, in 10 milliseconds. This metric minus parse time cpu is the waiting time in parse. If the proportion of parse time cpu to total parse time elapsed is low, it means that the waiting time in parse is too long, and there may be performance problems in the shared pool, which need to be analyzed.

Physical reads direct

The number of direct physical reads. Read without going through BUFFER. Generally, this kind of operation occurs: sort operation, SLAVE of parallel query operation, or pre-read.

Physical writes direct

The number of direct writes, without going through BUFFER. Generally speaking, this operation occurs in the following situations:

L direct mount operation, such as CREATE TABLE AS SELECT

L parallel DML operation

Temporary tablespace writes in sort operations

L write unbuffered LOB fields

Physical writes non checkpoint

Physical writes that are not caused by CHECKPOINT. The occurrence of physical writes includes CHECKPOINT or there is not enough free BUFFER available, or DBWR timeout, and so on. In general, this indicator will exceed half of physical writes, unless it is a system with very frequent CHECKPOINT. If the proportion of this index in physical writes is relatively small, it should be analyzed.

Pinned buffers inspected

When a user process scans the REPLACEMENT list for a reusable BUFFER, it finds that a cold block has been PIN or there is a wait event for a PIN request. This rarely happens because cold blocks are rarely PIN. If the average value of this indicator per second is large, it needs to be analyzed.

Queries parallelized

Statistics on the number of parallel query execution

Recursive cpu usage

CUP time for non-user calls

Recovery array read time

The time consumed by the IO generated when doing recovery

Recovery array reads

The number of IO generated when doing RECOVERY

Recovery blocks read

The number of data blocks read when doing recovery

Redo entries

This count increases when a redo message is copied to log buffer.

Redo entries linearized

The number of redo entries less than or equal to REDO_ENTRY_PREBUILD_THRESHOLD, which can be generated concurrently without being restricted by latches, but will increase the consumption of CPU. In a multi-CPU system, this value is higher.

Redo buffer allocation retries

The number of retries when applying for REDO BUFFER. Generally speaking, the retry occurs because the REDO WRITER has not been completed or the log switch is in progress. If this indicator is high, it means that the frequency of REDO LOG generation is very high, and LGWR cannot refresh LOG BUFFER in time. You can consider increasing the size of LOG BUFFER. LOG BUFFER can generally be a few megabytes to tens of megabytes, but because BUG exists in many database versions, it is not recommended to set the LOG BUFFER too large. Generally speaking, 30-40m is sufficient for most systems.

If it is due to waiting for a log switch, then the problems that may exist include:

The l REDO LOG file is too small

Poor performance of l REDO LOG IO

Poor IO performance of data files leads to slow DBWR writes

L the number of DBWR is too small, resulting in insufficient write performance of DBWR

Redo log space requests

The current ACTIVE log file is full, and Oracle must wait for the log switch to complete before allocating REDO LOG disk space. If the size of the SGA does not match the size of the log file, and the COMMIT in the system is very frequent. When the log is switched, DBWR needs to write the submitted dirty blocks to disk, and the log switch cannot be completed until these dirty blocks are written to disk. In this case, the statistical value will be relatively high.

If this statistical value is high, it is recommended to check the following aspects:

Is there a problem with the IO performance of l REDO LOG files? if the average time of log file parallel write is large, it means that there is a problem with the write performance of REDO LOG. It is recommended to put REDO LOG on a disk with better performance, or store REDO LOG files independently to avoid IO conflicts.

Is the size of l LOG BUFFER too small?

Submission in application software may be too frequent. Batch submission is recommended instead of submitting every record.

Redo log space wait time

The waiting time of Redo log space requests (in centiseconds). This metric needs to be viewed together with the previous indicator.

Redo size

The size of all generated redo in bytes

Redo synch time

Time elapsed by Redo synch calls in 10 milliseconds (centiseconds)

Redo sync writes

Generally speaking, after the redo information is generated, it will be copied to the log buffer, and it does not need to be written to the redo log file immediately. Lgwr will write the data to the redo log file periodically. However, if the transaction commits, the redo information must be written to the REDO LOG file immediately, and the redo sync writes counter will be incremented.

Redo wastage

The LOG BUFFER free space calculated when LOG BUFFER data is written to the REDO LOG file

Redo writer latching time

LGWR gets the time (in centiseconds) of each COPY LATCH. If there is a problem with this metric, you need to check whether the IO performance of the REDO LOG file and the size of the LOG BUFFER are sufficient. This indicator is meaningful only when LOG_SIMULTANEOUS_COPIES > 0.

Redo writes

Count of LOG BUFFER written to REDO LOG files by LGWR

Remote instance undo block writes

If a remote instance needs to read a UNDO BLOCK and needs the instance to write back the "dirty" UNDO BLOCK first, the counter will be incremented.

Remote instance undo header writes

Similar to the previous indicator, except that it is written as UNDO HEADER

Remote instance undo requests

Due to the number of UNDO requests from remote instances for CR, if this indicator is large, it means that some data blocks in RAC are often shared among instances, and the modified data of one instance is being used by other instances. In this case, you need to pay attention to the performance of CLUSTER INTERCONNECT.

Rollback changes-undo records applied

When the user needs ROLLBACK, the number of UNDO records submitted. When the transaction needs to be rolled back, the data needs to be taken out of the UNDO and submitted to the modified data. This count is related to the number of ROLLBACK in the system and the number of records per ROLLBACK.

Rollbacks only-consistent read gets

When the user needs to CR READ, the number of UNDO records submitted, no BUFFER CLEAR operation occurs at this time. When doing consistent reading, you also need to take the relevant data from the UNDO, generate a CR BLOCK, and submit the data to the CR BLOCK, and this indicator will be increased.

Rows fetched via callback

The number of records returned by the CALLBACK function. This statistic is only used for internal DEBUG.

Session cursor cache count

The count of the session CURSOR buffer is meaningful only if session_cached_cursors is greater than 0. If this value is close to or even reaches the value of session_cached_cursors in the parameter, then this parameter may need to be increased.

Session cursor cache hits

This indicator is meaningful only if session_cached_cursors is greater than 0. The session finds the count of a certain SQL directly from SESSION CURSOR CACHE. This value shows the effect of CURSOR CACHE, which can be compared with parse count (total).

Sorts (disk)

The number of disk sorts. If the average value of this metric per second is high, you need to check whether the PGA_AGGREGATE_TARGET parameter is set too low, or whether * _ AREA_SIZE is set too low (PGA manual management mode). When checking this indicator, you should also check the IO performance of relevant files in TEMP tablespaces. If the IO performance of TEMP tablespace files is insufficient, you need to increase the configuration of PGA to reduce hard disk sorting.

Sorts (memory)

Statistics of memory sorting

Summed dirty queue length

Each time a write request occurs, the sum of the length of the dirty data block linked list divided by the number of write requests. The greater the statistical value, the more dirty blocks need to be written back in the system.

Switch current to new buffer

Move the current BUFFER to another new BUFFER, and the original BUFFER becomes a CR BLOCK, the number of times this operation

Table fetch by rowid

Accessing a table through ROWID is usually done through an index, and in another case, SQL accesses a table directly through rowid.

Table fetch continued row

If you access a row of data and you need to access multiple BLOCK, this counter is incremented. One of the main reasons for this is row chain and row migration. If the PCT_FREE setting of a table is unreasonable, it may lead to row migration during UPDATE, so this count will be increased.

The second possibility is that the length of a row count is too large for BLOCK SIZE, so there is a greater chance that a row of data will be stored in multiple BLOCK, in which case the table should be stored in a larger BLOCK SIZE table space.

Another possibility is that there are more visits with LOB fields in the system, because large LOB fields are generally stored in a separate SEGMENT, so accessing this kind of data will also increase this statistical value.

Total file opens

Number of files opened by INSTANCE (including data files, control files, REDO LOG, etc.)

Transaction rollbacks

Total number of transactions that were successfully rolled back

Write clones created in background

If the current BUFFER is being written, the background or foreground process clones a new BUFFER so that the writing of the original BUFFER can continue

Enqueue conversions

Conversion statistics for table or row locks

Enqueue deadlocks

Deadlock statistics

Enqueue releases

Lock release statistics

Enqueue requests

Lock application statistics

Enqueue timeouts

Lock application timeout statistics

Enqueue waits

Lock waiting statistics

The above is how to analyze the major events in AWR. Have you learned any knowledge or skills? If you want to learn more skills or enrich your knowledge reserve, you are welcome to follow the industry information channel.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.