1. What is AWR?
AWR (Automatic Workload Repository) is a performance collection and analysis tool introduced in Oracle 10g. It produces a report on how the entire system's resources were used over a period of time, from which you can understand the overall operation of the system. The generated report contains multiple sections.
Every hour, AWR samples the v$active_session_history view (the in-memory ASH buffer, which in theory holds about one hour of data), persists the samples to disk, and retains them for 7 days before old records are overwritten. The persisted samples are stored in wrh$_active_session_history (the ASH information written into the AWR repository, covering more than an hour). Both the sampling interval (1 hour) and the retention period (7 days) can be adjusted to suit the actual situation, which gives the DBA an effective system monitoring tool.
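If the defaults do not suit your system, both settings can be changed through the DBMS_WORKLOAD_REPOSITORY package. A minimal sketch (the interval and retention values here are illustrative; both arguments are in minutes):

BEGIN
  DBMS_WORKLOAD_REPOSITORY.MODIFY_SNAPSHOT_SETTINGS(
    interval  => 30,           -- take a snapshot every 30 minutes
    retention => 14 * 24 * 60  -- keep snapshots for 14 days
  );
END;
/

-- Verify the current settings
SELECT snap_interval, retention FROM dba_hist_wr_control;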
2. When is an AWR report used?
DBAs use AWR to monitor and understand the running state of the database. When a database bottleneck is found during testing but the specific cause cannot be located, an AWR report can be used to analyze and pinpoint it.
Database performance problems generally come from three places: I/O, memory, and CPU, and the three are closely related. Assuming none of them has a physical fault, as the I/O load increases, more memory is needed to hold the data, and the CPU must spend more time filtering it. Conversely, if the CPU is spending too much time, it may be parsing SQL statements or filtering too much data, which is not necessarily related to I/O or memory.
CPU: parses the SQL statement, tries multiple execution plans, and finally produces a plan the database considers good, though not necessarily optimal, because when too many tables are joined, the database does not exhaust every possible plan; that would take too long. And how does Oracle know which rows you want and which you do not? The CPU has to filter them.
Memory: SQL statements and execution plans are kept in memory for a period of time, and fetched data is also kept in memory as long as possible according to the LRU algorithm. Joins and sorts between tables during SQL execution also consume memory.
I/O: if the data you need is not in memory, it must be fetched from disk, which incurs physical I/O. When joins between tables produce too much data, or sorts and similar operations run out of memory, temporary tablespace is used, which also consumes physical I/O.
Note that the PGA allocated by Oracle generally accounts for only about 20% of memory. In dedicated server mode, per-statement work such as sorting and row filtering is done in the PGA, that is, within the roughly 20% of memory Oracle can allocate there. If multiple users run multi-table joins on large tables, with poorly chosen join conditions on top, memory becomes a bottleneck, so optimizing SQL to reduce logical and physical reads is very important.
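To see how much PGA the instance actually has and is using, you can query v$pgastat. A small sketch (these statistic names exist in current versions, but may vary slightly between releases):

SELECT name, ROUND(value / 1024 / 1024) AS mb
FROM   v$pgastat
WHERE  name IN ('aggregate PGA target parameter',
                'total PGA allocated',
                'maximum PGA allocated');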
3. How to generate an AWR report?
The first step: log in to the Oracle database server and note the current directory, or change to the directory where the AWR report should be saved.
The second step: connect to the Oracle database instance with sqlplus username/password@service_name.
The third step: at the SQL prompt, run @?/rdbms/admin/awrrpt.
The following types of AWR report can be generated; the most commonly used is the report for the current instance:
@?/rdbms/admin/awrrpt: AWR report for the current instance
@?/rdbms/admin/awrrpti: AWR report for a chosen instance number in a RAC
@?/rdbms/admin/awrddrpt: AWR comparison report between two periods
@?/rdbms/admin/awrgrpt: global AWR report for a RAC
Enter the format in which the AWR report should be generated: html.
Enter the number of days according to the actual situation (for example, 1 covers today; 2 covers today and yesterday, and so on going backwards).
Enter the begin and end snapshot values (the available snapshot IDs are listed after you enter the number of days).
Enter a name for the AWR report (any custom name), press Enter, and the report is produced automatically.
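Putting the steps together, a typical session looks roughly like this (the user name, service name, snapshot IDs, and report name are all illustrative):

sqlplus system/password@orcl

SQL> @?/rdbms/admin/awrrpt
Enter value for report_type: html
Enter value for num_days: 1
Enter value for begin_snap: 1234
Enter value for end_snap: 1236
Enter value for report_name: test_awr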
Note that snapshots can also be created manually; run the following command as needed:
exec dbms_workload_repository.create_snapshot; creates a snapshot by hand.
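For example, to take a snapshot by hand and then list the available snapshot IDs for the report (a sketch using the standard AWR views):

EXEC DBMS_WORKLOAD_REPOSITORY.CREATE_SNAPSHOT;

SELECT snap_id, begin_interval_time, end_interval_time
FROM   dba_hist_snapshot
ORDER  BY snap_id DESC;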
When it finishes, go to the directory you chose and find the AWR report file, here test_awr.lst.
Change the extension to .html, download it to a Windows machine, and open the AWR report with IE.
4. Analyze the AWR report
The AWR report is very rich; here we pick a small part of it to explain. Before analyzing an AWR report, we should first understand Oracle's hard parsing and soft parsing, starting with how Oracle processes a SQL statement. When you submit a SQL statement to Oracle, it goes through several steps before being executed and returning results:
1. Syntax checking (syntax check)
Checks whether the SQL statement conforms to the grammar.
2. Semantic checking (semantic check)
For example, checks whether the objects accessed in the SQL statement exist and whether the user has the corresponding permissions.
3. Parsing the SQL statement (parse)
An internal algorithm parses the SQL, generating a parse tree and an execution plan.
4. Executing the SQL and returning the result (execute and return)
Soft and hard parsing take place in the third step.
Oracle applies an internal hash algorithm to compute the hash value of the SQL statement, then looks in the library cache to see whether that hash value already exists.
If it exists, Oracle compares this SQL statement with the cached one.
If they are identical, the existing parse tree and execution plan are reused and the optimizer's work is skipped. This is soft parsing.
If either of the two conditions above fails, the optimizer builds a parse tree and generates an execution plan. This process is called hard parsing.
Creating a parse tree and generating an execution plan are expensive steps in SQL execution, so hard parsing should be avoided and soft parsing used as much as possible.
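To make the difference concrete, here is a sketch using the classic SCOTT demo table (the table and column names are illustrative). With literals, each statement hashes differently and is hard parsed; with a bind variable, the SQL text stays identical and repeated executions soft parse:

-- Literals: two different SQL texts, two hard parses
SELECT * FROM emp WHERE empno = 7369;
SELECT * FROM emp WHERE empno = 7499;

-- Bind variable: one SQL text, cursor reused, soft parse on re-execution
VARIABLE v_empno NUMBER
EXEC :v_empno := 7369
SELECT * FROM emp WHERE empno = :v_empno;
EXEC :v_empno := 7499
SELECT * FROM emp WHERE empno = :v_empno;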
Open the AWR report and look at the header section first.
Elapsed: the snapshot monitoring window. To diagnose a performance problem in a specific period, Elapsed should not be too long; 15 minutes to 2 or 3 hours is appropriate. To look at load across the whole day it can be longer; the most common windows are 60 and 120 minutes.
DB Time: excludes time consumed by Oracle background processes. If DB Time is much smaller than Elapsed, the database is relatively idle.
DB Time = CPU time + wait time (excluding idle waits, for non-background processes): the time the server spends on database operations (non-background processes) and non-idle waits. In this example, over an Elapsed window of 79 minutes (during which 3 snapshots were collected), the database consumed 11 minutes of DB Time. The RDA data shows the system has 8 logical CPUs (4 physical CPUs), so the average is about 1.4 minutes per CPU, and CPU utilization is only about 2% (1.4 / 79), indicating the system is under very little load.
However, for a batch system the database workload is concentrated in certain periods. If the snapshot window does not fall within such a period, or if it spans too long and includes a lot of idle time, the analysis is meaningless. Choosing an analysis window that actually represents the performance problem is therefore critical.
The report also displays the size of each area in the SGA, which can be compared with the initial parameter values.
The shared pool mainly consists of the library cache and the dictionary cache. The library cache stores recently parsed (or compiled) SQL, PL/SQL, Java classes, and so on; the dictionary cache stores recently referenced data dictionary entries. A cache miss in the library cache or dictionary cache is much more expensive than one in the buffer cache, so the shared pool should be sized so that most recently used data stays cached.
The load profile is most meaningful when compared with baseline data; if the per-second or per-transaction load changes little, the application is running stably. A single report only shows the application's load, and most of these figures have no single "correct" value, but Logons above 2 per second, Hard parses above 100 per second, or total parses above 300 per second suggest a possible contention problem.
Redo size: redo generated per second (in bytes), indicating how frequently data changes and how heavy the database's workload is.
Logical reads: blocks read logically per second / per transaction. Logical Reads = Consistent Gets + DB Block Gets.
Block changes: blocks modified per second / per transaction.
Physical reads: blocks physically read per second / per transaction.
Physical writes: blocks physically written per second / per transaction.
User calls: user calls per second / per transaction.
Parses: the number of SQL parses per second, combining fast parses, soft parses, and hard parses. More than 300 soft parses per second suggests the "application" is inefficient; consider adjusting session_cached_cursors. Here a fast parse means a hit directly in the PGA (when session_cached_cursors=n is set), a soft parse means a hit in the shared pool, and a hard parse means a miss in both.
Hard parses: too many hard parses indicates a low SQL reuse rate. More than 100 per second may indicate poor use of bind variables or an unreasonable shared pool configuration. In that case you can set cursor_sharing=similar|force (the default is exact); note, however, that cursor_sharing=similar has a bug that can lead to poor execution plans.
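A sketch of the two remediations just mentioned (the values are illustrative; test before changing production settings):

ALTER SESSION SET session_cached_cursors = 100;  -- cache closed cursors so repeated soft parses are cheaper
ALTER SYSTEM  SET cursor_sharing = FORCE;        -- replace literals with system-generated binds (default is EXACT)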
Sorts: sorts per second / per transaction.
Logons: logins per second / per transaction.
Executes: SQL executions per second / per transaction.
Transactions: transactions per second, reflecting how heavy the database's workload is.
Blocks changed per Read: the percentage of logical reads that are for blocks being modified.
Recursive Call: the percentage of recursive calls among all calls; it is higher when there is a lot of PL/SQL.
Rollback per transaction: the rollback rate per transaction. Check whether it is high, because rollbacks consume a lot of resources. A high rollback rate may mean the database is performing too much wasted work, and heavy rollback can also cause contention on undo blocks. The formula is:
Round(user rollbacks / (user commits + user rollbacks), 4) * 100%
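The same ratio can be computed directly from v$sysstat, for instance (a sketch):

SELECT ROUND(rb.value / NULLIF(cm.value + rb.value, 0), 4) * 100 AS rollback_pct
FROM   v$sysstat cm, v$sysstat rb
WHERE  cm.name = 'user commits'
AND    rb.name = 'user rollbacks';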
Rows per Sort: number of rows sorted each time
This section covers Oracle's key memory hit ratios and other measures of instance efficiency; Buffer Hit Ratio is also known as Cache Hit Ratio, and Library Hit Ratio as Library Cache Hit Ratio. As with the Load Profile section, there are no universally "correct" values here; judgment depends on the characteristics of the application. In a DSS environment running large parallel queries with direct reads, a Buffer Hit Ratio of 20% may be acceptable, while that would be totally unacceptable for an OLTP system. In Oracle's experience, the Buffer Hit Ratio for OLTP systems should ideally be above 90%.
Buffer Nowait represents the percentage of buffer accesses satisfied without waiting; it generally needs to be above 99%. Otherwise there may be contention, which can be confirmed further in the wait events section later.
Buffer Hit represents the rate at which processes find blocks in memory; watching this value for significant change is more important than the value itself. For typical OLTP systems, if it is below 80%, more memory should be allocated to the database. The data block hit rate in the buffer cache should usually be above 95%; below 95%, important parameters need tuning, and below 90% you may need to increase db_cache_size. A high hit rate does not necessarily mean the system performs well: frequent access to a large number of unselective indexes can produce the illusion of a high hit rate (a large number of db file sequential reads), while a genuinely low hit rate generally does hurt performance and needs attention. A sudden change in hit rate is often bad news. If the hit rate suddenly rises, check the top SQL by buffer gets to find the statements and indexes causing heavy logical reads; if it suddenly drops, check the top SQL by physical reads to find the statements producing heavy physical reads, typically ones that have stopped using an index or whose index has been dropped.
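For reference, the classic way to compute the buffer cache hit ratio yourself from v$sysstat (a sketch):

SELECT ROUND((1 - phy.value / NULLIF(cur.value + con.value, 0)) * 100, 2) AS buffer_hit_pct
FROM   v$sysstat cur, v$sysstat con, v$sysstat phy
WHERE  cur.name = 'db block gets'
AND    con.name = 'consistent gets'
AND    phy.name = 'physical reads';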
Redo NoWait represents the percentage of redo buffer allocations that did not wait. If it is too low (see the 90% threshold), consider increasing the LOG BUFFER. Once the redo buffer fills to 1 MB it must be written to the redo log file, so with a redo buffer set above 1 MB, waits for buffer space are unlikely. A redo buffer of about 2 MB is common at present; relative to total memory it is small, so it need not be kept tight.
Library Hit represents the rate at which Oracle retrieves an already-parsed SQL or PL/SQL statement from the library cache. When an application issues a SQL statement or stored procedure call, Oracle checks the library cache for a parsed version; if one exists, Oracle executes the statement immediately, otherwise it parses the statement and allocates a shared SQL area for it in the library cache. A low library hit ratio leads to excessive parsing, higher CPU consumption, and degraded performance. If it is below 90%, you may need to enlarge the shared pool. In general, the statement hit rate in the shared area should stay above 95%; otherwise consider enlarging the shared pool, using bind variables, or adjusting parameters such as cursor_sharing.
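The library cache hit ratio can likewise be checked outside the report (a sketch based on v$librarycache):

SELECT ROUND(SUM(pinhits) / SUM(pins) * 100, 2) AS library_hit_pct
FROM   v$librarycache;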
Latch Hit: a latch is a lock protecting memory structures; think of it as the permission a server process must obtain before accessing an in-memory data structure. Latch Hit should be above 99%; anything lower indicates shared pool latch contention, possibly caused by unshared SQL or an undersized library cache, and can be addressed with bind variables or a larger shared pool. Make sure it is above 99%, or there will be serious performance problems. When this value signals a problem, the wait event and latch analysis sections later in the report help locate and solve it.
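When Latch Hit is low, a query like the following helps identify the busiest latches (a sketch):

SELECT name, gets, misses,
       ROUND(misses / NULLIF(gets, 0) * 100, 4) AS miss_pct
FROM   v$latch
WHERE  misses > 0
ORDER  BY misses DESC;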
Parse CPU to Parse Elapsd: the ratio of parse CPU time to parse elapsed time, where elapsed time includes waiting for resources during parsing; the higher the better. The formula is: Parse CPU to Parse Elapsd % = 100 * (parse time cpu / parse time elapsed). A ratio of 100% means no time was spent waiting during parsing.
Non-Parse CPU: the proportion of CPU time spent on real work rather than parsing; if it is too low, parsing is consuming too much time. The formula is: % Non-Parse CPU = round(100 * (1 - PARSE_CPU / TOT_CPU), 2). If PARSE_CPU is small relative to TOT_CPU, this ratio approaches 100%, which is good: the database is spending most of its CPU executing queries rather than parsing them.
Execute to Parse: the ratio of statement executions to parses; it is high when the SQL reuse rate is high, since a higher value means more executions per parse. The formula is: Execute to Parse = 100 * (1 - Parses / Executions). In this example there were almost five parses per execution, and whenever the system's Parses exceed its Executions, the ratio can even be negative.
% Memory for SQL w/exec>1: the percentage of memory consumed by SQL statements executed more than once, a measure of how much memory frequently used statements consume compared with single-use ones. This number is generally very close to % SQL with executions > 1, unless some queries consume irregular amounts of memory. In a steady state, roughly 75% to 85% of the shared pool is used over time. If the time window of the Statspack report is large enough to cover all work cycles, the percentage of SQL statements executed more than once should approach 100%; this statistic is affected by the interval between observations and can be expected to rise as that interval grows.
The instance efficiency statistics give a general impression, but they cannot by themselves pin down performance problems with data operations; identifying current performance issues depends mainly on the wait events that follow. Think of the two parts this way: the hit statistics help identify and predict problems the system is heading towards, so we can plan ahead, while the wait events indicate performance problems the database has right now, so acting on them is remedial in nature.
Top 5 Timed Events, the last part of the AWR report summary, shows the five most significant waits in the system, sorted in descending order of wait time. For a well-performing system, CPU Time should appear at the top of the Top 5; otherwise your system spends most of its time waiting. When tuning, we always want the most visible effect, so this is where to decide what to do next. For example, if 'buffer busy wait' is a serious wait event, you should go on to study the Buffer Wait and File/Tablespace IO sections of the report to identify which files are causing the problem. If the most serious wait event is an I/O event, study the SQL statements sorted by physical reads to identify which statements perform heavy I/O, and examine the files with slow response times in the Tablespace and File I/O sections. If latch waits are high, look at the detailed latch statistics to identify which latch is contended.
Here, log file parallel write is a relatively long wait, taking up 7% of CPU time. We usually use the AWR report to find the SQL statements that take a long time or consume the most resources: it lists the top consumers for each kind of resource and their share of total resource use during the period, which gives us a tuning guide. For example, in a system where CPU is the performance bottleneck, optimizing the SQL with the most buffer gets yields the biggest effect, while in a system where waits are the most serious events, tuning should target the SQL with the most physical I/O.
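Outside the report window, the current top SQL can also be pulled straight from v$sql, for example (a sketch; order by disk_reads instead when physical I/O is the bottleneck):

SELECT *
FROM  (SELECT sql_id, buffer_gets, disk_reads, executions, elapsed_time
       FROM   v$sql
       ORDER  BY buffer_gets DESC)
WHERE rownum <= 10;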
Next is a list of the most time-consuming SQL, sorted from high to low (top 100); clicking a SQL Id link in the AWR report jumps to the full SQL statement.
In fact, this is very similar to one of the functions of Cloud Wisdom Perspective: in the application request view, Perspective can capture the elapsed time of each SQL statement in a MySQL database in real time. That is considerably more convenient than an AWR report, requiring no snapshot setup and supporting real-time viewing.
This section lists the top 100 SQL statements sorted by CPU time; again, click the SQL Id link to expand the full statement.
The AWR report is very rich in content. I hope this article serves as a modest starting point, and that we can all put AWR into practice and study it further.