Understand the Oracle execution plan 04/17 Update SLTechnology News&Howtos

Understand the Oracle execution plan

2025-04-17 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

One: what is the Oracle implementation plan?

An execution plan is a description of the execution process or access path of a query statement in Oracle

Second: how to view the Oracle execution plan?

Because I've been using PLSQL remotely connected company databases, here's an example of PLSQL:

①: configure the items that need to be displayed in the execution plan:

Tools-> preferences-> window Type-> Plan window-> configure the columns to be displayed in the execution plan as needed

Common column fields for execution plans explain:

Cardinality (Rows): the number of result set rows returned by the current operation estimated by Oracle

Bytes (Bytes): the number of bytes returned after performing this step

COST, CPU cost: the cost estimated by Oracle for the execution of this step is used to indicate the cost of SQL execution. In theory, the smaller the better (this value may be different from the actual value)

Time (Time): Oracle estimates the time required for the current operation

②: open the execution plan:

Press F5 after executing a select statement in the SQL window to view the execution plan of the query statement that has just been executed

Note: if you use the SQL command in PLSQL to view the execution plan, some SQL*PLUS commands PLSQL cannot support, such as SET AUTOTRACE ON

Three: understand the Oracle implementation plan

①: execution order:

According to the indentation of Operation, the one with the most indentation is executed first; (when the indentation is the same, the top one executes first)

Example: in the figure above, INDEX RANGE SCAN and INDEX UNIQUE SCAN are indented the most, and the top INDEX RANGE SCAN is executed first.

If an action does not have a child ID at the same level, it will be executed first.

Actions at the same level are executed in accordance with the principle of top-most, right-first execution.

For example, in the figure above, if both TABLE ACCESS BY GLOBAL INDEX ROWID and TABLE ACCESS BY INDEX ROWID are indented at the same level, the above TABLE ACCESS BY GLOBAL INDEX ROWID action is executed first; if this action contains a sub-action INDEX RANGE SCAN, then the sub-action INDEX RANGE SCAN on the right is executed first.

The sequence of SQL execution in the figure is as follows:

INDEX RANGE SCAN-> TABLE ACCESS BY GLOBAL INDEX ROWID-> INDEX UNIQUE SCAN-> TABLE ACCESS BY INDEX ROWID-> NESTED LOOPS OUTER-> SORT GROUP BY-> SELECT STATEMENT, GOAL = ALL_ROWS

(note: PLSQL provides a function button to view the order of execution (the red box in the figure above))

②: some descriptions of the actions in the figure:

1. TABLE ACCESS BY in the picture above. That is, it describes how the table accesses (or Oracle accesses data) when the action is performed

Several ways to access the table: (not all)

TABLE ACCESS FULL (full table scan) TABLE ACCESS BY ROWID (table access via ROWID) TABLE ACCESS BY INDEX SCAN (index scan)

(1) TABLE ACCESS FULL (full table scan):

Oracle reads all the rows in the table and checks whether each row meets the Where constraint in the SQL statement

When scanning a full table, you can use multi-block reads (that is, multiple blocks of data read at a time) to improve throughput.

Use suggestion: full table scan is not recommended for tables with too much data, unless you need to fetch more data, accounting for 5% ~ 10% or more of the total table data.

(2) TABLE ACCESS BY ROWID (accessed through the ROWID table):

First of all, what is ROWID?

ROWID is the pseudo column automatically added by Oracle to the last column of each row in the table. Since it is a pseudo column, it is stated that the value of ROWID is not physically stored in the table.

You can use it like any other column, but you can't add, delete or change the value of the column.

Once a row of data is inserted, its corresponding ROWID is unique for the lifetime of the row, and the Row ID value of the row remains the same even if the row migration occurs.

Let's go back to TABLE ACCESS BY ROWID:

The ROWID of a row indicates the data file, the data block and the location of the row in the block, so you can quickly locate the target data through ROWID, which is also the fastest way to access single-row data in Oracle.

(3) TABLE ACCESS BY INDEX SCAN (index scan):

In the index block, both the key value of each index and the ROWID of the row with that key value are stored.

The possible conceptual structure of an index on a numeric column is shown in the following figure:

So index scanning is actually divided into two steps:

Ⅰ: scan the index to get the corresponding ROWID

Ⅱ: navigate to a specific row to read data through ROWID

-Index scan extension-

There are five types of index scans:

INDEX UNIQUE SCAN (Index unique scan) INDEX RANGE SCAN (Index range scan) INDEX FULL SCAN (Index full scan) INDEX FAST FULL SCAN (Index Fast scan) INDEX SKIP SCAN (Index Jump scan)

A) INDEX UNIQUE SCAN (index unique scan):

Scans for unique indexes (UNIQUE INDEX) return at most one record at a time

When there are UNIQUE or PRIMARY KEY constraints in a field in a table, Oracle often scans uniqueness.

B) INDEX RANGE SCAN (index range scan):

Use an index to access multiple rows of data

There are three situations where an index range scan occurs:

A range operator is used on a unique index column (for example: >

< >

=, the connection conditions that are not applicable are: like

(2) NESTED LOOPS (nested Loop):

Internal connection process:

A) fetch row 1 (the first row of data) of row source 1, iterate through all rows of row source 2 and check for matches, and put the matching rows into the result set

B) fetch row 2 (the second row of data) of row source 1, iterate through all rows of row source 2 and check for matches, and put the matching rows into the result set

C).

If N rows of data are returned in row source 1 (that is, the driver table), row source 2 will be traversed N times by the whole table accordingly.

Because each row of row source 1 matches all rows of row source 2, it is more efficient when row source 1 returns as few rows as possible and can efficiently access row source 2, such as proper indexing.

Extend:

The tables of nested loops have a driving order, so pay attention to choosing the appropriate driving table.

One advantage of nested loop connections that other connections do not have is that you can return the connected rows first without having to wait for all the join operations to return data, thus achieving a quick response.

Constraints (Where filter conditions) should be used as much as possible to minimize the number of rows returned by the drive table (row source 1), while a unique index (UNIQUE INDEX) or a non-unique index with better selectivity is established on the join operation associated column of the matching table (row source 2), when the nested loop join becomes very efficient. If the driving table returns a large number of rows, even if there is an index on the associated column of the matching table join operation, the join efficiency will not be very high.

(3) HASH JOIN (hash join):

Hash connections are only applicable to equivalent connections (that is, the connection condition is =)

When HASH JOIN joins two tables, it does not necessarily scan the whole table, and it does not restrict table access.

A brief description of the internal connection process:

A) take the dataset of row source 1 (driver table, also known as Build Table in HASH JOIN) and build it into a Hash Table in memory (the Hash KEY of the Hash function is the join operation associated column) to create a Hash bitmap (bitmap)

B) take the dataset of row source 2 (matching table), use the same Hash function for the join operation associated column of each piece of data, and find the location of the data in the corresponding a) in the Hash Table, and check whether the matching data can be found at that location

-extended reading: Hash Table related-

Explanation from Wiki:

In computing, a hash table (hash map) is a data structure used to implement an associative array, a structure that can map keys to values. A hash table uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.

Hash technology: establish a corresponding relation f between the storage location of the record and the keyword key of the record, so that after entering key, you can get the corresponding storage location f (key), which is the hash (hash) function.

Hash technology is used to store records in a continuous storage space, which is a hash table (hash table).

The hash values obtained by different key hashes from the same hash function should be different in theory, but they may be the same in practice. Hash conflicts occur when they are the same, and there are many ways to solve hash conflicts. For example, chain address method is used to solve hash conflicts in HashMap.

Hash table is a kind of lookup-oriented data structure. After entering a given value, it is very fast to find the location of the record corresponding to the given value in the table to get a specific record.

Three modes of HASH JOIN:

OPTIMAL HASH JOINONEPASS HASH JOINMULTIPASS HASH JOIN

1) OPTIMAL HASH JOIN:

The OPTIMAL pattern is that the result set obtained from the driver table (also known as Build Table) is relatively small, and the entire Hash Table built from the result set can be built in an area of memory that the user can use.

Brief description of the connection process:

Ⅰ: first of all, use the Hash function on the join operation associated column of each row of data in Build Table to construct the result set of Build Table into an in-memory Hash Table. As shown in the figure, you can think of Hash Table as a large square area in memory, in which there are many small grids in which the data in Build Table is scattered, and these small grids are Hash Bucket (see the definition of Wiki above).

Ⅱ: start reading the data of the matching table (Probed Table), use the same Hash function for the join operation associated column of each row of data, and locate the Hash Bucket in the Build Table where the data has the same value after using the Hash function.

Ⅲ: after locating the specific Hash Bucket, first check whether there is any data in the Bucket, and immediately discard this row of the matching table (Probed Table) if not. If there is data in it, continue to check whether the data in it (the data in the driver table) matches the data in the matching table.

2): ONEPASS HASH JOIN:

The ONEPASS schema is used when the result set obtained from the driver table (also known as Build Table) is so large that all the Hash Table built from the result set cannot be put in memory.

Brief description of the connection process:

Ⅰ: use the Hash function to associate columns with the join operations of rows of data in Build Table. After building Hash Table based on the result set of Build Table, because memory cannot drop all Hash Table contents, some Hash Bucket will be put in memory and some Hash Bucket on disk. No matter in memory or on disk, Oracle uses a Bitmap structure to reflect the status of these Hash Bucket (including its location and whether data is available).

Ⅱ: read the matching table data and use the same Hash function for the join operation associated column of each row, and locate the Bucket with the same value data in the Build Table on the Bitmap after using the Hash function. If the Bucket is empty, the data of the matching table is discarded. If it is not empty, you need to see if the Bucket is in memory or on disk.

If you are in memory, access the Bucket directly and check whether the data in it matches, and return the query result if there is a match.

If you are on disk, put this piece of data to be matched aside, store it in memory temporarily, and then write it in batches to disk after a certain amount of data has been accumulated (Dump probe partitions to disk in the image above).

Ⅲ: after a complete scan of the matching table, some of the matching data may have been returned. Next, some of the Hash Bucket data in Hash Table on disk and some of the matching data written to disk in the matching table are not processed. Now Oracle will rematch these two parts of data on disk and return the final query result.

3): MULTIPASS HASH JOIN:

MULTIPASS mode is used when the memory is very small or the data of Hash Table is relatively large. MULTIPASS reads disk data multiple times and should avoid using this mode as much as possible.

3. In the picture above... OUTER describes the table join type

Two types of table joins:

INNER JOIN (internal connection) OUTER JOIN (external connection)

The sample data shows:

There are two tables An and B, and the information in Table An is as follows:

The information in Table B is as follows:

The following examples are demonstrated by tables An and B.

(1) INNER JOIN (internal connection):

Only matching records in the two tables are returned.

There are two types of INNER JOIN:

Equivalent connection (connection condition =) non-equivalent connection (connection condition is non-=, such as > > <

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.