Oracle Parallel Execution Fundamentals (Part 1)
Author: Wei Xinghua, senior database technology expert at Woqu Technology
Overview
Oracle Enterprise Edition includes a very powerful feature: parallel query, which lets a single statement employ multiple server processes (parallel slaves, also known as PX slaves) to produce its result. Parallel execution can make full use of the host's CPU, IO, and memory resources, which sounds like a pure advantage, but it depends on the situation. If the database system does not have much spare CPU, IO, or memory, it is worth reconsidering whether parallel execution should be used at all. Even if the system has plenty of spare CPU, if IO capacity is far from sufficient, you still need to think twice, because parallel execution tends to generate a lot of IO. Given how parallel execution works, it should not be abused, or the system's resources may be exhausted quickly. Parallel execution is also inherently complex: it has many concepts that serial execution does not, such as table queues and data distribution, and reading a parallel execution plan differs from reading a serial one.
The purpose of parallel execution is to make statement execution scale linearly: if a statement runs in 4 minutes serially, then with a degree of parallelism (DOP) of 4 we hope to cut the execution time to 1 minute. Of course, this is only an expectation; in reality such linear scaling is often not achieved. There are some costs and facts to be aware of:
Acquiring the parallel processes themselves takes time, though usually only a little. If no parallel processes are available in the process pool, the operating system must spawn the needed processes, and the database may wait on os thread startup. If a statement runs for only a few seconds, consider whether parallelism is appropriate at all.
The QC process has to assign work to the PX slaves, which also takes a little time, though generally very little; for example, the QC must hand each PX slave a range of ROWIDs to scan.
If a parallel query returns a large amount of data to the client, the single QC process itself may become a bottleneck.
Because Oracle's parallel execution uses a producer-consumer model, a query with a DOP of 4 will usually end up employing 8 PX slaves; together with the QC process itself, that is 9 system processes in total. You have to judge whether the result is worth that cost (see the sketch after this list for a way to count them).
Even a serial query on Exadata benefits from parallelism at the IO level by default, so the relative gain from statement-level parallelism on Exadata is smaller than on non-Exadata systems.
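As a quick way to see how many processes a parallel statement actually occupies, you can query the v$px_session view from a second session while the statement is running. A minimal sketch (the view and its columns are standard; for a DOP-4 producer-consumer query you would typically see one QC row plus two server sets of four slaves each):

SQL> -- run from another session while the parallel query is executing
SQL> select qcsid, sid, server_group, server_set, server#, degree
       from v$px_session
      order by qcsid, server_group, server_set, server#;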
For parallelism to work well, some requirements must be met:
A highly efficient execution plan; if the plan itself is bad, parallelism may do little to improve the statement's efficiency.
Sufficient free resources in the database system, as mentioned at the beginning of this article.
No obvious skew in the workload distribution. Everyone knows the weakest-link principle: if one PX slave does most of the work, it is the biggest bottleneck in the final response time.
Perhaps many of the concepts and terms above are still unclear; that is fine, they are introduced in what follows. The first problem to consider in parallel execution is how to divide the workload. In serial execution a single process does all the work, but with parallel execution multiple processes work on the same task, so the division of work matters greatly.
Parallel operations on a single table
For a parallel operation on a single table, splitting the workload is relatively simple and Oracle does not use any complex algorithm: work is generally assigned by ROWID range, or by partition if the table is partitioned. For example, consider the following parallel query:
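The statement and its execution plan appear as an image in the original; a minimal sketch of the kind of statement being analyzed (the table name test is taken from the text below) is:

SQL> select /*+ parallel(a 2) */ count(*) from test a;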
The SQL and its execution plan show that table test is counted with a degree of parallelism of 2. The row source operation at Id 5 is PX BLOCK ITERATOR, an operation you will see very often in parallel plans: it represents the QC splitting the table by ROWID range so that each PX slave scans a different part of the table. Each PX slave then aggregates the rows it scanned (Id=4, SORT AGGREGATE) and sends its result to the QC, which performs a final aggregation over the PX slaves' results and returns a single row to the client.
SQL MONITORING makes this even more intuitive (below): most of the work is done by the blue PX slaves, which send their pre-aggregated results to the QC (row source Id 3) for final aggregation.
However, as we will see shortly, this example employs only one set of PX slave processes, which is a special case in the world of Oracle parallelism. Under Oracle's producer-consumer model, two sets of PX slaves are generally employed: one acts as the producer and scans the data, the other acts as the consumer and processes the data received from the producer. (In this example, the QC itself can be regarded as the consumer.)
This article makes extensive use of the SQL MONITORING tool; if you are not familiar with it, please refer to my other article:
http://www.jianshu.com/p/ce85dd0c05ab
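If you want to generate such a report yourself, a minimal sketch (assuming a Tuning Pack license and that you know the statement's sql_id; &sql_id is just a substitution-variable placeholder):

SQL> select dbms_sqltune.report_sql_monitor(sql_id => '&sql_id', type => 'TEXT') from dual;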
Now let's make a simple modification to the SQL, adding an ORDER BY, and see what happens.
SQL> select /*+ parallel(a 2) */ * from hash_t1 a order by object_name;
Again we use SQL MONITORING to visualize it. The [Operation] column now shows PX slaves in two different colors: the red PX slaves act as producers scanning table HASH_T1, and the scanned data is distributed to the blue consumer PX slaves, which receive it, sort it, and then send the result set to the QC.
In Oracle parallel execution, a unit (tree) of parallel operations is called a Data Flow Operator (DFO). Each QC represents one DFO unit, and a query can contain multiple DFO units (a DFO tree). For example, a typical UNION ALL statement can contain multiple DFO units, and different DFO units can themselves run in parallel.
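As a sketch of a statement that can produce more than one DFO unit (reusing tables from the examples in this article; each UNION ALL branch can form its own DFO):

SQL> select /*+ parallel(2) */ count(*) from hash_t1
     union all
     select /*+ parallel(2) */ count(*) from hash_t2;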
Now that we have the concept of producers and consumers in Oracle parallel execution, if you look again at the [Name] column in the figure above, you will find entries like TQ10001 and TQ10000. What are they?
Table queue
As mentioned above, Oracle parallel execution has the concept of producers and consumers. Producers and consumers each represent a set of processes, and they need to pass messages and data between them. What do they rely on to do that? That is exactly what table queues are for.
Continuing with the figure above as an example:
There are two sets of PX slaves: the red producers and the blue consumers. The producers scan table HASH_T1 through the row sources with Id 6 and 7 and write the scan results into table queue TQ10000 through the row source with Id 5 (PX SEND RANGE); the consumers read the data from table queue TQ10000 (PX RECEIVE) and sort it, then send the sorted result through table queue TQ10001 to the QC process, which merges the received results and returns them to the client.
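The table queues are also visible directly in the plan output: the TQ, IN-OUT, and PQ Distrib columns of DBMS_XPLAN carry this information. A minimal sketch (run in the same session right after executing the statement; PARALLEL is a standard format keyword):

SQL> select * from table(dbms_xplan.display_cursor(null, null, 'BASIC +PARALLEL'));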
How is the work split across multiple tables? Why introduce data distribution algorithms?
For a single table (no JOIN), splitting the data is very simple: splitting by ROWID ranges guarantees a correct result, because there is no overlap between the parallel slaves' data, no rows are lost, and ROWID-based splitting also makes it easy to keep each PX slave's workload even. But what about a two-table JOIN? How do you guarantee that each 1/N slice of table X meets the matching rows of table Y inside the same parallel operation (that is, in the same parallel process)? Simply splitting both tables by ROWID cannot guarantee that.
To keep things simple, consider the following example:
[Set one]:
[1,3,5,7,9,11]
[Set two]:
[1,9,3,6,7,8,5]
If we must use a degree of parallelism of 2 to determine how many values [set one] and [set two] have in common, how do we do it?
Let's simulate the split by ROWID and divide [set one] into two parts in order:
Subset 1: 1, 3, 5 => process 1
Subset 2: 7, 9, 11 => process 2
Using the same method, we split [set two] into two parts:
Subset 3: 1, 9, 3 => process 1
Subset 4: 6, 7, 8, 5 => process 2
Through the operations above we have split each set into two parts; now process 1 joins subset 1 with subset 3, and process 2 joins subset 2 with subset 4. OK?
Obviously not, because the final result set is wrong.
The correct result of joining the two sets is: 1, 3, 5, 7, 9.
With the algorithm above, however, joining subset 1 with subset 3 yields 1, 3 and joining subset 2 with subset 4 yields 7, so the final result set is 1, 3, 7: the two values 5 and 9 have been lost.
So we cannot split the data arbitrarily just to speed up the query without guaranteeing correct results. How does Oracle do it, then? How does it guarantee that the process reading a 1/N slice of table X also gets the matching rows of table Y?
This shows why data distribution algorithms are needed, and explains why running at parallelism N requires 2N parallel slaves: one set of processes scans table X and distributes its rows to the other set according to a distribution algorithm; once table X has been distributed, the rows of table Y must be distributed according to the same algorithm. If this is still unclear, don't worry; the rest of the article has enough material to make these operations clear.
Broadcast
Let's continue with the previous example.
[Set one]:
1,3,5,7,9,11
[Set two]:
1,9,3,6,7,8,5
Broadcast distribution works as follows (assuming a degree of parallelism of 2):
Oracle first spawns two sets of PX slaves: one set of two PX slave processes acts as producers, the other set, also of two PX slave processes, acts as consumers (note that the producer and consumer roles are interchangeable).
Each producer PX slave scans its 1/2 of [set one] by ROWID range and broadcasts the rows to every consumer PX slave, so that each consumer PX slave eventually holds the complete [set one].
Each consumer PX slave then scans [set two] by ROWID range and joins what it scans against [set one], producing the final result set.
The key point is that every consumer PX slave holds the complete [set one], so [set two] needs no distribution at all; it only needs to be scanned by ROWID range and joined, and the correctness of the result is guaranteed.
[Set one]:
[1,3,5,7,9,11]
After distribution:
Subset 1: 1, 3, 5, 7, 9, 11 => process 1
Subset 2: 1, 3, 5, 7, 9, 11 => process 2
[Set two]:
[1,9,3,6,7,8,5]
After distribution:
Subset 3: 1, 9, 3, 6 => process 1
Subset 4: 7, 8, 5 => process 2
Joining subset 1 with subset 3 yields 1, 9, 3, and joining subset 2 with subset 4 yields 7, 5, so the final result set is 1, 3, 5, 7, 9: the workload is divided evenly and the result is correct.
Now let's walk through the whole process with a concrete query.
Table T1 has 70 rows and table T4 has 343,000 rows.
SQL> select /*+ parallel(2) pq_distribute(t1 none broadcast) full(t1) full(t4) monitor */ count(*) from t1, t4 where t1.id = t4.id1;
The hint pq_distribute(t1 none broadcast) forces the left table of the hash join to be distributed by broadcast. Based on the SQL MONITORING output, we can analyze the execution as follows:
In rows Id 9, 8, 7, the red producer processes (the [Operation] column) scan table T1 split by ROWID ranges and write the scan results into a table queue for broadcast distribution; the row source PX SEND BROADCAST at Id 7 represents the broadcast distribution method.
In rows Id 6, 5, the blue consumer processes (the [Operation] column) receive the data broadcast by the red PX slaves and build the hash table. Every blue consumer PX slave receives the full data of table T1, which the [Actual Rows] column confirms: the 70 rows of table T1, after broadcast distribution, become 70 rows * 2 (the DOP) = 140 rows.
In rows Id 11, 10, the blue consumer processes scan table T4 split by ROWID ranges and join each row against the previously built hash table.
It is important to notice that T4 is not distributed at all here; the blue consumer processes only need to scan it by ROWID range, because every consumer PX slave holds the full data of table T1.
Let's make an interim summary.
For broadcast distribution:
The table on the right of HASH JOIN does not need to be distributed.
With BROADCAST there is no risk of a wrong result, because every consumer PX slave holds all the data of the left table of the HASH JOIN, and therefore a complete hash table.
If the left table of the HASH JOIN is small, the distribution is cheap. However, as the DOP rises or the left table grows, the cost of the distribution becomes higher and higher.
If the left table is small enough, a BROADCAST execution plan still scales well.
The first set of PX processes scans the left table of the HASH JOIN and broadcasts it to the second set of PX slaves, which consumes CPU and memory and introduces contention; the contention arises because the data scanned by every process of the first set is broadcast to every process of the second set, as shown below:
Replicate
Replicate means that every parallel process scans the entire left table of the hash join, rather than a ROWID slice of it. Since each process scans all the data itself, the data does not need to be distributed, so only one set of PX slaves is required.
SQL> select /*+ parallel(2) */ count(*) from hash_t1 a, reptest b where a.id=b.id;
The [Operation] column shows only one set of blue PX slave processes, and no data distribution is involved:
The 2 processes each scan the reptest table in full, and each builds a complete hash table.
After that scan completes, the two processes scan table hash_t1 by ROWID range; since both processes hold the complete hash table of reptest, hash_t1 needs no distribution. As hash_t1 is scanned, each row probes the hash table.
Hash distribution
As mentioned above, the problem with broadcast/replicate distribution is that every consumer PX slave holds the complete left table, so it only suits cases where the left table is relatively small. For a HASH join between two large tables, Oracle generally uses HASH distribution. Take the same example again:
[Set one]:
1,3,5,7,9,11
[Set two]:
1,9,3,6,7,8,5
If [set one] and [set two] are both distributed with the same HASH function, rows that can join are guaranteed to end up in the same process, which guarantees the correctness of the result set. The extra cost is that [set two] must now be distributed as well, which consumes more CPU; with broadcast distribution, only [set one] had to be distributed.
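A small sketch of the idea (ora_hash is a real SQL function, used here purely as an illustration; the internal hash that PX distribution actually uses is not exposed): any two rows with the same join key hash to the same bucket, i.e. to the same consumer slave.

SQL> -- with a DOP of 2, think of bucket 0 and bucket 1 as the two consumer slaves
SQL> select id, ora_hash(id, 1) as consumer_slave from hash_t1 where rownum <= 10;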
Let's look at a specific example:
SQL> select /*+ parallel(3) pq_distribute(b hash hash) */ count(*) from hash_t1 a, hash_t2 b where a.id=b.id;
First, the red producer PX slaves scan table HASH_T1 in parallel, split by ROWID ranges, and distribute each record through table queue TQ10000 to the specific blue consumer PX slave chosen by the HASH algorithm.
The blue consumers receive the data from table queue TQ10000 and build the hash table.
Once the two steps above are complete, the red producer PX slaves go on to scan table HASH_T2 in parallel, again split by ROWID ranges, and distribute each record through table queue TQ10001 to the specific blue consumer PX slave determined by the same HASH algorithm. The blue consumer PX slaves receive the data from table queue TQ10001 and join it against the previously built hash table. Finally, each blue consumer PX slave sends its own pre-aggregated result to the QC via table queue TQ10002.
Note:
In the [Actual Rows] column, the row counts do not increase after HASH distribution.
While hash_t2 is being scanned, two sets of PX slaves are active at the same time, because the data has to be distributed.
HASH distribution scales very well: each process holds only part of the hash table rather than a complete one, and each row is distributed to exactly one PX slave instead of being broadcast to every slave as in broadcast distribution.
But as pointed out above, once the left table of the HASH JOIN has been distributed, the right table must be distributed too, which adds the cost of one more distribution and increases CPU and memory consumption.
HYBRID-HASH
Choosing the wrong distribution method can cause serious performance problems for parallel execution. Oracle 12c therefore introduces an adaptive distribution method, hybrid hash, which defers the choice of distribution method to actual execution time. To achieve this, the Oracle optimizer inserts a statistics collector, which gathers run-time information about the statement, such as the number of rows returned. Note that with HYBRID-HASH, every execution of the statement decides the distribution method dynamically through the statistics collector.
For example, in the execution plan above, the row source with Id 7 counts the rows returned during parallel execution: if the row count is less than DOP * 2, broadcast distribution is used, otherwise HASH distribution is used. Once the method at row source Id 6 is determined, the row source with Id 11 decides accordingly whether to use round-robin or hash distribution.
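A minimal sketch of a statement that can produce such an adaptive plan (assuming 12c or later and the t1/t4 tables from the earlier example; with no pq_distribute hint, the plan may show PX SEND HYBRID HASH above a STATISTICS COLLECTOR step):

SQL> select /*+ parallel(2) monitor */ count(*) from t1, t4 where t1.id = t4.id1;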
The reading order of a parallel execution plan
When I first started out as a DBA, some veteran DBAs taught me how to read a parallel execution plan: erase all the PX-related operations and read what is left. For example:
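The plan itself is shown as an image in the original; a sketch of the kind of four-table hash join being discussed (the join columns other than t4.id1 are assumptions) would be:

SQL> select /*+ parallel(2) */ count(*)
       from t1, t2, t3, t4
      where t1.id = t4.id1 and t2.id = t4.id2 and t3.id = t4.id3;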
Is it really equivalent? If we remove all the PX and related operations, what remains does look "equivalent" to the serial execution:
In serial execution, the plan above runs in the following order:
Scan table T3 and build a hash table, scan table T2 and build a hash table, scan table T1 and build a hash table, and finally scan table T4, each scanned row probing the three hash tables built before.
But parallel execution does not necessarily follow that order. A parallel execution plan should be read in the order in which the table queues are created, because that order reflects the order in which data is distributed during parallel execution. So for this parallel plan, the execution order is:
Scan table T1 and build its hash table; T1 is scanned first because its table queue number, TQ10000, is the lowest.
Based on the position of table queue TQ10001, table T4 is scanned next and hash joined with the hash table above.
Based on the position of table queue TQ10002, table T2 is scanned and its hash table built, and the result set produced by the two steps above is hash joined with this hash table.
Based on the position of table queue TQ10003, table T3 is scanned last and its hash table built, and the result set produced by the three steps above is hash joined with it.
V$pq_tqstat
The v$pq_tqstat view is special: its contents are recorded only in the private PGA of the QC process and are populated only after the parallel query finishes. If you cancel the query during parallel execution, the view will still show nothing; and because the data exists only in that process's PGA, you cannot query it from another session.
You can use it to understand how data was distributed through the table queues during parallel execution. Here are two examples.
First, we count table hash_t1 with a degree of parallelism of 4 and, after the execution completes, check the v$pq_tqstat view:
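The statement and the view output appear as images in the original; a minimal sketch of the steps (both must run in the same session, the second immediately after the first completes):

SQL> select /*+ parallel(a 4) */ count(*) from hash_t1 a;
SQL> select dfo_number, tq_id, server_type, process, num_rows, bytes
       from v$pq_tqstat
      order by dfo_number, tq_id, server_type desc, process;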
The four producers each pre-aggregate the rows they scanned and write one row into the table queue; the QC receives those four rows through the table queue. Note that NUM_ROWS represents the amount of data written to and read from the table queue by the PX slaves, so by comparing NUM_ROWS values you can easily see whether the parallel processes' workloads are even and whether there is parallel skew.
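A small sketch of such a skew check (just an aggregation over the same view; what counts as "skewed" is up to you):

SQL> select dfo_number, tq_id, server_type,
            min(num_rows) min_rows, max(num_rows) max_rows
       from v$pq_tqstat
      group by dfo_number, tq_id, server_type
      order by dfo_number, tq_id;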
Now let's look at a more complicated example.
Note the hint in the SQL, which forces the left table of the hash join to use broadcast distribution. Combining the output above with the figure below: each red producer first scans HASH_T2 by ROWID range and broadcasts the data to the blue consumers through table queue TQ10000. Because the DOP is 4, the amount of data each producer writes into the table queue is 4 times what it scans (HASH_T2 really has 9999999 rows, and broadcast distribution generates 9999999 * 4 rows). The consumers receive the data from table queue TQ10000 and build hash tables, each consumer PX slave building a complete hash table of HASH_T2. The blue consumers then scan HASH_T1 and join it against the hash tables built before. Finally, each blue consumer (whose role has by now switched to producer) sends its pre-aggregated result to the QC, and the QC receives four rows.