In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-19 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
Sort operation
Sorting is one of the most basic operations undertaken by computers, especially in the field of database processing, oracle is no exception. Operations that may require oracle to sort data include the following
(1) create an index
(2) grouping or aggregating data through group by,unique or distinct keywords
(3) due to the use of the order by clause, the data is returned in the arranged order.
(4) join tables or result sets by sorting and merging
(5) use collections to operate union,intersect or minus
(6) execute specific subqueries
Queries may require considerable resources, as specified below:
CPU is always consumed. The number of CPU required is proportional to the size of the result set that needs to be sorted.
Oracle allocates a block of memory for sorting. This memory comes from the program global area (PGA). The total amount of PGA memory available usually depends on the value of MEMORY_TARGET or PGA_AGGREGATE_TARGET
If the memory area is insufficient to complete the sorting, oracle allocates one or more temporary periods to the temporary tablespace. This is called disk sorting. Disk sorting adds additional overhead, including the IO consumption of allocating space during temporary periods and writing data to temporary tablespaces and disk readout.
1.1 optimal, one or more sorting
The amount of memory available for sorting is the most important factor that determines sorting performance. Performance will be optimal if the amount of memory used for sorting is sufficient for sorting to be done in memory, and oracle actually calls this type of sorting optimal sorting.
If there is not enough memory, oracle must read and write the temporary period when performing a sort operation. In a sort, orale only needs to write (and then read back) a short. In multiple sorting, oracle needs to write and read back multiple sort segments. The more sorting times you need. The more IO is included in the sort, the worse the sorting performance.
The IO required for sorting increases sharply with the increase of sorting times, and sorting times eventually become the main factor affecting the performance of sql.
1.2 measure sorting activity
Now let's see how sorting activities can be measured.
The optimizer calculates the amount of memory it thinks it will need, and you can see this by checking the tempSpc field of the dbms_xplan output (even before you execute sql)
SQL > explain plan for select * from customers order by cust_last_name,cust_first_name,cust_year_of_birth
Explained.
SQL > select * from table (dbms_xplan.display ())
PLAN_TABLE_OUTPUT
-
Plan hash value: 2792773903
-
| | Id | Operation | Name | Rows | Bytes | TempSpc | Cost (% CPU) | Time |
-
| | 0 | SELECT STATEMENT | | 55500 | 9810K | | 2609 (1) | 00:00:02 |
| | 1 | SORT ORDER BY | | 55500 | 9810K | 12m | 2609 (1) | 00:00:02 |
| | 2 | TABLE ACCESS FULL | CUSTOMERS | 55500 | 9810K | | 405K (1) | 00:00:01 |
-
9 rows selected.
The value generated by DBMS_XPLAN is an estimate, but it gives you a general idea of the memory you need.
The V$SQL_WORKAREA view contains detailed statistics about memory usage and sorting behavior that have occurred in the database.
WITH sql_workarea AS
(SELECT sql_id | |'-'| | child_number sql_id_child
Operation_type operation
Last_execution last_exec
Round (active_time / 1000000, 2) seconds
Optimal_executions | |'/'| | onepass_executions | |'/'| |
Multipasses_executions olm
Substr (sql_text, 1,155) sql_text
Rank () over (ORDER BY active_time DESC) ranking
FROM v$sql_workarea
JOIN v$sql
USING (sql_id, child_number))
SELECT sql_id_child "SQL_ID-CHILD"
Seconds
Operation
Last_exec
Olm "Oswap 1mop M"
Sql_text
FROM sql_workarea
WHERE ranking oradebug event 10032 trace name context forever,level 12
-Sort Parameters
Sort_area_size 12691456
Sort_area_retained_size 12691456
Sort_multiblock_read_count 1
Max intermediate merge width 773
* 2014-12-15 13 13 14-34-36. 765
Sorcls: sorp 0x7fd8df3ea1c8
-Sort Statistics
Input records 55500
Output records 55500
Total number of comparisons performed 656484
Comparisons performed by in-memory sort 656484
Total amount of memory used 12691456
Uses version 2 sort
-End of Sort Statistics
1.4 use indexes to circumvent sorting
If indexes exist on some or all of the columns in the order by clause, it is possible for oracle to use indexes to get records in the required order, thus avoiding sorting operations.
If the index appears on the same column as the column in the order by clause, oracle can read the records directly from the index in the order in which they are sorted. However, reading records in the order of keystrokes requires a full scan of index leaf blocks one by one. Although fast full scan is much more efficient than regular full index scan, fast full scan cannot return records in indexed order. Therefore, it cannot be used to avoid sorting operations.
SQL > select * from customers order by cust_last_name,cust_first_name,cust_year_of_birth
55500 rows selected.
Execution Plan
Plan hash value: 2792773903
-
| | Id | Operation | Name | Rows | Bytes | TempSpc | Cost (% CPU) | Time |
-
| | 0 | SELECT STATEMENT | | 55500 | 9810K | | 2609 (1) | 00:00:02 |
| | 1 | SORT ORDER BY | | 55500 | 9810K | 12m | 2609 (1) | 00:00:02 |
| | 2 | TABLE ACCESS FULL | CUSTOMERS | 55500 | 9810K | | 405K (1) | 00:00:01 |
-
Statistics
1 recursive calls
0 db block gets
1456 consistent gets
1454 physical reads
0 redo size
6366832 bytes sent via SQL*Net to client
41213 bytes received via SQL*Net from client
3701 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
55500 rows processed
However, if we create an index on the column of order by and use the first_rows hint (or index hint), oracle will use the index instead of sorting:
SQL > create index cust_member_i on customers (cust_last_name,cust_first_name,cust_year_of_birth)
Index created.
SQL > select / * + index (customers cust_member_i) * / * from customers order by cust_last_name,cust_first_name,cust_year_of_birth
55500 rows selected.
Execution Plan
Plan hash value: 3023887059
-
| | Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time |
-
| | 0 | SELECT STATEMENT | | 55500 | 9810K | 20550 (1) | 00:00:15 |
| | 1 | TABLE ACCESS BY INDEX ROWID | CUSTOMERS | 55500 | 9810K | 20550 (1) | 00:00:15 |
| | 2 | INDEX FULL SCAN | CUST_MEMBER_I | 55500 | | 225 (0) | 00:00:01 |
-
Statistics
1 recursive calls
0 db block gets
26557 consistent gets
1701 physical reads
0 redo size
11063057 bytes sent via SQL*Net to client
41213 bytes received via SQL*Net from client
3701 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
55500 rows processed
Although sorting is no longer required by using an index, the cost of reading both the index and the table block, which has been read sequentially by the block, is much less expensive than reading the table block using a full table scan alone. In general, this means that using indexes to avoid sorting can actually lead to worse performance. However, it is faster to retrieve the first row of records using the index. Because as soon as the required record is retrieved, it returns immediately. By contrast, the sorting method requires that all records must be retrieved and sorted before any record is returned. Therefore, when the optimizer targets first_rows_n, the optimizer tends to use indexes, while when the target is all_rows, full table scans are used.
2 grouping and aggregation
The aggregate operation returns data, average, summation, maximum and minimum values based on multiple rows of records. The grouping operation allows us to return a collection of recordsets that share the same group by value.
2.1 aggregation operation
Aggregate operations (sum and average) must process each row of input data. Therefore, they are usually associated with full table scans:
SQL > select sum (quantity_sold) from sales
Execution Plan
Plan hash value: 3519235612
-
| | Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time | Pstart | Pstop |
-
| | 0 | SELECT STATEMENT | | 1 | 3 | 525 (2) | 00:00:01 |
| | 1 | SORT AGGREGATE | | 1 | 3 |
| | 2 | PARTITION RANGE ALL | | 918K | 2691K | 525 (2) | 00:00:01 | 1 | 28 |
| | 3 | TABLE ACCESS FULL | SALES | 918K | 2691K | 525 (2) | 00:00:01 | 1 | 28 | |
-
If there is an index on the aggregated column, a fast full scan of that index is usually more efficient:
2.2 maximum and minimum
Unlike most other aggregation operations, max and min operations do not need to read every row of records if there is an index on the relevant column. If there is a B-tree index, we can determine the maximum or minimum value by checking the first or last index entry, which requires only 3-5 logical reads.
SQL > select max (amount_sold) from sales; (no index)
Execution Plan
Plan hash value: 3519235612
-
| | Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time | Pstart | Pstop |
-
| | 0 | SELECT STATEMENT | | 1 | 5 | 525 (2) | 00:00:01 |
| | 1 | SORT AGGREGATE | | 1 | 5 |
| | 2 | PARTITION RANGE ALL | | 918K | 4486K | 525K (2) | 00:00:01 | 1 | 28 |
| | 3 | TABLE ACCESS FULL | SALES | 918K | 4486K | 525K (2) | 00:00:01 | 1 | 28 | |
-
Statistics
0 recursive calls
0 db block gets
1635 consistent gets
1619 physical reads
0 redo size
536 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
Where there is an index:
SQL > select max (amount_sold) from sales
Execution Plan
Plan hash value: 781264156
-
| | Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time |
-
| | 0 | SELECT STATEMENT | | 1 | 5 | 3 (0) | 00:00:01 |
| | 1 | SORT AGGREGATE | | 1 | 5 |
| | 2 | INDEX FULL SCAN (MIN/MAX) | AMOUNT_SOLD_IDX | 1 | 5 | 3 (0) | 00:00:01 |
-
Statistics
1 recursive calls
0 db block gets
3 consistent gets
2 physical reads
0 redo size
536 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
However, if we want to find the maximum and minimum values at the same time, oracle needs to spend a lot of logical reading, and the full scan index will be more expensive, and oracle chooses full table scan.
SQL > select max (amount_sold), min (amount_sold) from sales
Execution Plan
Plan hash value: 3519235612
-
| | Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time | Pstart | Pstop |
-
| | 0 | SELECT STATEMENT | | 1 | 5 | 525 (2) | 00:00:01 |
| | 1 | SORT AGGREGATE | | 1 | 5 |
| | 2 | PARTITION RANGE ALL | | 918K | 4486K | 525K (2) | 00:00:01 | 1 | 28 |
| | 3 | TABLE ACCESS FULL | SALES | 918K | 4486K | 525K (2) | 00:00:01 | 1 | 28 | |
-
Statistics
1 recursive calls
0 db block gets
1635 consistent gets
1619 physical reads
0 redo size
618 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
It is actually a better way to submit MAX and MIN queries separately and merge the results together:
SQL > select max_sold,min_sold from (select max (amount_sold) max_sold from sales) maxt, (select min (amount_sold) min_sold from sales) mint
Execution Plan
Plan hash value: 3650580342
-
| | Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time |
-
| | 0 | SELECT STATEMENT | | 1 | 26 | 6 (0) | 00:00:01 |
| | 1 | NESTED LOOPS | | 1 | 26 | 6 (0) | 00:00:01 |
| | 2 | VIEW | | 1 | 13 | 3 (0) | 00:00:01 |
| | 3 | SORT AGGREGATE | | 1 | 5 |
| | 4 | INDEX FULL SCAN (MIN/MAX) | AMOUNT_SOLD_IDX | 1 | 5 | 3 (0) | 00:00:01 |
| | 5 | VIEW | | 1 | 13 | 3 (0) | 00:00:01 |
| | 6 | SORT AGGREGATE | | 1 | 5 |
| | 7 | INDEX FULL SCAN (MIN/MAX) | AMOUNT_SOLD_IDX | 1 | 5 | 3 (0) | 00:00:01 |
-
Statistics
0 recursive calls
0 db block gets
6 consistent gets
0 physical reads
0 redo size
602 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
2.3 "Top N" query
For the top 10 best-selling items in dollar terms on the sales table, the following query must be wrong:
SQL > select * from sales where rownum select / * top10 subquery * / * from (select cust_id,prod_id,time_id,amount_sold from sales order by amount_sold desc) where rownum
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.