Statistics on Column Groups Analysis in Oracle 12C 07/13 Update SLTechnology News&Howtos

Statistics on Column Groups Analysis in Oracle 12C

2025-07-13 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)05/31 Report--

This article mainly introduces "Statistics on Column Groups Analysis in Oracle 12C". In daily operation, I believe many people have doubts about Statistics on Column Groups analysis in Oracle 12C. The editor consulted all kinds of data and sorted out simple and easy-to-use operation methods. I hope it will be helpful to answer the doubts of "Statistics on Column Groups Analysis in Oracle 12C"! Next, please follow the editor to study!

Statistics on Column Groups

Individual column statistics are useful for determining the selectivity of a single predicate in a where clause. However, when the where clause contains multiple predicates from different columns of the same table, a single column statistics cannot show the relationship between columns. Using column groups (column group) is used to solve this problem. The optimizer calculates the selectivity of predicates separately and then merges them. However, if there is an association between individual columns, the optimizer will not consider it when evaluating the cardinality, and the optimizer will evaluate the cardinality by multiplying the selectivity of each table predicate by the number of rows.

The following statement queries the dba_tab_col_statistics table to display statistics about the columns cust_state_province and country_id in the sh.customers table.

SQL > COL COLUMN_NAME FORMAT a20SQL > COL NDV FORMAT 999SQL > SELECT COLUMN_NAME, NUM_DISTINCT AS "NDV", HISTOGRAM 2 FROM DBA_TAB_COL_STATISTICS 3 WHERE OWNER = 'SH' 4 AND TABLE_NAME =' CUSTOMERS' 5 AND COLUMN_NAME IN ('CUST_STATE_PROVINCE',' COUNTRY_ID') COLUMN_NAME NDV HISTOGRAM- CUST_STATE_PROVINCE 145 FREQUENCYCOUNTRY_ID 19 FREQUENCY

The following statement queries the number of customers living in California 3341:

SQL > SELECT COUNT (*) 2 FROM sh.customers 3 WHERE cust_state_province = 'CA'; COUNT (*)-3341

To show the query execution of the number of customers whose state is CA,country_id 52790 (USA)

SQL > EXPLAIN PLAN FOR 2 SELECT * 3 FROM sh.customers 4 WHERE cust_state_province = 'CA' 5 AND country_id=52790;Explained.SQL > SELECT * FROM TABLE (DBMS_XPLAN.DISPLAY) PLAN_TABLE_OUTPUT----Plan hash value: 2008213504 Murray- -| Id | Operation | Name | Rows | Bytes | Cost (% CPU) | Time |- -| 0 | SELECT STATEMENT | | 1115 | 205k | 423 (1) | 00:00:01 | | * 1 | TABLE ACCESS FULL | CUSTOMERS | 1115 | 205k | 423 (1) | 00:00:01 |- -Predicate Information (identified by operation id):-PLAN_TABLE_OUTPUT- -1-filter ("CUST_STATE_PROVINCE" = 'CA' AND "COUNTRY_ID" = 52790) Note--automatic DOP: Computed Degree of Parallelism is 1 because of parallel threshold17 rows selected.

Based on the statistics of single-column country_id and cust_state_province columns, the optimizer estimates that the number of California customers living in USA is 1115, but there are actually 3341, but the optimizer does not know it, so the number of rows returned is reduced through all predicates, thus greatly reducing the evaluation base. You can let the optimizer know the real relationship between column country_id and cust_state_province by collecting column group statistics.

Automatic and manual column group statistics

The Oracle database can create column group statistics automatically or manually. The optimizer can use SQL execution plan instructions to generate a better execution plan. If the dbms_stats reference parameter auto_stat_extensions is set to ON (the default is OFF), the SQL execution plan instruction can automatically trigger to create column group statistics based on the usage of predicates in the workload. The auto_stat_extensions can be set through the set_table_prefs,set_global_prefs or set_schema_prefs procedure.

You can use dbms_stats to do the following when you want to manage column group statistics manually:

. Probe train group

. Create previously detected column groups

. Manually create column groups and collect column group statistics

Column Group Statistics user Interface

Several dbms_stats program units have reference parameters related to column groups

The seed_col_usage procedure iterates over the SQL statements in the specified workload, compiles them, and then looks at the usage information of the columns that appear in the statement predicates. In order to determine the appropriate column group, the database must observe a representative workload. There is no need to run the query itself during monitoring. Explain plan can be performed on queries that take a long time in the workload to ensure that the database records the column group information used by these queries.

The report_col_usage function, which generates a report listing the filter predicates seen in the workload, joining predicates and columns in the group by clause. You can use this function to check the usage information for the columns recorded in the specified table.

Create_extended_stats function to create an extension, which can be a column group or an expression. The database collects statistics for the extension when the user manually or automatically collects statistics from the table.

Auto_stat_extensions references the parameter, which controls the automatic creation of extensions, including column groups, which are set using set_table_prefs,set_schema_prefs or set_global_prefs when optimizer statistics are collected. When auto_stat_extensions is set to off (the default), the database does not automatically create column group statistics. In order to create an extension, you must execute the create_extended_stats function or explicitly specify the extension statistics in the method_opt parameter in dbms_stats API. When auto_stat_extensions is set to ON, an SQL execution plan instruction can trigger the automatic creation of column group statistics based on the usage information of the columns in the predicate in the workload.

Detect useful column groups for a specific workload

You can use dbms_stats.seed_col_usage and report_col_usage to determine which table requires column groups based on a specific amount of work. This technique is useful when you don't know what kind of extended statistics you need to create. This technique does not work for extending statistics.

Suppose the following conditions exist:

. Query the sh.customers_ test table (created with the customers table) and use the country_id and cust_state_province columns in the predicate but the cardinality evaluation is incorrect.

. You want the database monitoring workload for 5 minutes (300 seconds).

. If you want the database to automatically determine which column groups are needed.

In order to detect column groups, you need to do the following:

1. Start SQL*Plus or SQL Developer and log in to the database as user sh

two。 Create the table customers_test and collect statistics:

SQL > DROP TABLE customers_test;Table dropped.SQL > CREATE TABLE customers_test AS SELECT * FROM customers;Table created.SQL > EXEC DBMS_STATS.GATHER_TABLE_STATS (user, 'customers_test'); PL/SQL procedure successfully completed.

3. Enable workload monitorin

In different SQL*Plus sessions, log in as the sys user and execute the following PL/SQL program to enable monitoring for 300 seconds:

SQL > BEGIN 2 DBMS_STATS.SEED_COL_USAGE (null,null,300), 3 END; 4 / PL/SQL procedure successfully completed.

4. Use the user sh to parse the execution plan of the two queries in the case of using the workload.

SQL > EXPLAIN PLAN FOR 2 SELECT * 3 FROM customers_test 4 WHERE cust_city = 'Los Angeles' 5 AND cust_state_province =' CA' 6 AND country_id = 52790 exploring. SQL > SELECT PLAN_TABLE_OUTPUT 2 FROM TABLE (DBMS_XPLAN.DISPLAY ('plan_table', null,'basic rows')) Plan hash value: 2112738156-| Id | Operation | Name | Rows |- -| 0 | SELECT STATEMENT | | 1 | | 1 | TABLE ACCESS FULL | CUSTOMERS_TEST | 1 |-- 8 rows selected.SQL > EXPLAIN PLAN FOR 2 SELECT country_id Cust_state_province, count (cust_city) 3 FROM customers_test 4 GROUP BY country_id, cust_state_province Explained.SQL > SELECT PLAN_TABLE_OUTPUT 2 FROM TABLE (DBMS_XPLAN.DISPLAY ('plan_table', null,'basic rows')) Plan hash value: 1820398555-| Id | Operation | Name | Rows |- -| 0 | SELECT STATEMENT | | 1949 | | 1 | HASH GROUP BY | | 1949 | | 2 | TABLE ACCESS FULL | CUSTOMERS_TEST | 55500 |-9 rows selected.

The first execution plan shows a cardinality of 1 and the query returns 932 rows of records, and the second execution plan shows a radix of 1949 and the query returns 145 rows of records.

5. Optional operation to check the usage information for the columns recorded in the table customers_test

SQL > SET LONG 100000SQL > SET LINES 120SQL > SET PAGES 0SQL > SELECT DBMS_STATS.REPORT_COL_USAGE (user, 'customers_test') 2 FROM DUAL LEGEND:.EQ: Used in single table EQuality predicateRANGE: Used in single table RANGE predicateLIKE: Used in single table LIKE predicateNULL: Used in single table is (not) NULL predicateEQ_JOIN: Used in EQuality JOIN predicateNONEQ_JOIN: Used in NON EQuality JOIN predicateFILTER: Used in single table FILTER predicateJOIN: Used in JOIN predicateGROUP_BY: Used in GROUP BY expression. ... # # COLUMN USAGE REPORT FOR SH.CUSTOMERS_TEST..1. COUNTRY_ID: EQ2. CUST_CITY: EQ3. CUST_STATE_PROVINCE: EQ4. (CUST_CITY, CUST_STATE_PROVINCE, COUNTRY_ID): FILTER5. (CUST_STATE_PROVINCE, COUNTRY_ID): GROUP_BY####

In the above report, the first three columns are the three columns used in the medium value predicate of the first monitoring query:

... WHERE cust_city = 'Los Angeles'AND cust_state_province =' CA'AND country_id = 52790

All three are now listed in the same where clause, so the report shows them as a group. In the second query, two columns appear in the group by clause, so the report marks them as group_by. Column groups in filter and group_by are candidates for column groups.

Create detected column groups under workload monitoring

You can use the dbms_stats.create_extended_stats function to create a column group for executing the column groups detected by dbms_stats.seed_col_usage, as follows:

1. To create a column group for the customers_test table based on the column usage information captured during the monitoring window, execute the following query

SQL > SELECT DBMS_STATS.CREATE_EXTENDED_STATS (user, 'customers_test') FROM DUAL # EXTENSIONS FOR SH.CUSTOMERS_TEST..1. (CUST_CITY, CUST_STATE_PROVINCE, COUNTRY_ID): SYS_STUMZ$C3AIHLPBROI#SKA58H_N created2. (CUST_STATE_PROVINCE, COUNTRY_ID): SYS_STU#S#WF25Z#QAHIHE#MOFFMM_ created####

The database will create two column groups for the customers_test table: a column group is a filter predicate and a column group is a group by operation.

two。 Re-collect table statistics

SQL > EXEC DBMS_STATS.GATHER_TABLE_STATS (user,'customers_test'); PL/SQL procedure successfully completed.

3. Query the user_tab_col_statistics view with the user sh to determine which additional statistics the database created:

SQL > SELECT COLUMN_NAME, NUM_DISTINCT, HISTOGRAM 2 FROM USER_TAB_COL_STATISTICS 3 WHERE TABLE_NAME = 'CUSTOMERS_TEST' 4 ORDER BY 1 COUNTRY_ID 19 FREQUENCYCUST_CITY 620 HYBRIDCUST_CITY_ID 620 NONECUST_CREDIT_LIMIT 8 NONECUST_EFF_FROM 1 NONECUST_EFF_TO 0 NONECUST_EMAIL 1699 NONECUST_FIRST_NAME 1300 NONECUST_GENDER 2 NONECUST_ID 55500 NONECUST_INCOME_LEVEL 12 NONECUST_LAST_NAME 908 NONECUST_MAIN_PHONE_NUMBER 51344 NONECUST_MARITAL_STATUS 11 NONECUST_POSTAL_CODE 623 NONECUST_SRC_ID 0 NONECUST_STATE_PROVINCE 145 FREQUENCYCUST _ STATE_PROVINCE_ID 145 NONECUST_STREET_ADDRESS 49900 NONECUST_TOTAL 1 NONECUST_TOTAL_ID 1 NONECUST_VALID 2 NONECUST_YEAR_OF_BIRTH 75 NONESYS_STU#S#WF25Z#QAHIHE#MOFFMM_ 145 NONESYS_STUMZ$C3AIHLPBROI#SKA58H_N 620 HYBRID25 rows selected.

The above query shows the two column group names returned by the dbms_stats.create_extended_stats function. The column groups created for the CUST_CITY, CUST_STATE_PROVINCE, and COUNTRY_ID columns have a histogram statistics of type HYBRID.

4. Parse the execution plan of the previous two query statements again

SQL > EXPLAIN PLAN FOR 2 SELECT * 3 FROM customers_test 4 WHERE cust_city = 'Los Angeles' 5 AND cust_state_province =' CA' 6 AND country_id = 52790 exploring. SQL > SELECT PLAN_TABLE_OUTPUT 2 FROM TABLE (DBMS_XPLAN.DISPLAY ('plan_table', null,'basic rows')) Plan hash value: 2112738156-| Id | Operation | Name | Rows |- -| 0 | SELECT STATEMENT | | 874 | 1 | TABLE ACCESS FULL | CUSTOMERS_TEST | 874 |-8 rows selected.SQL > EXPLAIN PLAN FOR 2 SELECT country_id Cust_state_province, count (cust_city) 3 FROM customers_test 4 GROUP BY country_id, cust_state_province Explained.SQL > SELECT PLAN_TABLE_OUTPUT 2 FROM TABLE (DBMS_XPLAN.DISPLAY ('plan_table', null,'basic rows')) Plan hash value: 1820398555-| Id | Operation | Name | Rows |- -| 0 | SELECT STATEMENT | 145 | 1 | HASH GROUP BY | | 2 | TABLE ACCESS FULL | CUSTOMERS_TEST | 55500 |-9 rows selected.

The cardinality of the first query evaluation is 874, the number of records to be returned is 932, the cardinality of the second query evaluation is 145, and the number of records to be returned is 145, so the number of records evaluated by the cardinality is very close to the actual returned records. this is the benefit of column group statistics.

Manually create and collect column group statistics

In some cases, you may know the column group you want to create. The method_opt parameter of the dbms_stats.gather_table_stats function automatically creates and collects column group statistics. You can create a new column group by using for columns to specify a column group.

Suppose the following conditions exist:

. You want to create a column group for the cust_state_province and country_id columns on the sh.customers table.

. Want to collect statistics on sh.customers tables and new column groups.

Manually create and collect column group statistics to do the following:

1. Start SQL*Plus and log in to the database as sh.

two。 Use the following PL/SQL program to create column groups and collect statistics:

SQL > BEGIN 2 DBMS_STATS.GATHER_TABLE_STATS ('sh','customers', 3 METHOD_OPT = >' FOR ALL COLUMNS SIZE SKEWONLY'| | 4 'FOR COLUMNS SIZE SKEWONLY (cust_state_province,country_id)'); 5 END; 6 / PL/SQL procedure successfully completed.

Show column group information

To get the column group name, you can use the dbms_stats.show_extended_stats_name function or the database view. You can also use views to obtain information such as the number of distinct values and whether the column group has histogram statistics.

1. Start SQL*Plus and log in to the database as sh.

two。 To get the column group name, execute the following PL/SQL program

SQL > SELECT SYS.DBMS_STATS.SHOW_EXTENDED_STATS_NAME ('sh','customers', 2' (cust_state_province,country_id)') col_group_name 3 FROM DUAL;COL_GROUP_NAME--SYS_STU#S#WF25Z#QAHIHE#MOFFMM_

Query user_stat_extensions view

SQL > SELECT EXTENSION_NAME, EXTENSION 2 FROM USER_STAT_EXTENSIONS 3 WHERE TABLE_NAME='CUSTOMERS' EXTENSION_NAME EXTENSION -SYS_STU#S#WF25Z#QAHIHE#MOFFMM_ ("CUST_STATE_PROVINCE" "COUNTRY_ID")

3. Query the number of distinct values of the column group created and see if the histogram has been created

SQL > SELECT e.EXTENSION col_group, t.NUM_DISTINCT, t.HISTOGRAM 2 FROM USER_STAT_EXTENSIONS e, USER_TAB_COL_STATISTICS t 3 WHERE e.EXTENSION_NAME=t.COLUMN_NAME 4 AND e.TABLE_NAME=t.TABLE_NAME 5 AND t.TABLENAMEN customers COL_GROUP NUM_DISTINCT HISTOGRAM -("CUST_STATE_PROVINCE" "COUNTRY_ID") 145 FREQUENCY

Delete column group

You can use the dbms_stats.drop_extended_stats function to delete a column group from a table

SQL > BEGIN 2 DBMS_STATS.DROP_EXTENDED_STATS ('sh',' customers', 3'(cust_state_province, country_id)'); 4 END; 5 / PL/SQL procedure successfully completed. At this point, the study of "Statistics on Column Groups Analysis in Oracle 12C" is over. I hope to be able to solve your doubts. The collocation of theory and practice can better help you learn, go and try it! If you want to continue to learn more related knowledge, please continue to follow the website, the editor will continue to work hard to bring you more practical articles!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.