In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-14 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)06/01 Report--
Original text link http://www.oracle.com/technetwork/database/bi-datawarehousing/twp-bp-for-stats-gather-12c-1967354.pdf
Translator Yang Yuhang of science and technology
When to collect statistics
In order to choose the best execution plan, the optimizer must have access to representative statistics. Representative statistics need not be up-to-date, but a set that helps the optimizer determine the number of rows that can be returned for each operation in the execution plan.
Automatic statistics collection task
Oracle collects statistics for all missing or expired objects in the database during predefined maintenance windows (weekday 10pm to 2am and weekend 6am to 2am). You can change the maintenance window in Oracle Enterprise Manager or using the DBMS_SCHEDULER and DBMS_AUTO_TASK_ADMIN software packages.
Figure 3: maintenance window for changing the run time of an automatic statistics collection job
If you already have a working statistics collection program, or if you want to disable the automatic statistics collection task, you can use the following command:
Begin
Dbms_auto_task_admin.disable (
Client_name= > 'auto optimizer stats collection'
Operation= > null
Window_name= > null)
End
/
Manual statistical collection
If you plan to maintain optimizer statistics manually, you need to determine when to collect statistics. Based on out-of-date information, you can decide when statistics should be collected, and you can determine when to collect statistics based on the invalidity of statistics, just like automatic jobs, or based on the time new data is loaded in your environment. If there is no significant change in the basic data, it is not recommended to re-collect statistics, as this will unnecessarily waste system resources.
If data is loaded into your environment only during predefined ETL or ELT jobs, you can schedule statistics collection operations as part of this process.
Collect statistics online
In Oracle Database 12c, online collection statistics "piggybacks" are collected as part of direct path data loading operations, such as creating tables like CTAS and inserting data in IAS. Statistics are collected as part of the data load operation, which means that statistics can be provided immediately after the data is loaded without the need for additional full table scans.
Figure 4: online collection statistics provide table and column statistics collection for the newly created sales2 table
Collecting statistics online does not collect histogram and index statistics because these types of statistics require additional data scanning, which can have a significant impact on performance when the data is loaded.
If you want to collect histogram or index statistics instead of re-collecting basic column statistics, use the new selection "OPTIONS" parameter in DBMS_STATS.GATHER_TABLE_STATS to set to GATHER AUTO.
Note that for performance reasons, GATHER AUTO uses sample data in the row instead of all the data in the table when generating the histogram.
Figure 5: set the option to GATHER AUTO to create a histogram on the SALES2 table without involving basic statistics
Note that the column "HISTOGRAM_ONLY" indicates that the histogram was collected without re-collecting the basic statistics of the column. There are two ways to confirm that online collection of statistics occurs: one way is to check the execution plan to see if "OPTIMIZER STATISTICS GATHERING" appears in the execution plan, and the other is to see if the status of the notes field in the USER_TAB_COL_ statistics table is stats_on_load.
Figure 6: execution plan for online statistics collection operation
In order to reduce the impact on the performance of direct path loading, online statistics collection is triggered only when the loaded object is empty. To ensure that statistics are collected online when a new partition of an existing table is loaded, specify the partition explicitly using extended syntax. In this case, partition-level statistics are created, but global-level (table-level) statistics are not updated. If incremental statistics are enabled on a partitioned table, a "synopsis" is created in the data load operation.
You can use HINT: NO_GATHER_OPTIMIZER_STATISTICS to disable a single SQL statement to use online collection of statistics.
Incremental statistics and partition exchange data loading
Statistics collection for partitioned tables includes table-level (global) and (sub) partition-level statistics. If the "incremental" preference of the partitioned table is set to true, then the GRANULARITY in the DBMS_STATS.GATHER_*_STATS parameter contains global,ESTIMATE_PERCENT set to AUTO_SAMPLE_SIZE,Oracle will scan only those added or modified partitions, not the entire table, resulting in accurate statistics at all global levels. Incremental global statistics work by storing "synopsis" for each partition in the table, and "synopsis" is statistical metadata for that partition and the columns in the partition, eliminating the need to scan the entire table. Aggregating partition-level statistics and "synopses" for each partition accurately generates global statistics, eliminating the need to scan the entire table.
When a new partition is added to the table, you only need to collect statistics for the new partition. Using the new partition "synopsis" and the existing partition's "synopses", the table-level statistics will be calculated automatically and accurately.
Note that when incremental statistics are enabled, partition statistics are not aggregated from subpartition statistics.
If you are using partition swapping and want to take advantage of incremental statistics, you need to set the DBMS_STATS preference INCREMENTAL_LEVEL on the non-partitioned table to determine that it will be used during partition swapping. Set INCREMENTAL_LEVEL to TABLE, and when statistics are collected on it, Oracle automatically creates a "synopsis", and this table-level "synopsis" becomes a partition-level "synopsis" after the partition exchange.
However, if there are many online transactions in your environment that insert a small amount of data during the day, you need to determine when the statistics are out of date and then trigger an automatic statistics collection task.
If you plan to rely on the stale_ stats column in user _ tab _ statistics to determine whether the statistics are out of date, you should note that this information is updated only on a daily basis.
If you need to know in time what DML is happening on your table, you need to look at the USER_TAB_MODIFICATIONS table, which lists the number of INSERT, UPDATE, and DELETE that have occurred on each table, whether the table has been TRUNCATED (TRUNCATE column) and calculates whether it is out of date.
Third, you should notice that this information is automatically updated from memory, and if you need the latest information, you need to use DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO to manually refresh the information.
Prevent "out of range" condition
Whether you use the automatic statistics collection task or collect statistics manually, if the end user starts querying newly inserted data before collecting statistics, even if less than 10% of the rows in the table are changed, a suboptimal execution plan may be obtained as a result of stale statistics, one of the most common cases occurs when the value provided in the where clause predicate exceeds the range of [min, max] column statistics That's what happens. This is often referred to as an "out of scope" error. In this case, the optimizer proportionally allocates the selectivity according to the distance between the predicate values and the maximum value (assuming that the value is higher than the maximum value), that is, the maximum or minimum value, thereby reducing the selectivity.
This scenario and range partitioning tables are very common. A new partition is added to an existing range partition table, and rows are inserted into the partition. Users start querying this new data before this new partition collects statistics. For partitioned tables, you can use DBMS_STATS.COPY_TABLE_STATS (starting with Oracle database 10.2.0.4) to prevent the "out of range" condition. This procedure copies the statistics of the representative source [sub] partition to the newly created and empty target [sub] partition.
It also copies statistics for related objects: columns, local (partitioned) indexes, and so on. The highest limit value is taken as the maximum value of the partition column and the highest limit value of the previous partition as the minimum value of the partition column.
The copied statistics should be treated only as a temporary solution until accurate statistics for the partition can be collected. Copied statistics should not be used as an alternative to actually collecting statistics.
By default, DBMS_STATS.COPY_TABLE_STATS is only used to adjust partition statistics, not global or table-level statistics. If you want to update the global level statistics of the partition column as part of the copy, you need to set the parameter of DBMS_STATS.COPY_TABLE_STATS to 8.
For non-partitioned tables, you can manually set the maximum value of the column using DBMS_STATS.SET_COLUMN_STATS. However, it is generally not recommended to use this method instead of actually collecting statistics.
Improve the quality of statistical information
Good quality statistics are critical to generating the best SQL execution plan, but sometimes the quality of the statistics is poor, and this situation may be ignored. For example, the "inherited" system may use scripts that the database administrator cannot understand, or may be understandable but unwilling to change them. However, because Oracle continues to improve its statistics collection capabilities, best practices and recommendations may be ignored. For these reasons, the Oracle database 12cR2 includes a new optimization statistics consultant to help you improve the quality of the statistics in the database. The optimized statistical consultant analyzes the information in the data dictionary, evaluates the quality of the statistics, and discovers how to collect statistics. It will report poor quality and missing statistics and make recommendations to address these problems.
The principle of its operation is to apply best practice rules to identify potential problems. These issues are reported as a series of results, which in turn can produce specific recommendations. These recommendations can be implemented automatically through actions (executed by the database administrator immediately or through automatically generated scripts).
Figure 7: optimization Statistics Advisor
The optimize Statistics Advisor task is run automatically in the maintenance window, but can also be run on demand. You can view html or text reports generated by tuning Statistics Advisor at any time, and you can perform actions at any time. Figure 8 illustrates an example of a specific rule that leads to finding, advising, and resolving problems:
Figure 8: examples of rules, lookups, suggestions, and actions.
Optimize the statistics consultant to collect and store the information in the data dictionary. Its performance overhead is low because it analyzes the collected statistics (which have been saved in the data dictionary) and does not re-analyze the statistics of objects that have been stored in the application Schema.
Figure 9: reading the data dictionary, performing tasks through filters and storing results
When the task is complete, you can generate the report in html or text format, or you can create an SQL script.
Figure 10: report the advisor task and generate the action SQL script.
Viewing reports generated by automated tasks is very simple:
Select dbms_stats.report_advisor_task ('auto_stats_advisor_task') as report from dual
Alternatively, users with permission to use consultants can perform tasks manually and report the results using the following three steps:
DECLARE
Tname VARCHAR2 (32767): = 'demo';-- task name
BEGIN
Tname: = dbms_stats.create_advisor_task (tname)
END
/
DECLARE
Tname VARCHAR2 (32767): = 'demo';-- task name
Ename VARCHAR2 (32767): = NULL;-- execute name
BEGIN
Ename: = dbms_stats.execute_advisor_task (tname)
END
/
SELECT dbms_stats.report_advisor_task ('demo') AS report
FROM dual
Actions generated by tuning Statistics Advisor can be implemented immediately:
DECLARE
Tname VARCHAR2 (32767): = 'demo';-- task name
Impl_result CLOB;-report of
Implementation
BEGIN
Impl_result: = dbms_stats.implement_advisor_task (tname)
END
/
In addition, Oracle12c Real Application Testing includes practical performance assurance features, such as SQL performance consultant quick inspection. See the Oracle white paper, "Database 12c Real Application Testing Overview" for more details (see Resources on page 21).
Quickly collect statistics
As data grows and maintenance windows shrink, it is more important than ever to collect statistics in a timely manner. Oracle provides a variety of ways to accelerate statistics collection, from parallelizing statistics collection to generating statistics instead of collecting statistics.
Use the parallel method
Parallelism can be used for statistical collection in several ways
? Use the DEGREE parameter
? Concurrent statistical collection
? Combine DEGREE with concurrent collection
Use parallel parameters
The "DEGREE" parameter in DBMS_STATS is used to control the number of parallel execution processes when collecting statistics. By default, Oracle uses the same number of parallel server processes as specified by the attribute (parallelism) of the table in the data dictionary. All tables with this property in the Oracle database are set to 1 by default. In order to speed up the collection of statistics, you can specify this parameter when collecting statistics on large tables, or you can set degree to auto_degree;Oracle that will automatically determine the appropriate number of parallel server processes that should be used to collect statistics based on the size of the object. This value can range from 1 (serial execution) (small objects) to DEFAULT_DEGREE (PARALLEL_THREADS_PER_CPU XCPU_COUNT) (larger objects).
Figure 11: using parallelism with the DEGREE parameter
You should note that setting the DEGREE of the partition table means that multiple parallel server processes will be used to collect statistics for each partition, but not on different partitions at the same time. Statistics will be collected once on each partition.
Concurrent statistical collection
The concurrent statistics collection feature can collect statistics concurrently on multiple tables and multiple (sub) partitions in Schema (or Database). The overall time required to collect statistics can be reduced by allowing Oracle to take full advantage of the multiprocessor environment while collecting statistics for multiple tables and (sub) partitions.
The collection of concurrency statistics is controlled by the global option CONCURRENT. You can set MANUAL,AUTOMATIC, ALL, and OFF. The default is OFF. When CONCURRENT is enabled, Oracle uses the Job Scheduler and advanced queue components to create and manage the concurrent execution of multiple statistics collection jobs.
Call DBMS_STATS.GATHER_TABLE_STATS on the partitioned table, and when CONCURRENT is set to MANUAL, Oracle creates a separate statistics collection job for each (sub) partition in the table. How many jobs are executed concurrently and how many jobs are queued are determined by the number of processes in the available job queue (JOB_QUEUE_PROCESSES initialization parameters, each node in the RAC environment) and available system resources. As the running job completes, more jobs will appear and execute until all (child) partitions have collected their statistics.
If you use DBMS_STATS.GATHER_DATABASE_STATS, DBMS_STATS.GATHER_SCHEMA_STATS, or DBMS_STATS.GATHER_DICTIONARY_STATS to collect statistics, Oracle creates a separate statistics collection job for each non-partitioned table and for each (child) partition of the partitioned table. Each partition table will also have a coordination job to manage its (sub) partition job. The database then runs as many concurrent jobs as possible and queues the rest until the job execution is complete. However, to prevent possible deadlock situations, multiple partition tables cannot be processed at the same time. Therefore, if some jobs have been run for partitioned tables, other partitioned tables in Schema (or database or dictionary) will be queued until the current job completes. There is no such restriction for non-partitioned tables.
Figure 12 shows DBMS_STATS.GATHER_SCHEMA_STATS creating jobs at different levels. On Schema:sh, Oracle will create a statistics collection job for each non-partitioned table (level 1 in figure 12)
? CHANNELS
? COUNTRIES
? TIMES
Oracle creates one coordination job for each partition table: SALES and COSTS, and then creates a statistics collection job for each partition in the SALES and COSTS tables (level 2 in figure 12).
Figure 12: job list when concurrency statistics are collected on sh
If the DEGREE parameter is specified, each individual statistics collection job can also be executed in parallel. If the table, partition table, or child partition table is very small or empty, Oracle can automatically combine objects and other small objects into a separate job to reduce the overhead of job maintenance.
Configure concurrent statistics collection
By default, the concurrency setting for statistics collection is turned off. It can be opened as follows:
Exec dbms_stats.set_global_prefs ('concurrent',' all')
You also need some additional privileges and general permissions to collect statistics. The user must have the following Job Scheduler and AQ permissions:
? CREATE JOB
? MANAGE SCHEDULER
? MANAGE ANY QUEUE
The sysaux tablespace should be online because the job program stores its internal tables and views in the sysaux tablespace. Finally, the JOB_QUEUE_PROCESSES parameter should be set to make full use of all system resources available for the statistical collection process. If you do not plan to use parallel execution, set job _ queue _ process to the total number of 2*CPU cores (per node in a RAC environment). Be sure to set this parameter (alter system...) at the system level Or in the init.ora file) rather than at the session level (alter session).
If you want parallel execution as part of concurrency statistics collection, you should disable parallel adaptive multi-user:
ALTER SYSTEM SET parallel_adaptive_multi_user=false
The resource manager must also be activated, for example:
ALTER SYSTEM SET resource_manager_plan = 'DEFAULT_PLAN'
It is also recommended that parallel statement queues be enabled. This requires activating the resource manager and creating a temporary resource plan in which the consumer group "others_groups" is enabled.
By default, Explorer is activated only during the maintenance window. The following script illustrates one way to create a temporary resource plan (pqq _ test) and enables the resource manager to execute the plan.
-- connect as a user with dba privileges
Begin
Dbms_resource_manager.create_pending_area ()
Dbms_resource_manager.create_plan ('pqq_test',' pqq_test')
Dbms_resource_manager.create_plan_directive (
'pqq_test'
'OTHER_GROUPS'
'OTHER_GROUPS directive for pqq'
Parallel_target_percentage = > 90)
Dbms_resource_manager.submit_pending_area ()
End
/
ALTER SYSTEM SET RESOURCE_MANAGER_PLAN = 'pqq_test' SID='*'
If you want the automatic statistics collection task to take advantage of concurrency, set CONCURRENT to AUTOMATIC or ALL. A new AUTOTASK $consumer group has been added to the resource manager plan used in the maintenance window to ensure that the collection of concurrency statistics does not use too many system resources.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.