This article introduces the new features of PostgreSQL 12 GA. Many of these situations come up in real-world operations, so the editor will walk you through how to handle them. I hope you read it carefully and come away with something useful!
SQL Execution Optimization
The first important change: concurrent index rebuilds (REINDEX CONCURRENTLY). When an index is corrupted, bloated, needs to be rebuilt with different options, or an invalid index has to be reconstructed, earlier versions had to take a lock on the table that blocked writes from other sessions for the whole rebuild. This is now avoided by breaking the rebuild into a sequence of smaller transactional steps:
1. Open a transaction and create a temporary index.
2. Start populating the temporary index with the existing data; note that if all indexes on the table are being rebuilt, a corresponding number of temporary indexes are created at the same time.
3. Once the bulk copy is done, apply the new data that was written while the index was being built.
4. When all the data is in place, swap the temporary index in for the original index.
5. Finally, drop the old index to complete the operation.

The second important change: generated columns. When using a database, it is common for one or a few columns of a table to be derived from other columns by an expression. If that computation is performed on the fly every time, it is expensive, especially on large tables. Generated columns exist to solve this: whenever a row is inserted, the value of a generated column is produced automatically from its expression, without explicit input from the user (and in fact the user is not allowed to supply one). The PostgreSQL implementation also supports indexes on generated columns. The feature is not all-powerful yet, however, and has quite a few limitations, including:
1. The generation expression can only use values from the current row.
2. Subqueries are not supported.
3. A generated column cannot reference another generated column.
4. A generated column cannot be used as a partition key.

The third important change: optimizer-level inlining of CTEs. A CTE is the query specified with the WITH clause before the main SQL statement, usually serving as a temporary result set for the main query. In earlier versions, the CTE was always evaluated first into a temporary table and the main query then ran against it. PostgreSQL 12 introduces an optimization called inlining: if the CTE has no side effects and is referenced only once in the main query, the database folds it into the main query as a subquery instead of materializing it first, much like inlining in programming languages such as C. The planner can then optimize through the subquery, which can greatly improve performance. The behavior can also be controlled manually: NOT MATERIALIZED forces inlining even for CTEs that do not meet the conditions, and MATERIALIZED keeps materialization even when the conditions for inlining are met. Short sketches of these three features follow.
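As an illustration of the first change, here is a minimal sketch of rebuilding indexes without blocking writers; the table and index names are hypothetical.

```sql
-- Rebuild a single (bloated or suspect) index without taking a write-blocking lock.
REINDEX INDEX CONCURRENTLY orders_customer_id_idx;

-- Rebuild every index on a table; one temporary index is created per existing index.
REINDEX TABLE CONCURRENTLY orders;
```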
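For the second change, a minimal sketch of a stored generated column with an index on it; the table definition is made up for illustration.

```sql
CREATE TABLE items (
    price    numeric NOT NULL,
    quantity integer NOT NULL,
    -- Computed automatically on INSERT/UPDATE; users may not write to it directly.
    total    numeric GENERATED ALWAYS AS (price * quantity) STORED
);

-- Generated columns can be indexed like ordinary columns.
CREATE INDEX items_total_idx ON items (total);

INSERT INTO items (price, quantity) VALUES (9.99, 3);
SELECT total FROM items;  -- 29.97
```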
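For the third change, a hedged sketch of how MATERIALIZED and NOT MATERIALIZED control inlining; the events table and its columns are hypothetical.

```sql
-- Referenced once and free of side effects, so PostgreSQL 12 inlines this CTE by default.
WITH recent AS (
    SELECT * FROM events WHERE created_at > now() - interval '1 day'
)
SELECT count(*) FROM recent WHERE status = 'error';

-- Force the pre-12 behavior: evaluate the CTE once into a temporary result set.
WITH recent AS MATERIALIZED (
    SELECT * FROM events WHERE created_at > now() - interval '1 day'
)
SELECT count(*) FROM recent WHERE status = 'error';

-- Force inlining even though the CTE is referenced more than once.
WITH recent AS NOT MATERIALIZED (
    SELECT * FROM events WHERE created_at > now() - interval '1 day'
)
SELECT a.id FROM recent a JOIN recent b ON a.id = b.parent_id;
```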
The fourth important change: caching of execution plans. It is not that important at the moment, but it may grow into a very useful capability in the future. As is well known, Oracle caches execution plans, while open-source databases such as MySQL and PostgreSQL have traditionally parsed and planned every SQL statement on the spot. PostgreSQL 12 takes a first step toward plan caching, although its scope is still very limited: only statements run through an explicit PREPARE, or inside PL/pgSQL functions and stored procedures, can use a cached execution plan; ordinary SQL statements cannot, so this is still far from Oracle-style plan caching where any statement's plan may be cached. In the foreseeable future, though, such an optimization is bound to appear, and it is something later iterations of PG will need to work on.

The fifth important change is really a configuration change, but since it mainly affects SQL execution it is mentioned briefly here: JIT is enabled by default in PostgreSQL 12. Briefly, JIT compiles the simple computations in a SQL statement directly into machine code and executes that, which is much more efficient than the ordinary execution path that has to call from SQL into C. Apart from needing a little more CPU during the parsing and planning phase, there are no real downsides, and the benefit of turning this feature on can be large. Small sketches of both follow.
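As a hedged illustration of plan reuse, here is a sketch using an explicitly prepared statement; the users table and its columns are hypothetical. PostgreSQL 12 also adds the plan_cache_mode setting, which influences whether a cached generic plan or a freshly built custom plan is chosen.

```sql
-- A prepared statement can reuse a cached generic plan across executions.
PREPARE get_user (int) AS
    SELECT * FROM users WHERE id = $1;

EXECUTE get_user(42);
EXECUTE get_user(43);

-- New in PostgreSQL 12: steer the planner toward the cached generic plan
-- (other values: force_custom_plan, and auto, which is the default).
SET plan_cache_mode = force_generic_plan;
EXECUTE get_user(44);
```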
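And a small sketch of inspecting and toggling JIT; big_table is hypothetical, and the JIT section only appears in EXPLAIN output when the query is expensive enough to trigger compilation.

```sql
-- JIT is on by default in PostgreSQL 12 (the server must be built with LLVM support).
SHOW jit;

-- JIT only kicks in for sufficiently expensive queries; these cost thresholds control when.
SHOW jit_above_cost;
SHOW jit_optimize_above_cost;

-- When JIT was used, EXPLAIN ANALYZE reports a "JIT" section with compile timings.
EXPLAIN (ANALYZE) SELECT sum(value * 2) FROM big_table;

-- It can still be disabled per session if needed.
SET jit = off;
```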
Configuration Optimization

Besides SQL optimization, which is what developers care about most, PostgreSQL 12 also brings many changes for operations and maintenance. The first is the addition of two new administrative views and a new function (a sample query follows the list):
- pg_stat_progress_create_index, to watch the progress of an index currently being created:
  - number of blocks processed
  - number of tuples processed
  - locks acquired / waited for
- pg_stat_progress_cluster, to watch the progress of a running VACUUM FULL / CLUSTER:
  - number of blocks read and written
  - number of tuples read and written
- pg_ls_archive_statusdir(), which lists the contents of the WAL archive status directory

These additions give the DBA a much more detailed picture of the heavyweight operations happening in the database, and therefore a better basis for decisions.
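As a quick illustration, a hedged sketch of watching a long index build from another session; the column names below come from the pg_stat_progress_create_index view.

```sql
-- Run from a separate session while CREATE INDEX / REINDEX is in progress.
SELECT pid, phase,
       blocks_done,  blocks_total,
       tuples_done,  tuples_total,
       lockers_done, lockers_total
FROM pg_stat_progress_create_index;
```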
The second is a change that, in my opinion, deserves a long write-up of its own and a fine chapter in PG's history: the recovery.conf file has been "killed", and its settings have been merged into the postgresql.conf configuration file. This file has existed almost since the beginning of PG. Long before PG 9.0 it held the configuration for replaying redo (WAL/XLOG), hence the name recovery.conf. After 9.0, streaming replication appeared and its configuration naturally went into this file too, so in practice the file acted more as the streaming-replication configuration than as a recovery configuration. But unlike recovery, which is an offline operation, streaming replication often needs to be reconfigured online, and recovery.conf did not support reload, which became a nuisance that had to be solved. Starting with PostgreSQL 12, with the file's settings merged into postgresql.conf, PG adds a hard restriction to avoid conflicting configuration: if a recovery.conf file is found in the data directory, the server refuses to start. The merge is not just a mechanical move; it also involves renaming several related parameters and changing their defaults (a configuration sketch follows the parameter list):
- The following parameters now allow reload:
  - archive_cleanup_command
  - promote_trigger_file
  - recovery_end_command
  - recovery_min_apply_delay
- Names and default values changed:
  - trigger_file has been renamed to promote_trigger_file
  - the standby_mode configuration option has been removed
  - specifying multiple recovery targets is no longer allowed
  - recovery now follows the latest timeline by default (previously the current timeline)
  - the WAL receiver now uses the cluster name as its default application name
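To make the merge concrete, here is a hedged sketch of what a standby's configuration might look like in PostgreSQL 12; host names, paths, and values are made up. The settings that used to live in recovery.conf now go into postgresql.conf (or postgresql.auto.conf via ALTER SYSTEM), and standby mode itself is signalled by an empty standby.signal file in the data directory instead of standby_mode = on.

```
# postgresql.conf -- settings formerly kept in recovery.conf (illustrative values)
primary_conninfo         = 'host=primary.example.com port=5432 user=replicator'
restore_command          = 'cp /archive/%f %p'
recovery_target_timeline = 'latest'        # new default in PG 12
promote_trigger_file     = '/tmp/promote'  # renamed from trigger_file
recovery_min_apply_delay = '5min'          # now reloadable

# Instead of standby_mode = on, create an empty file:  $PGDATA/standby.signal
```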
It is reasonable to believe that in a future version, after a primary/standby switchover, a standby will be able to follow the new primary without a restart. The third area is PG's everyday headache, vacuum, which gains several new options (a usage sketch follows the list):
- VACUUM can be told not to truncate trailing empty pages (TRUNCATE false)
- VACUUM can be told to skip cleaning up indexes (INDEX_CLEANUP false)
- VACUUM can be told to skip any relation whose lock cannot be acquired immediately (SKIP_LOCKED)
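A minimal sketch of these options in use; the table name is hypothetical, and the per-table vacuum_truncate storage parameter shown at the end is the reloption counterpart of the TRUNCATE option.

```sql
-- Reclaim space, but leave index cleanup and trailing-page truncation alone,
-- and simply skip the table if its lock cannot be taken right away.
VACUUM (INDEX_CLEANUP false, TRUNCATE false, SKIP_LOCKED) big_events;

-- The truncation behaviour can also be fixed per table.
ALTER TABLE big_events SET (vacuum_truncate = off);
```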
Of course, these settings can greatly reduce the impact of vacuum on the database, but for PG's future, the better way to solve the problem is of course a new storage engine.

Pluggable Storage Engines

In practice, MySQL's early MyISAM engine was poorly implemented: it did not support transactions and used table-level read/write locks. But because MySQL exposes an independent storage-engine interface, third parties such as InnoDB were able to implement a fully transactional storage engine, and InnoDB has by now completely taken over MyISAM's position. PG implements its transactional storage engine in the core itself, and although an independent storage-engine interface had been planned for many years, it is only in this release cycle that the work of actually splitting it out has been taken up in earnest. PostgreSQL now handles the interfaces for data storage and index storage separately, so a third party can implement the corresponding interface and data structures and have PG use a new storage engine.
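In PostgreSQL 12 this surfaces as table access methods, selectable per table. A hedged sketch follows; zheap and zedstore are not shipped with PostgreSQL, so the commented-out USING zheap line is purely illustrative of how a third-party engine would be chosen.

```sql
-- List the table access methods this server knows about; a stock PostgreSQL 12
-- installation ships only the default "heap" method.
SELECT amname, amtype FROM pg_am WHERE amtype = 't';

-- Pick the storage engine per table with the new USING clause.
CREATE TABLE measurements (ts timestamptz, value float8) USING heap;

-- If a third-party engine such as zheap were installed, it would be chosen the same way:
-- CREATE TABLE measurements (ts timestamptz, value float8) USING zheap;

-- A server-wide default can also be set.
SET default_table_access_method = 'heap';
```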
The community already has two very important storage engines in the works: EDB's zheap (under development) and the zedstore engine (under development) contributed by the Greenplum team. Both are still a long way from production use, but each solves a long-standing problem of PG itself and has a bright future.

First, zheap. In PostgreSQL's native storage implementation, cleaning up dead tuples, vacuum, is a problem that everyone running PG in production has to face. By introducing undo logs, zheap tries to solve both the vacuum problem and the vacuum freeze problem (transaction ID wraparound) caused by 32-bit transaction IDs. zheap does not touch the indexes, execution plans, and so on of heap (PG's native storage engine); it changes only the data storage part, splitting the undo information out of the data files into undo logs. In the current implementation the undo log is written forward, much like the WAL; a separate purge process reclaims it starting from the oldest undo records; and each data change keeps a pointer into the undo log so that "old" versions of the data can still be found when a query needs them. In this way both the bloat of data files and the cost of vacuum's full-table scans can be avoided.

Zedstore represents a different direction: OLAP. Greenplum is an MPP data warehouse, and in a data warehouse scanning a few specific columns of a table is far more common than scanning all columns at once, so zedstore is designed as a column store. In zedstore every row has a virtual primary key called the tid, and each column (or group of columns) of the table is stored in a B-tree keyed by tid. By letting a tid map to multiple columns, this effectively also provides hybrid row/column storage. Another important part of zedstore is compression: when data is stored, only the data itself is compressed, not the block metadata. That sacrifices a certain amount of compression ratio (given the size of a data block, probably not much), but the benefit is obvious: blocks are kept in the shared buffer pool in compressed form and users decompress only the data they need. MySQL's InnoDB compression, by contrast, compresses whole data blocks, so the buffer pool has to hold both the compressed and the uncompressed version of a block, wasting precious memory.

This is the end of "what are the new features of PostgreSQL 12 GA". Thank you for reading. If you want to learn more about the industry, you can follow the site, where the editor will keep publishing high-quality, practical articles for you!