pt-archiver
Brief introduction
Archives rows from a MySQL table into another table or a file.
Summary
1. Usage:
pt-archiver [OPTIONS] --source DSN --where WHERE
2. Detailed introduction
To minimize the impact on query performance in an OLTP environment, the tool is designed to have a low performance footprint: it extracts and archives old data in small batches. It relies on a plugin architecture for extensibility, so you can add site-specific features, such as more complex archiving rules, or build a data warehouse while extracting the data.
Tip: the tool finds the first row and then walks forward along an index for more rows, using the conditions in the --where clause, instead of scanning the whole table; it should be able to rely on an index to locate the rows that satisfy the condition. You can specify the i option (index) in the --source DSN, which is critical for operations that need to fetch rows via a specific index.
Use --dry-run to make the tool only print the SQL statements it would run, without actually touching the data; you can then run EXPLAIN on the generated SQL to check its index usage, and further inspect the handler operation counters during execution (handler operation types such as read, read_rnd_next, delete, etc., the underlying atomic operation counters) to make sure the statement does not perform a full table scan.
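For example, a hypothetical command (the host, database, table and index names are placeholders) that prints the SQL with --dry-run and forces a specific index through the i option of the --source DSN:

  pt-archiver --source h=127.0.0.1,D=testdb,t=orders,i=idx_created_at \
    --dest h=127.0.0.1,D=archive,t=orders_history \
    --where "created_at < '2023-01-01'" --limit 1000 --dry-run

You can then run EXPLAIN on each SELECT it prints to confirm that it uses idx_created_at.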
Note:
At least one of --dest, --file, or --purge must be specified.
--ignore and --replace are mutually exclusive.
--txn-size and --commit-each are mutually exclusive.
--low-priority-insert and --delayed-insert are mutually exclusive.
--share-lock and --for-update are mutually exclusive.
--analyze and --optimize are mutually exclusive.
--no-ascend and --no-delete are mutually exclusive.
In copy mode, the --dest DSN inherits unspecified values from the --source DSN by default.
--analyze
After the data has been archived, run ANALYZE TABLE to update the table's index statistics.
--analyze=d updates index statistics on the destination table
--analyze=s updates index statistics on the source table
--analyze=ds updates index statistics on both the source and destination tables
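As a sketch (connection details and table names are made up), refreshing index statistics on both the source and the destination after archiving:

  pt-archiver --source h=src,D=testdb,t=orders --dest h=dst,D=archive,t=orders \
    --where "created_at < '2023-01-01'" --analyze=ds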
--no-ascend
Do not use the ascending-index optimization.
By default, pt-archiver uses the ascending-index feature to optimize the repeated SELECTs issued batch after batch: the next batch of archiving starts from the index position where the previous batch ended, instead of scanning from the beginning of the table again. With multi-column composite indexes the benefit may be reduced.
--ascend-first
Ascend only the first column of the composite index. Related to the option above: when the composite index or the WHERE condition is complex, ascending only the first column of the index can still perform better than disabling the ascending-index optimization entirely.
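For illustration, assuming a hypothetical composite index idx_status_created on (status, created_at), you could ascend only its first column while purging:

  pt-archiver --source h=src,D=testdb,t=orders,i=idx_status_created --purge \
    --where "status='done' AND created_at < '2023-01-01'" --ascend-first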
--ask-pass
Prompt for the password interactively instead of passing it on the command line, so developers and testers cannot read it off the screen.
--buffer
Disable autoflush to the file specified by --file: the result set is kept in the operating system cache and flushed only when the transaction commits (by default, every row is flushed to the file as it is written). Relying on the operating system cache is an unreliable factor, however, and may lose data unexpectedly.
Using --buffer can improve the tool's performance by roughly 5-15%.
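For instance (the path and DSN are placeholders), when archiving to a file, --buffer lets the operating system cache absorb the writes until each commit:

  pt-archiver --source h=src,D=testdb,t=orders --file /data/archive/orders.txt \
    --where "created_at < '2023-01-01'" --limit 1000 --buffer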
--txn-size
Specifies how many rows each transaction processes. The default is 1. Setting it to 0 disables batching and autocommits every row.
When the tool has processed the configured number of rows, it commits the transaction on both the source and the destination and flushes the file specified by --file. This option has a large impact on the source database's performance, so when archiving from a busy online OLTP server you have to balance data extraction throughput against transaction commit overhead. Increasing the value improves extraction performance but enlarges the range of row locks and can even cause deadlocks; decreasing it alleviates this, but a value that is too small increases the number of transactions and can affect online traffic.
The author's own test: on a PC, with the value set to 500, each 1K rows took only 2 seconds, while with the value 0 (autocommit for every row) each 1K rows took nearly 38 seconds. With a non-transactional engine, setting it to 0 is recommended.
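A minimal sketch of the trade-off (all connection details are hypothetical): fetch 500 rows per SELECT and commit every 500 rows instead of every row:

  pt-archiver --source h=src,D=testdb,t=orders --dest h=dst,D=archive,t=orders \
    --where "created_at < '2023-01-01'" --limit 500 --txn-size 500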
--commit-each
Commit once for each batch of fetched rows; this automatically overrides --txn-size. Use --limit to control how many rows each batch operates on.
If you archive a very large table with --limit=1000 and --txn-size=1000 at the same time, the tool opens a transaction and keeps reading until it has collected 1000 rows that satisfy the WHERE condition; since the tool itself does not know in advance how many rows match, it may scan all the way to the last row of the table before the transaction commits. Using --commit-each, which commits once per batch, effectively avoids this dilemma.
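For example (hypothetical tables), fetching and committing 1000 rows per batch so that a sparse WHERE condition cannot hold a transaction open across the whole table:

  pt-archiver --source h=src,D=testdb,t=orders --dest h=dst,D=archive,t=orders \
    --where "created_at < '2023-01-01'" --limit 1000 --commit-each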
--bulk-delete
Delete in bulk: each batch of rows is removed with a single DELETE statement. This option also implies --commit-each.
Normally rows are deleted one at a time by primary key; with a simple WHERE condition, bulk deletion gives a significant speed improvement.
This option defers the delete until all rows of the batch have been collected. Row-level before-delete triggers will therefore not fire; a trigger defined before the bulk delete (before bulk delete) will fire instead.
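A sketch of a pure purge with bulk deletes (names are placeholders); each batch of 1000 rows is removed with a single DELETE statement:

  pt-archiver --source h=src,D=testdb,t=orders --purge \
    --where "created_at < '2023-01-01'" --limit 1000 --bulk-delete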
--[no]bulk-delete-limit
The bulk-delete limit is enabled by default.
By default, --bulk-delete appends a LIMIT clause (taken from --limit) to the DELETE statement; in some cases you can use --no-bulk-delete-limit to omit it, but you still need to specify --limit manually. This option is not recommended.
--bulk-insert
Insert with LOAD DATA INFILE (used together with --bulk-delete and --commit-each).
Compared with row-by-row inserts, the tool creates a temporary file for each batch of data, writes the rows to that file first, and performs the import once the whole batch has been collected.
To prevent rows from being deleted one by one while the data has not actually been archived into the new table (because it still only exists in the temporary file), which would create a data gap or even lose data, --bulk-insert must always be paired with --bulk-delete, so that rows are not removed from the old table until they have been inserted into the new one.
This option can be used together with --low-priority-insert and --replace.
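Putting it together (source and destination are hypothetical): each batch of 5000 rows is written to a temporary file, loaded into the archive table with LOAD DATA INFILE, and only then bulk-deleted from the source:

  pt-archiver --source h=src,D=testdb,t=orders --dest h=dst,D=archive,t=orders \
    --where "created_at < '2023-01-01'" --limit 5000 \
    --bulk-insert --bulk-delete --commit-each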