Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

Comparison of data Compression performance of mysql

2025-04-06 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Share

Shulou(Shulou.com)05/31 Report--

The main content of this article is "data compression performance comparison of mysql". Interested friends may wish to take a look. The method introduced in this paper is simple, fast and practical. Next let the editor to take you to learn "mysql data compression performance comparison" bar!

1. Test environment:

Hardware and software

A 64-bit 2.6.18-92 kernel development machine, 4G memory, 4 2800Mhz Dual-Core AMD Opteron (tm) Processor 2220 CPU.

MySQL is placed on a 7200 rpm SAT hard drive without raid.

MySQL turned off query cache without any optimization in order to prevent query cache from interfering with the test results.

Table structure

2424753 records, the actual data of a fragment in the production environment

The joint indexes of (partition_by1,idx_rank) and (partition_by1,chg_idx) are established respectively, where partition_by1 is a 32-length varchar type for retrieval, and the other two fields are floating-point numbers and are mostly used for sorting.

As a child increment, autokid acts as a PRIMARY KEY and is only used as a guarantee of atomicity when data is loaded, which has no practical significance.

two。 Purpose of the test:

Compressed space contrast

The higher the compression ratio, the smaller the disk space, which directly reduces the storage cost of the data.

Query performance comparison

Query performance should not degrade significantly after compression. Archive does not support indexing, so performance degradation is inevitable, so we should also have a spectrum in mind, how much the reduction is, and whether it is acceptable or not.

3. Test tools:

Slap

Official tools are, of course, the best choice. For an introduction to mysqlslap, please refer to the official documentation.

Test query

A total of 9973 SQL entries were intercepted from the production environment to access the topranks_ v3 table, from which 7 items with a large number of visits were extracted, 50 were sent concurrently, and repeated 10 times. The command is as follows:

. / mysqlslap-- defaults-file=../etc/my.cnf-ubiquitous debug-info4 *-packs-debug-info4 *-c50-i10-Q.. / debug-info4. Test conclusion

Comparison item disk space time (seconds) CPU Idle LOAD concurrent

Matrix (MyISAM) 403956004 2.308 30 15 50

ARCHIVE 75630745 > 300 75 4 1

PACK 99302109 2.596 30 22 50

Based on the test data given in the table above, we simply draw the following conclusions:

For the test table, the space occupied by Archive table is about 18.7% of that before, and the space occupied by Myisampack is about 24.6%. There is little difference between the two. From the point of view of space utilization, it seems that we need to choose archive table.

Let's look at the query performance and compare it with the matrix. In terms of total time consumption and system load, the query performance of the pack table under 50 concurrency is comparable to that of the matrix table, while the archive table takes more than 5 minutes in the case of single concurrency (I can't wait any longer, kill)!

So, we seem to conclude that the ARCHIVE engine basically doesn't have to think about tables that need to be queried online.

Why is the ARCHIVE engine so slow during this test?

We know that mysql provides archive as a storage engine to reduce disk overhead, but there is also a premise that archived data does not need or is rarely queried online, and it is okay to query slowly occasionally. For the above reasons, the archive table is not allowed to build indexes other than incremented columns.

With this consensus, let's take a test SQL to analyze why there is such a big difference in query performance before and after indexing. In our test SQL, there is this one:

SELECT c1,c2,...,cn FROM mysqlslap.rpt_topranks_v3

WHERE... AND partition_by1 = '50008090'

ORDER BY added_quantity3 DESC

LIMIT 500We said earlier that the table we tested has an index on the partition_by1 field, so we preliminarily judged that the partition_by1 index should be used in this query on the matrix and myisampack tables; EXPLAIN:

Mysql > EXPLAIN

-> SELECT FROM mysqlslap.rpt_topranks_v3

-> WHERE AND partition_by1 = '50008090'

-> ORDER BY added_quantity3 DESC

-> LIMIT 500\ G

* * 1. Row *

Id: 1

Select_type: SIMPLE

TABLE: rpt_topranks_v3

Type: ref

Possible_keys: idx_toprank_pid,idx_toprank_chg

KEY: idx_toprank_pid

Key_len: 99

Ref: const

Rows: 2477

Extra: USING WHERE; USING filesort

1 row IN SET (0. 00 sec) as expected, this query uses the index built on the partition_by1 field, matching the target number of rows to 2477, followed by a sort on the added_quantity3 field. Because added_quantity3 does not have an index, filesort is used.

Let's take a look at the EXPLAIN result of this SQL on the archive table:

Mysql > EXPLAIN

-> SELECT FROM mysqlslap.rpt_topranks_v3_archive

-> WHERE AND partition_by1 = '50008090'

-> ORDER BY added_quantity3 DESC

-> LIMIT 500\ G

* * 1. Row *

Id: 1

Select_type: SIMPLE

TABLE: rpt_topranks_v3_archive

Type: ALL

Possible_keys: NULL

KEY: NULL

Key_len: NULL

Ref: NULL

Rows: 2424753

Extra: USING WHERE; USING filesort

1 row IN SET (0. 00 sec) EXPLAIN said, "I don't have an index available, so I can only scan the whole table for 2424753 rows and then do a filesort." If you want to pursue performance, it is obviously a grievance to MySQL.

At this point, I believe that everyone on the "mysql data compression performance comparison" have a deeper understanding, might as well to the actual operation of it! Here is the website, more related content can enter the relevant channels to inquire, follow us, continue to learn!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Database

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report