How to Generate Massive Test Data for SQL Tuning

2025-04-06 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)05/31 Report--

This article shows how to generate a large amount of test data for SQL tuning. Most people are not very familiar with this topic, so it is shared here for reference; hopefully you will learn something useful from it.

The scenario: a slow SQL statement shows up and the DBA needs to add an index to optimize it. How do you know the added index actually works? That takes repeated testing and adjustment, and you cannot test directly against the production database. The usual approach is to create a test table in a test environment, copy some data from the production library into it, then add the index and run EXPLAIN.

But if the exported data set is small, the execution plan does not show the real effect; and if it is large, the export flushes the production machine's buffer pool and hurts its IO. It would be much nicer to have a tool that generates the data directly: the same 1 million or 10 million rows as production.
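Before reaching for a dedicated tool, one common workaround is to generate a flat file of random rows and bulk-load it with MySQL's LOAD DATA INFILE. A minimal Python sketch of the idea (the column layout and value ranges below are illustrative assumptions, not taken from the article):

```python
import csv
import random

def generate_rows(path, n_rows):
    # Write n_rows of comma-separated random test rows,
    # in a shape suitable for MySQL's LOAD DATA INFILE.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for post_id in range(1, n_rows + 1):
            writer.writerow([
                post_id,                                 # sequential primary key
                random.randint(1, 899999999),            # simulated user UID
                random.choice([0, 1, 2]),                # status-like field
                random.randint(1400000000, 1500000000),  # unix-timestamp field
            ])

generate_rows("/tmp/tbl_test.csv", 1000)
# Afterwards, load it in MySQL with something like:
#   LOAD DATA INFILE '/tmp/tbl_test.csv' INTO TABLE test.tbl_test
#     FIELDS TERMINATED BY ',';
```

Scaling n_rows to 1000000 works the same way; the write is sequential and cheap compared to the load itself.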

The sysbench stress tests covered earlier include a data-generation step; generating 1 million rows looks like this:

sysbench --test=oltp --mysql-table-engine=myisam --oltp-table-size=1000000 \
  --mysql-socket=/tmp/mysql.sock --mysql-user=test --mysql-host=localhost \
  --mysql-password=test prepare

But the structure of the generated table is fixed, and so are the SQL statements used for the stress test, so it cannot be used to debug production SQL.

CREATE TABLE `sbtest` (
  `id` int(10) unsigned NOT NULL auto_increment,
  `k` int(10) unsigned NOT NULL default '0',
  `c` char(120) NOT NULL default '',
  `pad` char(60) NOT NULL default '',
  PRIMARY KEY (`id`),
  KEY `k` (`k`)
)

Is there a tool that takes a user-defined table structure and generates tens of millions of rows for it? There is: datagen (download link at the end of the article).

drwxr-xr-x. 2 root mysql     4096 Sep 27  2016 bizsql
drwxr-xr-x. 2 root mysql     4096 May 31 20:51 conf
-rw-r--r--. 1 root mysql 23698092 Sep 27  2016 datagen.jar
-rwxr-xr-x. 1 root mysql      147 Sep 27  2016 datagen.sh
-rw-rw-r--. 1 root mysql    31599 May 31 20:54 envbuilder.log
-rw-r--r--. 1 root mysql     1741 May 31 20:53 example.schema
-rw-r--r--. 1 root mysql     1336 May 31 09:42 example.schema_backup
-rw-r--r--. 1 root mysql     2062 Sep 27  2016 readme

It takes two simple steps. Step 1: write the table structure you want, and how many rows to generate, into the example.schema file. For example, to generate 1 million rows, append the comment /* {RC {1000000}} */ after the table definition:

CREATE TABLE `test`.`tbl_test` (
  `test` BIGINT(20) DEFAULT '0',
  `star` INTEGER(10) DEFAULT '0',
  `view_count` INTEGER(11) DEFAULT '0',
  `bean` INTEGER(11) DEFAULT '0',
  `nearby` INTEGER(11) DEFAULT '0',
  PRIMARY KEY (post_id),
  INDEX (poster_uid)
) COLLATE='utf8mb4_general_ci' ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 /* {RC {1000000}} */

Step 2: fill in the account and password for connecting to the test database; it only takes one line in the config file:

vi conf/datagen.xml

Then run the shell script to create the table in the test library and insert the data:

[root@localhost datagen]# /bin/bash datagen.sh

[2017-05-31 08:53:15] [WARN] [DataGen:184] - Parsing ddl...
[2017-05-31 08:53:15] [WARN] [DataGen:187] - Creating table...
[2017-05-31 08:53:15] [WARN] [MultiThreadPrepareDataComparator:508] - Preparing generators...
[2017-05-31 08:53:15] [WARN] [MultiThreadPrepareDataComparator:510] - Generating dynamic data...
[2017-05-31 08:54:34] [WARN] [MultiThreadPrepareDataComparator:526] - Generate done.

One million rows now appear in the test library:

mysql> select count(*) from test.tbl_test;
+----------+
| count(*) |
+----------+
|  1000000 |
+----------+
1 row in set (0.16 sec)

Now the real production SQL statement can be tested against an index. First, EXPLAIN without the new index:

mysql> explain select post_id from test.tbl_test where post_type <> 1 and check_status = 9 and flag = 1 and post_time < 1496178301 order by post_time asc limit 200;
+----+-------------+----------+-------+---------------+-----------+---------+------+--------+-------------+
| id | select_type | table    | type  | possible_keys | key       | key_len | ref  | rows   | Extra       |
+----+-------------+----------+-------+---------------+-----------+---------+------+--------+-------------+
|  1 | SIMPLE      | tbl_test | range | post_time     | post_time | 9       | NULL | 501491 | Using where |
+----+-------------+----------+-------+---------------+-----------+---------+------+--------+-------------+
1 row in set (0.00 sec)

Add the index:

mysql> alter table test.tbl_test add index idx_f (check_status,flag,post_type,post_time);
Query OK, 0 rows affected (4.45 sec)
Records: 0  Duplicates: 0  Warnings: 0

Run EXPLAIN again: the estimated number of rows examined drops from roughly 500,000 to 2.

mysql> explain select post_id from test.tbl_test where post_type <> 1 and check_status = 9 and flag = 1 and post_time < 1496178301 order by post_time asc limit 200;
+----+-------------+----------+-------+-----------------+-------+---------+------+------+------------------------------------------+
| id | select_type | table    | type  | possible_keys   | key   | key_len | ref  | rows | Extra                                    |
+----+-------------+----------+-------+-----------------+-------+---------+------+------+------------------------------------------+
|  1 | SIMPLE      | tbl_test | range | post_time,idx_f | idx_f | 15      | NULL |    2 | Using where; Using index; Using filesort |
+----+-------------+----------+-------+-----------------+-------+---------+------+------+------------------------------------------+
1 row in set (0.00 sec)

After debugging the index and confirming that the SQL can be optimized this way, add the index in the production environment.

The tool also has some more powerful features.

For example, a field can be limited to a few specified values, such as a status field that only takes 0, 1 and 2, each value appearing with a configurable probability.

Or, to simulate production user UIDs, the random values of a field can be restricted to a range, such as 00000001 to 899999999.
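Both features, weighted status values and range-bounded UIDs, are easy to picture with Python's standard library. This is only a sketch of the idea; datagen itself configures this through its schema file, not through Python (the 70/20/10 weights below are an arbitrary example):

```python
import random

random.seed(42)  # arbitrary seed, just for a reproducible sample

# A status field limited to the values 0, 1 and 2,
# drawn with 70% / 20% / 10% probability respectively.
statuses = random.choices([0, 1, 2], weights=[70, 20, 10], k=10000)

# Simulated user UIDs restricted to a fixed range.
uids = [random.randint(1, 899999999) for _ in range(10000)]
```

Generating skewed value distributions like this matters for index testing: an optimizer's choice of plan can change with the selectivity of each value, so uniformly random data may not reproduce a production plan.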

That covers everything in "How to Generate Massive Test Data for SQL Tuning". Thank you for reading! Hopefully the content shared here is helpful; to learn more, follow the industry information channel.
