What is the optimization method of MySQL billion-level data database?

This article explains a method for optimizing a MySQL database holding 100 million rows of data. The approach is simple, fast, and practical, so let's walk through it.

Think about MySQL's performance and how to handle 100 million rows of data: how should the data be split across databases and tables (sharding), and in which scenarios is that split actually appropriate?

Take the query of bank transaction records as an example.

Without further ado, here is the actual experimental process: the operations carried out along the way and the pits stepped in, which I think are useful for readers.

First, create a cash flow table. Transaction history is the most heavily used and largest data set kept in any financial system, and the cash flow table can be searched by time range, by individual, or by amount.

-- Create the cash flow table
DROP TABLE IF EXISTS `yun_cashflow`;

CREATE TABLE `yun_cashflow` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `userid` int(11) DEFAULT NULL,
  `type` int(11) DEFAULT NULL COMMENT '1 deposit, 2 withdrawal',
  `operatoruserid` int(11) DEFAULT NULL COMMENT 'operator ID',
  `withdrawdepositid` bigint(20) DEFAULT NULL COMMENT 'withdrawal ID',
  `money` double DEFAULT NULL COMMENT 'amount',
  `runid` bigint(20) DEFAULT NULL COMMENT 'work order ID',
  `createtime` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=63 DEFAULT CHARSET=utf8;

Then start loading 100 million rows into it.

-- Loop insertion
DROP PROCEDURE IF EXISTS test_insert;
DELIMITER //
CREATE PROCEDURE test_insert()
BEGIN
    DECLARE num INT DEFAULT 0;
    WHILE num < 10000 DO
        INSERT INTO yun_cashflow (userid, type, operatoruserid, withdrawdepositid, money)
        VALUES (FLOOR(7 + (RAND() * 6)) + FLOOR(22 + (RAND() * 9)),
                1,
                FLOOR(97 + (RAND() * 6)) + FLOOR(2 + (RAND() * 9)),
                FLOOR(17 + (RAND() * 6)) + FLOOR(2 + (RAND() * 9)),
                FLOOR(5 + (RAND() * 6)) + FLOOR(2 + (RAND() * 9)));
        SET num = num + 1;
    END WHILE;
END //
DELIMITER ;

CALL test_insert();

Pit one:

After this stored procedure was created, I found that inserting data was extremely slow: less than 1 million rows went in over a whole day and night, averaging 40 to 60 rows per second. I stopped it several times, suspecting the random functions, and replaced them with constants, but the effect was the same and it stayed slow. At the time this made me pessimistic about MySQL; I was used to Oracle, where inserts are genuinely fast. But the effort paid off: it turns out you can generate data much faster with a different kind of statement. Here is the code.

INSERT INTO example
    (example_id, name, value, other_value)
VALUES
    (100, 'Name 1', 'Value 1', 'Other 1'),
    (101, 'Name 2', 'Value 2', 'Other 2'),
    (102, 'Name 3', 'Value 3', 'Other 3'),
    (103, 'Name 4', 'Value 4', 'Other 4');

The idea is to create a large batch of rows in this format inside a loop: many tuples after VALUES, separated by commas, all written in a single statement. I generated 10,000 rows in Excel, pasted them out in this statement format, and so each loop inserted 10,000 rows; at that rate, creating 100 million rows did not take long.
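
For concreteness, here is a minimal sketch (my own, not the author's exact script) of that pattern applied to the yun_cashflow table; the tuple values are placeholders.

-- The multi-row VALUES pattern against yun_cashflow; the tuples below are
-- placeholders, in practice thousands are generated per statement.
INSERT INTO yun_cashflow (userid, type, operatoruserid, withdrawdepositid, money)
VALUES
    (28, 1, 101, 20, 8),
    (30, 2, 99, 23, 11),
    -- ... thousands more generated tuples ...
    (27, 1, 103, 19, 7);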

SELECT COUNT(*) FROM yun_cashflow;

I was also curious how much space 100 million rows across eight fields would occupy, so I found the data file path with the following statement.

SHOW GLOBAL VARIABLES LIKE 'datadir%';

Looking at the file, it is 7.78 GB. It seems that when there are not many fields, even a very large number of rows is not really a problem. In fact, for an architect, this is the simplest, most direct, and roughest way to estimate disk headroom when sizing machines.
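
Alternatively (a sketch on my part, not something the text does), the same estimate can be read from information_schema without touching the file system; the schema name test_db below is an assumption, replace it with your own database.

-- Size estimate via information_schema; 'test_db' is a placeholder schema name.
SELECT table_name,
       ROUND((data_length + index_length) / 1024 / 1024 / 1024, 2) AS size_gb,
       table_rows                       -- approximate row count for InnoDB
FROM information_schema.TABLES
WHERE table_schema = 'test_db'
  AND table_name = 'yun_cashflow';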

All right, the table is finished, and all kinds of experiments begin.

First, let's see what happens with a query that has no conditions at all.

Ha ha, Out of memory. It seems this query really did try to pull everything into memory: the whole 7.8 GB of data would have to fit in RAM, and my machine does not have that much.
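
Presumably the culprit was an unbounded SELECT; as a small sketch (mine, not the article's), capping the result set keeps the client from loading the entire table.

-- Unbounded vs. bounded result set; the LIMIT value is arbitrary.
SELECT * FROM yun_cashflow;             -- tries to pull every row: Out of memory
SELECT * FROM yun_cashflow LIMIT 100;   -- bounded, returns immediately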

Fund flows are generally queried by time range, so let's see how fast that is.

SELECT * FROM yun_cashflow WHERE createtime BETWEEN '2018-10-23 09:...' AND '2018-10-23 09:...';

Just imagine: when you check your historical transaction details in Alipay, 56 records come back in 103.489 seconds, a query of nearly two minutes. What kind of experience would that be? And no other conditions were added here, so let's try what a query for a single user with no time limit looks like.

SELECT COUNT(*) FROM yun_cashflow WHERE userid = 21;

That also takes nearly a minute and a half, so let's try a condition on the amount.

SELECT COUNT(*) FROM yun_cashflow WHERE money ...
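
Up to this point every one of these filters runs as a full table scan, since the table carries nothing but its primary key. As a hedged sketch (an assumption on my part, not a method the text states), the conventional first step before reaching for sub-database and sub-table splitting is to confirm the scans with EXPLAIN and add secondary indexes on the filtered columns.

-- A hedged sketch, not the text's own solution: verify the full scan, then
-- index the filtered columns; index names are illustrative, and on 100
-- million rows each ALTER TABLE will itself take a long time to build.
EXPLAIN SELECT COUNT(*) FROM yun_cashflow WHERE userid = 21;   -- type=ALL before indexing

ALTER TABLE yun_cashflow ADD INDEX idx_userid (userid);
ALTER TABLE yun_cashflow ADD INDEX idx_createtime (createtime);

EXPLAIN SELECT COUNT(*) FROM yun_cashflow WHERE userid = 21;   -- should now use idx_userid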
