In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-02-24 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >
Share
Shulou(Shulou.com)05/31 Report--
This article shows you what the mysql load-related experimental process is like, the content is concise and easy to understand, it will definitely brighten your eyes. I hope you can get something through the detailed introduction of this article.
One: the process of load is equivalent to: first start transaction, then insert data, and finally commit
I guess mysql differs from oracle sqlldr in that there are no parameters of the latter's rows to control the rows of data submitted each time.
But I feel that mysql is reading in batches by estimating a value. I don't think he is an insert.
Second: load if the data exists (primary key or unique key), the default is skipped, you can choose the existence of replace to replace!
Third: load does not have a rows parameter similar to oracle's sqlldr to control the number of rows submitted each time, so it can only be done through the linux command first.
Split into small files to achieve parallelism
Experiment 1: will load lock the table?
Session1
[root@beijing-fuli-hadoop-04 ~] # cat / data/t.txt
100, liu, 18
102, liu, 18
101, liu, 18
Root@localhost: (none) 11:50:05 > start transaction
Query OK, 0 rows affected (0.00 sec)
Root@localhost: (none) 11:51:08 > LOAD DATA LOCAL INFILE'/ data/t.txt' INTO TABLE liuwenhe.t fields terminated by', 'LINES TERMINATED BY'\ n'
Query OK, 3 rows affected (0.03 sec)
Records: 3 Deleted: 0 Skipped: 0 Warnings: 0
And then no, commit!
Session2
Wait for all as follows
Root@localhost: liuwenhe 11:52:36 > delete from t where id=101
Root@localhost: liuwenhe 11:52:36 > delete from t where id=102
Root@localhost: liuwenhe 11:52:36 > delete from t where id=103
Do not wait as follows
Delete from t where id=104
Delete from t where id=100
Conclusion:
Load will lock all load data before submitting it! It also indirectly means that this is a transaction that puts three data
Now that load is in, is it possible that mysql defaults to N rows of data as a transaction? Use a large amount of data for verification
Experiment 2: is load a transaction?
1. File / data/12.txt is a file with 26135101 lines of data
two。 And then start load.
Root@localhost: liuwenhe 13:54:50 > LOAD DATA LOCAL INFILE'/ data/12.txt' INTO TABLE liuwenhe.t fields terminated by', 'LINES TERMINATED BY'\ n'
3. Open another session, query the data, and find that it is empty until load is completed.
Root@localhost: liuwenhe 13:55:15 > select count (*) from t
+-+
| | count (*) |
+-+
| | 0 |
+-+
1 row in set (0.66 sec)
This further shows that the load operation is a transaction!
Lab 3: is load allowed on the same table at the same time? As long as there is no conflict, it can be parallel!
The conflict here refers to the conflict between the data that has been processed by load and the data to be processed by another session. The specific experiments are as follows:
If the 1.txt file is id data ranging from 1 to 2147483647, and 2.txt is id=2147483647
A piece of data, and 3.txt is the range of id from 1 to 3, and there is also the data of id=2147483646.
The details are as follows:
[root@beijing-fuli-hadoop-04 liuwenhe] # cat 2.txt
26293013,liu, 18
[root@beijing-fuli-hadoop-04 liuwenhe] # cat 3.txt
1, liu, 18
26293013,liu, 18
The specific experimental process:
Lab 1)
Session 1:
Execute this, because the amount of data is large, so it will be executed for a while.
Root@localhost: liuwenhe 13:54:50 > LOAD DATA LOCAL INFILE'/ data/liuwenhe/1.txt' INTO TABLE liuwenhe.t fields terminated by', 'LINES TERMINATED BY'\ n'
Session 2:
[root@beijing-fuli-hadoop-04 liuwenhe] # cat 2.txt
26293013,liu, 18
Then session 1 is not over yet, do the following, and find that there is no wait! I did go in.
Root@localhost: liuwenhe 13:54:50 > LOAD DATA LOCAL INFILE'/ data/liuwenhe/2.txt' INTO TABLE liuwenhe.t fields terminated by', 'LINES TERMINATED BY'\ n'
Root@localhost: liuwenhe 17:33:18 > select * from t where id = 26293013
+-+
| | id | name | num | |
+-+
| | 26293013 | liu | 18 |
+-+
1 row in set (0.12 sec)
Description: load executes sequentially. When the data of the executed id=1 reaches the innodb layer, mysql will lock the data of the id=1 on the gap lock.
At this time, your load=1 data will have a lock waiting, but if you do not execute the id=26293013 data, you will not lock this data, so you will not wait if you execute another load (id=26293013) data in parallel.
Lab 2)
Session 1:
Execute this, because the amount of data is large, so it will be executed for a while.
Root@localhost: liuwenhe 13:54:50 > LOAD DATA LOCAL INFILE'/ data/liuwenhe/1.txt' INTO TABLE liuwenhe.t fields terminated by', 'LINES TERMINATED BY'\ n'
Session 2:
Before session 1 ends, perform the following discovery wait, because the data of id=1 is locked by session 1, so the following operation needs to wait, because load 3.txt processes the data of id=1 first, but it is locked.
[root@beijing-fuli-hadoop-04 liuwenhe] # cat 3.txt
1, liu, 18
26293013,liu, 18
Root@localhost: liuwenhe 13:54:50 > LOAD DATA LOCAL INFILE'/ data/3.txt' INTO TABLE liuwenhe.t fields terminated by', 'LINES TERMINATED BY'\ n'
Experiment 3) load produces deadlocks:
Session 1:
Execute this, because the amount of data is large, so it will be executed for a while.
Root@localhost: liuwenhe 13:54:50 > LOAD DATA LOCAL INFILE'/ data/liuwenhe/1.txt' INTO TABLE liuwenhe.t fields terminated by', 'LINES TERMINATED BY'\ n'
Session 2:
Before session 1 ends, perform the following discovery wait, because the data of id=1 is locked by session 1, but the data of id=26293013 is not locked, so when you say load 4.txt, you can load the first data (id=26293013) into the innodb engine layer and lock it, but 1 this data is locked, and then session 1 and session 2 produce lock waiting!
[root@beijing-fuli-hadoop-04 liuwenhe] # cat 4.txt
26293013,liu, 18
1, liu, 18
Root@localhost: (none) 18:13:10 > LOAD DATA LOCAL INFILE'/ data/liuwenhe/4.txt' INTO TABLE liuwenhe.t fields terminated by', 'LINES TERMINATED BY'\ n'
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
Note: why did you choose to roll back the transaction for session 2? Because I turned on deadlock detection, and then the database chose to insert updates or delete the least number of rows of transaction rollback
How does MySQL handle deadlocks?
MySQL has two ways to handle deadlocks:
Wait until the innodb_lock_wait_timeout=50s.
Initiate deadlock detection, actively roll back one transaction, and let other transactions continue to execute (innodb_deadlock_detect=on).
Due to performance reasons, deadlock detection is generally used to deal with deadlocks.
Deadlock detection
The principle of deadlock detection is to construct a directed graph with transaction as vertex and lock as edge to judge whether there is a loop in the directed graph.
Roll back
After a deadlock is detected, select the transaction rollback with the least number of rows inserted, updated or deleted, based on the trx_weight field in the INFORMATION_SCHEMA.INNODB_TRX table.
The above content is what the mysql load-related experimental process is like. Have you learned any knowledge or skills? If you want to learn more skills or enrich your knowledge reserve, you are welcome to follow the industry information channel.
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.