MySQL Architecture and Storage engine, Lock, transaction, Design Analysis 07/09 Update SLTechnology News&Howtos

MySQL Architecture and Storage engine, Lock, transaction, Design Analysis

2025-07-09 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Shulou(Shulou.com)06/01 Report--

This article mainly explains "MySQL architecture and storage engine, lock, transaction, design analysis". The explanation in this article is simple and clear, easy to learn and understand. Please follow the editor's train of thought to study and learn "MySQL architecture and storage engine, lock, transaction, design analysis".

Measurement indicator TPS

Transactions Per Second (number of transactions per second), which refers to the number of transactions processed by the server per second, and a performance indicator specific to storage engines that support transactions, such as InnoDB. Success of transaction execution and number of transaction rollbacks per unit time

TPS= (COM_COMMIT + COM_ROLLBACK) / UPTIME

QPS

Queries Per Second (query processing per second), suitable for both InnoDB and MyISAM engines.

QPS= QUESTIONS/UPTIME

Wait time: the wait time between executing a SQL and waiting for a result to be returned.

Sequence

MySqlSlap

Stress testing tools are officially available in later versions of mysql5.1.4.

Create a schema,table,testdata.

Run a load test that can use multiple concurrent client connections.

Clean up the test environment (delete created data, tables, disconnect)

For more information, you can view it by man mysqlslap or by using mysqlslap-help

Instruction meaning-number of concurrency concurrency, multiple can be separated by commas-there can be multiple engines to be tested by engines, separated by delimiters, such as how many times engines=myisam,innodb-iterations has to run these tests-auto-generate-sql tests with system-generated SQL scripts-auto-generate-sql-load-type tests whether read or write or a mixture of the two (read,write,update,mixed)-how many queries does number-of-queries have to run in total. The number of queries run by each customer can be calculated by the total number of queries / concurrency-debug-info additional output CPU and memory related information-number of int fields in which number-int-cols creates test tables-number of chatter fields in number-char-cols creation test tables-database-query 's own SQL scripts for create-schema tests execute tests-only-print can use this option if you just want to print to see what the SQL statement is

Several demo are as follows:

Mysqlslap-umysql-P123-concurrency=100-iterations=1-auto-generate-sql--auto-generate-sql-add-autoincrement-auto-generate-sql-load-type=mixed-engine=myisam-number-of-queries=10-debug-info

Mysqlslap-h292.168.3.18-P4040-concurrency=100-- iterations=1-- create-schema='test'-- query='select * from test;'-- number-of-queries=10-- debug-info-umysql-p123

Mysqlslap-uroot-p123456-- concurrency=100-- iterations=1-- engine=myisam-- create-schema='haodingdan112'-- query='select * From order_boxing_transit where id = 10'--number-of-queries=1-- debug-info

MySQL architecture diagram

Connection layer 1

When the MySQL starts (the mysql server is a process), wait for the client link, each client's link request. Each client connection request server will create a new thread for processing (if the server is a thread pool, the thread pool will allocate an idle thread). Each thread is independent and has its own independent memory space. If this request is a query, it doesn't matter. However, if it is modified, two threads modify the same memory will cause data synchronization problems, and locks need to be introduced.

Connection layer 2

When the client connects to the server, the server also verifies the customer, that is, the user name, IP, password, to determine whether the connection can be made. After the connection, it also determines whether it has permission to execute a special query, such as read / write or specified table.

Engine layer

The important functions of this layer are: parsing SQL statements, optimizing, caching (caching queried data, caching executed SQL) queries.

MySQL's built-in function implementation, cross-storage engine functionality (the so-called cross-storage engine is the function that each engine needs to provide (the engine needs to provide external structure). Such as stored procedures, triggers, views, etc.

If it is a query statement such as select, it will first query whether the cache already has a corresponding result, and return if so. If not, proceed to the next query.

Parsing query, creating an internal data structure and other parsing tree, parsing tree is mainly used for semantic and grammatical gaps such as SQL statements.

Optimization: optimize SQL statements, such as rewriting queries, determining the reading order of tables, and selecting the desired index. Users can query at this stage to see how the server is optimized, and it also involves the storage engine, such as operation overhead information, and whether there is query optimization for a specific index.

Query whether to enable caching

Show variables like'% query_cache_type%'

Set cache size

Set Global query_cache_size = 4000

View the data save directory

Show variables like'% datadir%'

Parsing query

Storage layer

The data is stored on the file system of the bare device to complete the interaction with the storage engine.

MySQL image map

Storage engine

Query mysql to provide access to the storage engine

Show engines

Check your default storage engine for mysql

Show variables like'% storage_engine%'

Key MyISAM

Before MySQL5.5, for the default storage engine, the MyISAM storage engine was created by

Table.MYD: storing data

Talbe.MYI: storing indexes

Talbe.frm: storage table format (also available in InnoDB)

Properties:

Concurrency and lock level-table level locks, which do not support transactions and are not suitable for frequent modifications.

Support full-text retrieval

Support for data compression myisampack-b-f table.MYI

Use the scene:

Non-transactional applications (data warehouse, report, log data)

Read-only application

Spatial class applications (with spatial functions and coordinates)

Key InnoDB

Default storage engine for later versions of MySQL5.5

Storage pair is suitable to have innodb_file_per_table attribute

ON: independent alignment of tablespaces (table.frm,table.ibd)

OFF: system tablespace (ibdataX)

MySQL5.6 used to default to system tablespaces, but after 5.6it is recommended to use independent tablespaces by default.

The system tablespace cannot simply shrink the file size

Independent tablespaces can shrink system files through optimize table

The system table space stores all the data, which will cause IO bottleneck.

Independent tablespaces can refresh data to multiple files at the same time.

Properties:

InnoDB is a transactional storage engine

Fully support the ACID features of transactions

RedoLog and Undo Log

InnoDB supports row-level locks (concurrent Chengdu higher)

Use the scene:

InnoDB is suitable for most OLTP applications (On-Line Transaction Processing online transaction processing (OLTP), also known as transaction-oriented processing).

Storage engine comparison

CSV

Features:

Data storage in csv format

All columns are not too young for null.

Indexing is not supported (not suitable for large tables, not suitable for online processing)

You can edit the csv file manually (enter is suitable for editing, and finally flush tables)

Scene:

General finance and other people use

Archive

Composition: compress table data with zlib, and disk IO is smaller

Features:

Only insert and select statements are supported

Only allow indexing on self-incrementing ID

Use the scene:

Application of log and data collection

Memory

File system storage features, also known as HEAP storage engine, data is stored in memory, power outage is not.

Support HASH index and BTree index

All fields are lonely length varchar (10) = char (10)

Large fields such as Blog and Text are not supported

The Memory storage engine uses table-level indexes

The maximum value is determined by the max_heap_table_size parameter.

Differences between memory tables (Memory) and temporary tables (Memory)

Memory tables, that is, tables placed in memory, the size of memory used can be specified by max_heap_table_size in My.cnf, such as max_heap_table_size=1024M, memory tables are not the same as temporary tables, temporary tables are also stored in memory, and the maximum memory required for temporary tables needs to be set by tmp_table_size = 128m. When the data exceeds the maximum value of the temporary table, it is automatically converted to a disk table. Due to the need for IO operation, the performance will be greatly degraded, but the memory table will not. When the memory table is full, it will prompt that the data is full of errors.

Both temporary tables and memory tables can be created manually, but temporary tables play a more important role in organizing data to improve performance after being created by the system itself. for example, temporary tables cannot be shared among multiple connections, such as subqueries.

Ferderated

It is not enabled by default and needs to be manually configured.

Features:

Provide remote access to the table size on the MySQL server

The data is not stored locally, and the data is on the remote server

Table structure and remote server connection information need to be saved locally

Scene:

Occasionally large statistical analysis and manual query

Lock

Multithreading operations on the same object need to be locked.

Lock computer coordinates multiple processes or threads to access a resource pair mechanism concurrently

In the database, data is also a kind of shared resource provided to many users. How to ensure the consistency and effectiveness of data concurrent access is a problem to solve, and lock conflict is also an important factor affecting the performance of database concurrent access.

Locks are important and complex for the database.

Lock types in mysql

Table-level lock: low overhead, fast locking; no deadlock; large lock granularity, the highest probability of lock conflict and the lowest concurrency. More queries and less modification of applications

Row-level locks: expensive and slow to add locks; deadlocks occur; locking granularity is the smallest, the probability of lock conflicts is the lowest, and the degree of concurrency is the highest. Support concurrent queries and updates.

Page lock: the overhead and locking time are between table lock and row lock; deadlocks occur; lock granularity is between table lock and row lock, and the concurrency is general.

MySQL lock mechanism is simple. Different engines support different lock levels by default. MyISAM supports table-level locks (table-level locking). InnoDB supports both row-level locks (row-level locking) and table-level locks. The default row-level locks are.

There are two modes of table-level locks for MySQL:

Table shared read lock (Table Read Lock)

Table exclusive write lock (Table Write Lock)

Locking in mysql is automatically optimized in the background, and it can also be manually added to understand the underlying features of mysql.

MyISAM lock

Summary of MyISAM shared read lock and exclusive write lock:

When reading a MyISAM table, it does not block other users' read operations to the same table, but it does block write operations to the same table.

The read operation on the MyISAM table will not block the current user from reading the table, and will be saved when the table is modified.

A session uses LOCK table to add a read lock to the table, and the session can be added, deleted, changed and queried. At that time, the current session can no longer access or update other tables.

Another session can query the table pair records, but the update will have to wait for the right time.

Writing to the MyISAM table blocks other users from reading and writing to the same table.

For write-to-write operations on MyISAM tables, the current session can CRUD tables, but it will report an error when operating on other tables.

Read demo:

Lock table testmysam READ

Start another session select * from testmysam to query

Insert into testmysam value (2); update testmysam set id=2 where id=1; error report

3. Insert into testmysam value (2) in another session; wait

4. Insert into testdemo value in the same session; report select * from testdemo; report error

5. In another session, insert into testdemo value (2, 2, 2, 5, 3, 3); success

6. Lock select s. * from testmysam s error report in the same session

Lock table table name as alias read

Check how many times the show status LIKE 'table_locks_waited' table has been locked

Write demo:

1.lock table testmysam WRITE in the same session insert testmysam value (3) (OK); delete from testmysam where id = 3 (OK); select * from testmysam (NO)

two。 Operate on different tables (error reports) select s. * from testmysam s; insert into testdemo value (2 minutes 2 minutes 3')

3. In other session (wait) select * from testmysam

InnoDB lock

Row locks are supported in MySQL's InnoDB engine.

Shared lock (read lock), when a transaction reads locks on certain lines, allows other transactions to read these lines, does not allow other transactions to write to these lines, and does not allow other transactions to put exclusive locks on these lines, but can be read locked.

Exclusive lock (write lock), when a transaction writes locks on several lines, no other transactions are allowed to write, but reads are allowed. What's more, other transactions are not allowed to put any locks on these lines, including write locks.

Syntax:

Shared lock writing method: lock in share mode

Select * from table where conditional lock in share mode

Top row other lock writing method: for update

Select * from table where conditional for update

InnoDB row lock

Two transactions cannot lock the same index

Insert,delete,update automatically adds exclusive locks to the transaction.

The row lock can only be realized with an index, otherwise it will not be a row lock if the whole table is locked automatically.

Demo:

BEGIN select * from testdemo where id = 1 for update

In another session, update testdemo set C1 ='1' where id = 2 successful update testdemo set C1 ='1' where id = 1 wait

BEGIN update testdemo set C1 ='1' where id = 1 wait in another session update testdemo set C1 ='1' where id = 1

BEGIN update testdemo set C1 ='1' where C1 ='1' in another session update testdemo set C1 ='2' where C1 ='2' waiting for C1 to have no index.

InnoDB table locks are not much different from MyISAM, but the table is unlocked when a new transaction is started.

Extension: the system allows for a period of time, a large amount of data, system upgrade, A table to add fields, day and night concurrency are large, how to modify the table structure.

Create an A1 that is the same as table A but whose information is empty

Modify the data of A1, and then copy the A data into A1.

Create a trigger in table A to automatically update all the new data in table A to the table.

After copy is finished, you can automatically rename.

The above steps automate the implementation of available tools pt-online-schema-change

Business

MySQL transactions are mainly used to deal with data with large amount of operations and high complexity. For example, in the personnel management system, if you delete a person, you need to delete not only the basic information of the person, but also the information related to the person, such as mailboxes, articles, etc., so that these database operation statements constitute a transaction!

Implicit transactions: transactions have no obvious sign of opening or ending. In mysql, autocommit is turned on by default.

Check to see if transactions are supported under the database

Show engines

View mysql's current default storage engine

Show variables like'% storage_engine%'

View the storage engine for a table

Show create table table name

Modify table storage structure

Create table () type=InnoDB

Alter table table type=InnoDB

Generally speaking, a transaction must satisfy four conditions (ACID): atomicity (Atomicity, or indivisibility), consistency (Consistency), isolation (Isolation, also known as independence), and persistence (Durability).

Atomicity: all operations in a transaction are either completed or not completed, and do not end at some point in the middle. An error occurs during the execution of a transaction and is Rollback back to its state before the transaction starts, as if the transaction had never been executed.

Consistency: before the transaction starts and after the transaction ends,. This means that the data written must be completed and the integrity of the database is not compromised in accordance with all the preset rules, including the accuracy and concatenation of the data, and the subsequent database can spontaneously complete the scheduled work.

Isolation: the ability of a database to allow multiple concurrent transactions to read, write and modify its data at the same time. Isolation can prevent data inconsistencies caused by cross execution when multiple transactions are executed concurrently. Transaction isolation is divided into different levels, including read uncommitted (Read uncommitted), read commit (read committed), repeatable read (repeatable read) and serialization (Serializable).

Persistence: after the end of the transaction, the data will not be lost, even if the system modification is a permanent failure.

Key point: isolation

There are four isolation levels of transactions under mysql, and the order from low to high is Read uncommitted, Read committed, Repeatable read (default) and Serializable. The last three levels of these four levels can solve the problems of dirty reading, non-repeatable reading and phantom reading one by one.

Dirty read: transaction A reads the data updated by transaction B, and then B rolls back, so A reads that the data is dirty.

Can not be read repeatedly, transaction A reads the same data many times, and transaction B updates and commits the data in the process of multiple reading of transaction A, resulting in inconsistent results when transaction A reads the same data many times. Focus on modification. As long as you lock the lines that meet the conditions.

Illusion: administrator Xiao Wang divides the scores of middle school students in the database into ABCDE grades according to scores, but someone inserts a specific score during execution, causing A to find that a record has not been modified, just like a hallucination. Focus on adding or deleting tables to lock!

Read uncommitted did not submit read

All transactions can see the data that did not commit the transaction. The performance is the best, and there is almost nothing transactional.

Read committed submitted for reading

There may be multiple read differences!

Repeatable repeat reading

When multiple instances of the same transaction read data, you must wait until other operations such as the current data transaction are completed. The default level of mysql is.

Serializable serializable

Sort forcefully, adding a shared lock on each read row. It will lead to a lot of timeouts and lock competition.

As for how to avoid looking at the table above. Different transaction levels can solve different problems.

Transaction usage:

1. Implement it with BEGIN, ROLLBACK and COMMIT.

BEGIN starts a transaction

ROLLBACK transaction rollback

COMMIT transaction confirmation

2. Directly use SET to change the automatic submission mode of MySQL:

SET AUTOCOMMIT=0 forbids automatic submission

SET AUTOCOMMIT=1 enables autocommit (system default)

Business design logic design paradigm design

The first paradigm

The first paradigm is that properties are inseparable and each field should be inseparable.

Atomicity. For example, a field is the name (NAME), in China, it is usually understood that the name is an inseparable unit, which is in line with the first paradigm; but in foreign countries, it is also divided into FIRST NAME and LAST NAME, then the name field can also be split into smaller units of the field, it is not in line with the first paradigm.

The second largest paradigm

There should be a primary key in the table, and other fields in the table depend on the primary key.

The second paradigm just needs to be remembered. Primary key constraint for example, if there is a table that is a student table, and there is a unique field student number in the student table, then all other fields in the student table can be obtained according to this student number field, depending on the meaning of the primary key, that is, the relevant meaning. Because the value of the student number is unique, it will not cause the problem of misalignment of the stored information, that is, the name of student 001 will not be stored in student 002.

The third paradigm

The third paradigm requires that there can be no fields in the table that exist in other tables and store the same information.

Usually the implementation is to establish the association through the foreign key, so the third paradigm just needs to remember the foreign key constraint. For example, if a table is a student table, and there are fields such as student number and name in the student table, then if his department number, dean and dean are also stored in this student table, it will result in a large amount of data redundancy. One is that the information already exists in the department information table, and the other is that if there are 1000 students in the department, the information will be saved 1000 times. Therefore, the practice of the third paradigm is to add a field of department number (foreign key) to the student table and associate it with the department information table.

Advantages and disadvantages of stylized design

Advantages:

Data redundancy can be reduced as much as possible

A normalized update operation is faster than anti-normalization.

Normalized tables are usually smaller than anti-normalized tables.

Disadvantages:

Multiple tables need to be associated with a query

It is more difficult to optimize the index

Anti-paradigm design

The anti-paradigm is aimed at the paradigm. The paradigm of database design is introduced earlier.

The so-called anti-normalization is to appropriately violate the requirements of the database design paradigm for the sake of performance and reading efficiency.

A small amount of redundancy is allowed, in other words, anti-normalization is to use space in exchange for time.

Advantages and disadvantages of anti-stylized design

Advantages:

You can reduce the association of tables

It can optimize the index better.

Disadvantages:

There are data redundancy and abnormal data maintenance.

The modification of the data requires more cost

Physical design

The storage structure of the logical model is designed according to the characteristics of the selected relational database.

Naming convention

The naming of databases, tables and fields should follow the principle of readability

Library object names formatted in upper and lower case for good readability

For example, use custAddress instead of custaddress to improve readability.

The naming of databases, tables and fields should follow the principle of ideality.

The name of the object should describe the object it represents

For example, the name of the table should reflect the content of the data stored in the table, and the function of the stored procedure should be reflected in the stored procedure.

The naming of databases, tables and fields should follow the principle of long name

Use as little or no abbreviations as possible

Storage engine specification

Choose the appropriate storage engine according to the business rules, and the advantages and disadvantages of different engines are also discussed. InnoDB or MyISAM is generally used in the work.

Data type specification

Select the appropriate byte type for the fields in the table, when a column can select multiple data types

Give priority to numeric types

Followed by the type of date and time

Finally, there are character types.

For data types of the same level, priority should be given to data types that take up less space.

Key points:

For things with high precision, such as money, using the decimal type, float,double will not be considered because they are prone to errors.

The difference and Choice between datetime and timestamp

Timestamp, which occupies only 4 bytes, automatically retrieves the current time zone and converts it, but cannot save NULL.

Datetime is stored in 8 bytes, does not search the time zone, and can store NULL

Thank you for your reading, the above is the content of "MySQL architecture and storage engine, lock, transaction, design analysis". After the study of this article, I believe you have a deeper understanding of MySQL architecture and storage engine, lock, transaction, design and analysis, and the specific use needs to be verified in practice. Here is, the editor will push for you more related knowledge points of the article, welcome to follow!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.