Mysql Learning 6: chapter 3 MYSQL Architecture and Storage engine 07/13 Update SLTechnology News&Howtos

Mysql Learning 6: chapter 3 MYSQL Architecture and Storage engine

2025-07-13 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Database >

Shulou(Shulou.com)06/01 Report--

1. Chapter 3 MYSQL architecture and storage engine 1.1. Mysql architecture

The architecture is divided into two layers:

L mysql server layer: including connection layer, SQL layer

L Storage engine layer

1.2. Query cache details and shutdown

Query cache can only cache static data. Suitable for data warehouse.

It is enabled by default before 5.6 and disabled by default after 5.6.

Completely shut down query cache

Two core parameters related to query_cache

Mysql > show variables like "query_cache_size%"

Mysql > show variables like "$query_cache_type%"

Query_cache_type must be set to not off in the first place

Note:

Stress testing tool sysbench is an open source, modular, cross-platform multithreading performance testing tool, which can test the performance of CPU, memory, disk IO, thread and database. , which supports mysql,oracle,postgresql

Download address: https://dev.mysql.com/downloads/benchmarks.html

For the specific testing process, see P44.

1.3. Storage engine

The storage engine MyISAM is recommended to deactivate and switch to the InnoDB storage engine.

1.4. InnoDB architecture 1.4.1. Database and database instance

Mysql database is a database of single-process and multi-thread model.

L instance is a combination of process and memory.

L thread: flushes memory data to the hard disk.

L disk files: redolog, data files, Undo log

1.4.2. Storage structure of InnoDB

InnoDB logical storage unit is mainly divided into table space, segment, area, page.

Hierarchical relationship: tablespace > segment > extent (64 page,1M) > page

Tablespace

System tablespace; ibdata1

Shared tablespaces independent tablespaces are better

Independent tablespace: each table has its own tablespace, which can be transferred and recycled easily. Alter table tablename engine=innodb or pt-online_schema_change is fine.

Shared tablespaces cannot be recycled online and need to be exported, deleted and then imported. Statistical analysis and log classes are not suitable.

Temporary tablespace and general tablespace have been added after mysql 5.7.

Segment

Data segment, rollback segment, index segment.

Each segment consists of N sections and 32 scattered pages.

Creating an index creates two segments, a non-leaf node and a leaf node segment.

A table has four segments, which is twice the number of indexes.

Zone

A contiguous composition of pages, a physically continuously allocated piece of space, each with a fixed size of 1m.

Page

The minimum physical storage allocation unit of InnoDB is page, which has data pages and rollback pages. In general, an area consists of 64 consecutive pages, and pages default to 16KB.

Mysql5.6 can be lowered to 8KB or 4KB

Mysql5.7 can be adjusted to 32KB or 64KB

Page records the information recorded by the row

The InnoDB storage engine is column-oriented, that is, data is stored in rows, and row record data is stored in row format.

The InnoDB storage engine has two file formats

1. Antelope: there are two line record formats: corrpact and redundant

2. Barracuda: there are two row record formats: commpressed and dynamic.

Row overflow: the data that needs to be stored is split into multiple pages for storage in addition to the current storage page.

Redundant: the earliest line record format, which consumes more storage space than compact, and is not recommended.

Commpressed: compress the row format, compress the database and index pages on the physical storage level, the memory is not compressed, and the call to memory needs to be converted, which consumes CPU. The compression ratio is only 1 Universe 2, which is not recommended.

Compact: the overflowed column holds only 768 prefix bytes.

Dynamic: the default row record format of the new version. The overflow data is stored in the overflow page, and only the pointer is stored in the data page. The new page where the overflow column is located is highly utilized and is recommended.

Mysql5.7 defaults to dynamic line record format and Barracuda file format

You can view it by viewing the parameter row_format

Show table status like'% user%'\ G

Row_format

View file format

Show variables like'% innodb_file%'

Innodb_file_format

1.4.3. Memory structure

Memory is divided into SGA (system global area) and PGA (program cache area).

View configuration parameters: show variables like'% buffer%'

For the introduction of parameters, see p54

SGA (system Global area)

Innodb_buffer_pool: cache data, indexes, insert buffers, data dictionaries and other information of InnoDB tables

Innodb_log_buffer: the buffer of the transaction in memory, that is, the size of the redo log buffer

Query Cache: high-speed query cache, which is recommended to be turned off in production environment. Only static data can be cached. Suitable for data warehouse.

Key_buffer_size: only used for MyISAM storage engine table, cache MyISAM storage.

Innodb_additional_mem_pool_size: memory pool size that holds data dictionary information and other internal data structures, removed in 5.7.4.

SGA (program buffer)

Sort_buffer_size: used for temporary sorting of SQL statements in memory.

Join_buffer_size: table join use, for BKA

Read_buffer_size: the cache of table sequential scans, which can only be used by the MyISAM storage engine.

Read_rnd_buffer_size: random read buffer size, used for mrr.

Special

Tmp_table_size:SQL statements use temporary tablespaces without using indexes when sorting or grouping.

Max_heap_table_size: manage heap, memory storage engine tables.

Query parameters:

[mysql] > show variables like'%heap%'

[mysql] > show variables like'%tmp_table%'

The two parameters are based on the minimum, and it is recommended to set them to the same size.

Default_tmp_storage_engine: storage engine for temporary tables by default

Interal_tmp_disk_storage_engine: disk temporary table management, decision (create temporary table), 5.7 added.

1.4.4. Buffer status and its linked list structure

Page is the smallest IO unit of an InnoDB disk, which corresponds to a buffer in memory, and each buffer is divided into three states:

L free buffer:: is not used.

L clean buffer:buffer data is consistent with disk page data.

L dirty buffer: memory data has not been written to disk.

Three different buffer derived three linked lists

1.4.5. Each major refresh thread and its function

Master thread thread: background main thread with the highest priority. There are four cycles inside:

L main cycle loop

L cycle background loop in the background

L refresh circular flush loop

L pause the loop suspend loop.

Switch between 4 cycles according to the running status of the database. There are two more operations in the loop main loop, which are divided into:

Four IO threads

L read thread: database read and write request thread, default 4, can be expanded.

L write thread:

L redo log thread: the contents of the log buffer are flushed to redo log

L change buffer thread: flushes the contents of the inserted buffer to disk.

Page cleaner thread is the thread responsible for dirty page refresh. Show variables like'% innodb_page%'

Purge thread: responsible for deleting useless undo pages. Since the operation of DML statements will generate UNDO, the system needs to clean up undo pages on a regular basis, so purge operation is required.

Checkpoint: during the redo log switch, performing the checkpoint,redo log switch triggers the flushing of dirty pages to disk.

Error monitor thread: the monitoring thread responsible for reporting database errors

Lock monitor thread is responsible for monitoring the thread of the lock.

1.4.6. Memory refresh mechanism

Three partial memory refresh

Redo log buffer, data buffer, binlog cache

Redo log

Redo the log file, record the changes in the transaction operation, and record the value of the modified data, regardless of whether the transaction is committed or not. At least 2 redo log, and the disk is named after ib_logfile (0Murn). It affects the condition of redo log buffer refreshing to disk.

a. Through innodb_flush_log_at_trx_commit control, they are 0pm 1pm 2, respectively.

0:redo log thread triggers redo log buffer and data write disk every 1 second, but transaction commit does not initiate refresh.

1: every time a transaction commits, redo log buffer is triggered and data is written to disk.

2: every time the transaction commits, redo log buffer writes are triggered, but the disk is not refreshed.

B.mater thread: refresh per second

C.redo log buffer, using more than one and a half to trigger refresh.

Binlog

The DML statement writes both redo log and binlog files.

Binlog is a binary log file called mysql, which is used for backup recovery and master-slave replication. The sync_binlog parameter determines the refresh condition

0: after the transaction is committed, the binlog_cache information is not flushed to disk immediately, but the filesystem is left to decide to synchronize.

Binlog_cache: flushes to disk every n times after the transaction is committed.

The double one mode to ensure database security: sync_binlog=1,innodb_flush_log_at_trx_commit=1

The difference between redo log and binlog

First, the recording content is different.

L binlog is a logical log that records changes to all data.

L redo log is a physical log that records changes in all InnoDB table data.

Second, the time of recording content is different.

L binlog records the DML and DDL SQL statements after the commit is finished.

L redo log records the DML and DDL SQL statements after the transaction is initiated.

Third, different ways of using files

L binlog is not recycled. A new file will be generated after it is full or the instance is restarted.

L relo log is recycled, and the last one is full and then the first one.

Fourth, the functions are different.

L binlog can be used to recover data and build master-slave replication.

L redo log is used for data recovery after abnormal shutdown or media failure.

MySQL two-phase commit process:

The two-phase commit is divided into prepare and commit phases.

L preparation phase (transaction prepare): the transaction SQL is first written to redo log buffer, then a transaction preparation flag is made, and then the data in log buffer is refreshed to redo log.

Commit phase (commit): writes the binlog generated by the transaction to a file and brushes it to disk.

L then make a transaction commit flag in redo log and write the tag that was successfully written by binlog to the redo log file.

Summary: as long as the binlog write is complete, the transaction completes normally in the master-slave replication environment.

Refresh mechanism of dirty pages

When l redo log is switched, checkpoint is executed, which triggers the refresh of dirty pages.

L is controlled by the innodb_max_dirty_pages_pct parameter to represent 100% of the dirty page in the buffer pool. The threshold is reached to start the refresh. The default is 75%, and it is recommended to set 25% Mui 50% to avoid affecting performance later.

L the number of dirty pages per second is controlled by innodb_adaptive_flushing. Intelligent control is enabled by default.

1.4.7. Three characteristics of InnoDB

The three features are: insert buffer (change buffer), twice write (double write), and adaptive hash index (adaptive hash index).

Insert buffer:

Change the DML operation on the ordinary index from random IO to sequential IO to improve the efficiency of IO.

Principle: 1. First judge whether the inserted ordinary index page is in the buffer pool, if not, insert it directly; if not, put it into change buffer first, then merge change buffer and ordinary index, and merge multiple inserts into one operation to improve the insertion performance of ordinary index.

Parameters involved:

[mysql] > show variables like'%change%'

+-+ +

| | Variable_name | Value |

+-+ +

| | innodb_change_buffer_max_size | 25 | |

| | innodb_change_buffering | all |

| | session_track_state_change | OFF |

+-+ +

Innodb_change_buffer_max_size: the largest proportion of innodb_buffer_pool. The default is 25%, and 50% is recommended.

There are several types of innodb_change_buffering:change buffer.

L all: buffer all insert,delete tag operations and purge operations. It is recommended to choose the default all.

L none: close insert buffer

L inserts:insert tag operation

L delete:delete tag operation

L changes: no actual insert and delete, just mark and wait for subsequent purge

L purges: buffer the purges (physical deletion) operation of the background process.

Write twice (dourble write)

Insert buffering brings about an improvement in insert performance for ordinary indexes, while double write ensures inhaled security and prevents system downtime. InnoDB has a data page partial write (partial page write) problem. The redo log file records the physical record of the page. If the page is damaged, it can not be recovered. Use binlog to restore the original page, and then restore it through redo log.

The double write buffer is located in the storage area of the system tablespace in the following order:

Read write double write buffer before dirty pages in the InnoDB buffer pool are written to the data file.

L then write 1MB size data to disk shared tablespace (double write) twice from double write buffer.

Finally, write the data file from double write buffer.

Double write buffering does not take twice as much as IO,double is a large continuous fast that notifies the operating system through a fsync ().

Self-use hash indexing

InnoDB can monitor the search of the index, and if you notice that the query can be optimized by establishing a hash index, it will be completed automatically. It is controlled by innodb_adaptive_hash_index parameters.

Since 5.7.8, the adaptive hash indexing search system is partitioned, each index is bound to a special partition, and each partition has its own latch for maintenance. The partition is controlled by innodb_adaptive_hash_index_parts parameters, with a default value of 8 and a maximum of 512. through partition setting, contention can be reduced and concurrency can be improved.

The semaphores section outputted through the show engine innodb status command monitors the use of self-used hash indexes and their competition. If you see many threads waiting for a RW-latch created in btr0sea.c, it may be used to disable adaptive hash indexes.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.