
What is the principle of MySQL architecture?


This article mainly explains the principles of MySQL architecture. The approach introduced here is simple, direct and practical, so let's walk through it together.

Principles of MySQL architecture

1. MySQL architecture

MySQL Server architecture can be divided into network connection layer, service layer, storage engine layer and system file layer from top to bottom.

Network connection layer

Client Connectors: provide support for clients to establish connections with the MySQL server. Almost all mainstream server-side programming technologies are supported, such as Java, C, Python, .NET and so on, each of which connects to MySQL through its own API.

Service layer (MySQL Server)

The service layer is the core of MySQL Server, which mainly includes six parts: system management and control tools, connection pool, SQL interface, parser, query optimizer and cache.

Connection pool (Connection Pool): responsible for storing and managing client connections to the database, and a thread is responsible for managing a connection.

System management and control tools (Management Services & Utilities): such as backup and recovery, security management, cluster management, etc.

SQL API (SQL Interface): used to accept various SQL commands sent by the client and return the results that the user needs to query. Such as DML, DDL, stored procedures, views, triggers, and so on.

Parser (Parser): responsible for generating a "parsing tree" from the requested SQL parsing. Then further check whether the parsing tree is legal according to some MySQL rules.

Query optimizer (Optimizer): when the parse tree passes the parser syntax check, it is handed over to the optimizer to convert it into an execution plan and then interact with the storage engine.

select uid, name from user where gender = 1;

The optimizer applies a selection -> projection -> join strategy:

Selection: filter rows according to the where condition first, rather than querying all the data and then filtering it.

Projection: take only the uid and name attributes needed by the query, rather than fetching all fields.

Join: combine the selection and projection above to generate the final query result.
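As a quick illustration (a minimal sketch built on the example above; the table definition and the index name idx_gender are assumptions, not part of the original text), EXPLAIN shows which plan the optimizer chose:

-- assumed schema for the example; idx_gender is a hypothetical index
CREATE TABLE user (uid INT PRIMARY KEY, name VARCHAR(50), gender TINYINT, KEY idx_gender (gender));
EXPLAIN SELECT uid, name FROM user WHERE gender = 1;
-- the "key" column shows whether idx_gender was chosen; "rows" estimates how many rows will be examined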

Cache&Buffer: the caching mechanism consists of a series of small caches, such as the table cache, record cache, permission cache, engine cache and so on. If a query hits the query cache, the result can be returned directly from the cache without executing the statement again.

Storage engine layer (Pluggable Storage Engines)

The storage engine is responsible for the storage and extraction of data in MySQL and interacts with the underlying system files. The MySQL storage engine is plug-in, and the query execution engine in the server communicates with the storage engine through the interface, which shields the differences between different storage engines. There are many kinds of storage engines, each with its own characteristics, the most common of which are MyISAM and InnoDB.

System File layer (File System)

This layer is responsible for storing the data and logs of the database on the file system and completing the interaction with the storage engine, which is the physical storage layer of the file. Mainly includes log files, data files, configuration files, pid files, socket files and so on.

Log file

Error log (Error log): enabled by default.
show variables like '%log_error%'; // view the error log settings

General query log (General query log): records general query statements.
show variables like '%general%'; // whether the general log is enabled

Binary log (Binary log): records the change operations made to the MySQL database, along with each statement's timestamp and execution time; it does not record statements such as select and show that do not modify the database. It is mainly used for database recovery and master-slave replication.
show variables like '%log_bin%'; // whether it is enabled
show variables like '%binlog%'; // view the parameters
show binary logs; // view the log files

Slow query log (Slow query log): records all SQL whose execution time exceeds the configured threshold. The default threshold is 10 seconds.
show variables like '%slow_query%'; // whether it is enabled
show variables like '%long_query_time%'; // the threshold
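The slow query log can also be switched on at runtime; a minimal sketch (assumes a session with sufficient privileges, and the 1-second threshold is only an illustrative value):

SET GLOBAL slow_query_log = ON;        -- enable the slow query log
SET GLOBAL long_query_time = 1;        -- log statements slower than 1 second (applies to new connections)
SHOW VARIABLES LIKE '%slow_query%';    -- confirm the setting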

Configuration file

Used to store all the configuration information files of MySQL, such as my.cnf, my.ini, etc.

Data file

Db.opt file: records the default character set and collation used by the database.

Frm file: stores metadata (meta) information related to tables, including table structure definition information, and so on. Each table has an frm file.

MYD file: dedicated to the MyISAM storage engine, stores the data of the MyISAM table (data), and each table has a .MYD file.

MYI file: dedicated to the MyISAM storage engine, it stores the index-related information of the MyISAM table. Each MyISAM table corresponds to a .MYI file.

Ibd files and IBDATA files: data files (including indexes) that store InnoDB. InnoDB storage engine has two tablespace modes: exclusive tablespace and shared tablespace. Exclusive tablespaces use .ibd files to hold data, and each InnoDB table corresponds to an .ibd file. Shared tablespaces use .ibdata files, and all tables share one (or more, self-configured) .ibdata files.

Ibdata1 files: system tablespace data files, storing table metadata, Undo logs, etc.

Ib_logfile0, ib_logfile1 files: Redo log log files.

Pid file

Pid file is a process file of mysqld application in Unix/Linux environment. Like many other Unix/Linux server programs, it stores its own process id.

Socket file

Socket files are also available in the Unix/Linux environment. In the Unix/Linux environment, users can connect to MySQL directly using Unix Socket without going through the TCP/IP network.
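To see which socket file the server is using, a simple read-only check from any connected client:

SHOW VARIABLES LIKE 'socket';   -- path of the Unix socket file the server listens on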

2. MySQL operation mechanism

A connection to MySQL is established (Connectors & Connection Pool) through the client/server communication protocol. The communication mode between the MySQL client and the server is "half-duplex". For each MySQL connection there is always a thread state that identifies what the connection is currently doing.

Communication mechanism:

Full-duplex: data can be sent and received at the same time, like a phone call.

Half-duplex: at any point in time data is either being sent or being received, never both, like an early walkie-talkie.

Simplex: data can only be sent or only received, like a one-way street.

Thread status: show processlist; // view the thread information of running connections. The root user can see all threads; other users can only see their own.

Id: thread ID; the thread can be terminated with kill <Id>

User: the user who started the thread

Host: IP and port number of the client that sent the request

Db: the database against which the current command is executed

Command: the operation command the thread is executing

Time: how long the thread has been in its current state, in seconds

State: thread state

Info: the statement being executed by the thread; the first 100 characters are shown by default. Use show full processlist to see the full statement.

Common Command values:

Create DB: creating a database

Drop DB: dropping a database

Execute: executing a PreparedStatement

Close Stmt: closing a PreparedStatement

Query: executing a statement

Sleep: waiting for the client to send a statement

Quit: exiting

Shutdown: shutting down the server

Common State values:

Updating: searching for matching records and modifying them

Sleeping: waiting for the client to send a new request

Starting: beginning to process a request

Checking table: checking the data table

Closing table: flushing table data to disk

Locked: the record is locked by another query

Sending Data: processing a Select query and sending the results to the client

Query cache (Cache&Buffer): this is one place where MySQL can optimize queries. If the query cache is enabled and exactly the same SQL statement is found in it, the cached result is returned directly to the client; if the query cache is not enabled or no identical statement is found, the parser performs syntax and semantic analysis and generates a "parse tree".

The query cache stores the results of Select queries together with their SQL statements.

When a Select query is executed, the cache is checked first to determine whether there is a usable result set and whether the statement is exactly the same (including parameter values); only an exact match counts as a cache hit.

The following SQL cannot be cached even if query caching is turned on:

Query statements that use SQL_NO_CACHE

Results larger than the query_cache_limit setting

Queries containing non-deterministic functions, such as now()

show variables like '%query_cache%'; // check whether query caching is enabled, the cache size, limits, etc.
show status like 'Qcache%'; // view more detailed cache statistics: available cache space, cache blocks, cache size, etc.
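For example, a statement can opt out of the cache explicitly (using the sample user table assumed earlier; note that the query cache only exists in versions before MySQL 8.0):

SELECT SQL_NO_CACHE uid, name FROM user WHERE gender = 1;   -- always executed, never answered from or stored in the query cache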

The parser (Parser) parses the SQL sent by the client to generate a "parsing tree". The preprocessor further checks whether the parse tree is legal according to some MySQL rules, for example, it checks the existence of data tables and data columns, parses names and aliases to see if they are ambiguous, and finally generates a new parse tree.

The query optimizer (Optimizer) generates the optimal execution plan based on the parse tree. MySQL uses many optimization strategies to generate optimal execution plans, which can be divided into two categories: static optimization (compile-time optimization) and dynamic optimization (run-time optimization).

Common optimization strategies include:

Equivalent transformation strategy: rewrite conditions into equivalent forms that are easier to use, for example rewriting "5 < a" as "a > 5", or rewriting "a < b and a = 5" as "b > 5 and a = 5"; condition positions can also be adjusted based on a composite index.

Optimization of count, min and max functions: with the InnoDB engine, min() only needs to read the leftmost entry of the index and max() only needs to read the rightmost entry; with the MyISAM engine, count(*) is returned directly without scanning.

Terminating the query in advance: with a limit query, once the rows required by limit have been obtained, the remaining data is not traversed.

Optimization of in: MySQL first sorts the in list and then uses binary search to look up values, for example where id in (2, 1, 3) becomes in (1, 2, 3).

The query execution engine is responsible for executing the SQL statement. It obtains the query result by interacting, through the corresponding API interface, with the storage engine cache or physical files of whatever storage engine the table in the SQL statement uses, and returns the result to the client. If query caching is enabled, the SQL statement and its result are also saved in the query cache (Cache&Buffer), and the result is returned directly the next time the same SQL statement is executed.

At the beginning of execution, the executor first checks whether the user has permission to query the table; if not, a permission error is returned. (If the query cache is hit, permission verification is done when the cached result is returned; the precheck before the optimizer also verifies permissions.)

If the user has permission, the table is opened and execution continues. When the table is opened, the executor uses the interfaces provided by the engine defined for the table. Take the following statement as an example:

select * from test where age > 10;

The execution process of the executor is as follows:

Call the InnoDB engine API to fetch the first row of the table and check whether the age value is greater than 10; if not, skip it, otherwise store the row in the result set.

Call the engine interface to fetch the "next row" and repeat the same judgment logic until the last row of the table has been fetched.

The executor returns the record set composed of all the rows that meet the condition to the client as the result set.

If query caching is enabled, the query result is cached first.

If too many results are returned, they are returned to the client incrementally.

3. MySQL storage engine

The storage engine sits in the third layer of the MySQL architecture and is responsible for storing and extracting data in MySQL. It is a subsystem that deals with files: a file access mechanism customized against the abstract file-access interface provided by MySQL. This mechanism is called a storage engine.

Using the show engines command, you can view the engine information supported by the current database.

Before version 5.5, the MyISAM storage engine was used by default, and the InnoDB storage engine was adopted from 5.5 onwards.
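A quick way to see what is available and to pick an engine explicitly (the table name below is only an illustration):

SHOW ENGINES;                                      -- lists the supported engines; DEFAULT marks the default one
SHOW VARIABLES LIKE 'default_storage_engine';      -- the current default engine
CREATE TABLE demo_myisam (id INT PRIMARY KEY) ENGINE = MyISAM;   -- hypothetical table created with an explicit engine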

InnoDB: transaction support, commit, rollback and crash recovery capabilities, transaction security

MyISAM: does not support transactions and foreign keys, fast access speed

Memory: tables are created in memory, so access is very fast because the data lives in memory; Hash indexes are used by default, but the data is lost once the server shuts down.

Archive: archive type engine, which only supports insert and select statements

Csv: data is stored in CSV files. Due to file restrictions, all columns must be forced to specify not null. In addition, the CSV engine does not support indexes and partitions, so it is suitable for intermediate tables for data exchange.

BlackHole: a black hole engine; data only goes in and never comes out, so everything inserted disappears and nothing is saved.

Federated: tables in the remote MySQL database can be accessed. A local table that does not save data and accesses the contents of remote tables.

MRG_MyISAM: a combination of MyISAM tables, these MyISAM tables must have the same structure, the Merge table itself has no data, Merge operations can operate on a set of MyISAM tables

Comparison between InnoDB and MyISAM

Transactions and foreign keys

InnoDB supports transactions and foreign keys, has security and integrity, and is suitable for a large number of insert or update operations

MyISAM does not support transactions and foreign keys. It provides high-speed storage and retrieval, and is suitable for a large number of select query operations.

Locking mechanism

InnoDB supports row-level locks, locking the specified records. Row locks are implemented based on indexes.

MyISAM supports table-level locks to lock the entire table.

Index structure

InnoDB uses clustered indexes (clustered indexes), where indexes and records are stored together, caching both indexes and records.

MyISAM uses nonclustered indexes (non-clustered indexes), where indexes and records are separated.

Concurrent processing capacity

MyISAM uses table locks, which results in low concurrency for writes; reads do not block each other, but reads and writes block each other.

InnoDB read and write blocking can be related to isolation levels, and multi-version concurrency control (MVCC) can be used to support high concurrency

Storage file

An InnoDB table corresponds to two files: a .frm table structure file and an .ibd data file. An InnoDB table supports a maximum size of 64TB.

A MyISAM table corresponds to three files: a .frm table structure file, a .MYD table data file, and a .MYI index file. Starting with MySQL 5.0, the default size limit is 256TB.

Applicable scenarios

How to choose between the two engines?

InnoDB is suitable when:

Transaction support is required (it has good transactional characteristics)

Row-level locking adapts well to high concurrency

Data is updated relatively frequently

Data consistency requirements are high

The hardware has plenty of memory, so InnoDB's stronger caching ability can be used to improve memory utilization and reduce disk IO

MyISAM is suitable when:

No transaction support is required (it does not support transactions)

Concurrency is relatively low (because of the table-level locking mechanism)

Data is modified relatively rarely and is mainly read

Data consistency requirements are not high

Summary

Do you need transactions? Yes: InnoDB.

Are there concurrent modifications? Yes: InnoDB.

Do you mainly need fast queries with few data modifications? Yes: MyISAM.

In the vast majority of cases, InnoDB is recommended.

InnoDB storage structure

Since MySQL version 5.5, InnoDB has been the default engine; it is good at handling transactions and has automatic crash recovery. The official InnoDB engine architecture is mainly divided into two parts: the memory structure and the disk structure.

InnoDB memory structure

The memory structure mainly includes four components: Buffer Pool, Change Buffer, Adaptive Hash Index and Log Buffer.

Buffer Pool: the buffer pool, BP for short. BP manages data in units of Pages, 16KB each by default, and the underlying layer uses linked list data structures to manage the pages. When InnoDB accesses table records and indexes, they are cached in the buffer pool's pages so that later accesses can avoid disk IO and improve efficiency.

show variables like '%innodb_page_size%'; // view the page size
show variables like '%innodb_old%'; // view the old sublist parameters of the LRU list
show variables like '%innodb_buffer%'; // view the buffer pool parameters

It is recommended to set innodb_buffer_pool_size to 60%~80% of total memory; innodb_buffer_pool_instances can be set to more than one so that cache contention is avoided.
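A small sketch of inspecting and resizing the buffer pool at runtime (the 4GB figure is purely illustrative, not a recommendation; online resizing is available in MySQL 5.7 and later):

SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SET GLOBAL innodb_buffer_pool_size = 4294967296;   -- 4GB, illustrative value
SHOW STATUS LIKE 'Innodb_buffer_pool_pages%';      -- free / data / dirty page counts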

Page management mechanism

Pages can be divided into three types according to their status:

Free page: a free page that has not been used

Clean page: a page in use whose data has not been modified

Dirty page: a page in use whose data has been modified and is inconsistent with the data on disk

For these three page types, InnoDB maintains and manages them through three linked list structures:

Free list: the free buffer, managing free pages

Flush list: the buffer that needs to be flushed to disk, managing dirty pages sorted by modification time. Dirty pages exist in both the flush list and the LRU list, but the two do not affect each other: the LRU list manages page availability and eviction, while the flush list manages the flushing of dirty pages.

Lru list: the buffer in use, managing clean pages and dirty pages. The list is split at a midpoint: the front part is the new sublist, which stores frequently accessed data and accounts for about 63%; the rear part is the old sublist, which stores less-accessed data and accounts for about 37%.

Improved LRU algorithm maintenance:

Ordinary LRU: new data is added at the head of the list and pages are evicted from the tail when space is needed.

Modified LRU: the list is divided into new and old parts; new elements are inserted at the midpoint rather than at the head. If the data is accessed again soon, the page moves to the head of the new sublist; if it is not accessed, it gradually moves toward the tail of the old sublist and waits to be evicted.

Whenever a new page is read into the buffer pool, InnoDB checks whether there are enough free pages; if so, a free page is removed from the free list and placed in the LRU list. If there are no free pages, a page at the tail of the LRU list is evicted according to the LRU algorithm and its memory is allocated to the new page.

Buffer Pool configuration parameters can be viewed with the show variables commands above.

Change Buffer: write buffer, or CB for short. During the DML operation, if the BP does not have its corresponding Page data, the disk page is not immediately loaded into the buffer pool, but the CB records the buffer changes, and when the future data is read, the data is merged and restored to the BP.

ChangeBuffer occupies 25% of BufferPool space by default, and the maximum allowed is 50%. It can be adjusted according to the volume of read and write traffic. Parameter innodb_change_buffer_max_size

When updating a record, the record exists in BufferPool and is modified directly in BufferPool, a memory operation. If the record does not exist in BufferPool (not hit), a memory operation will be performed directly in ChangeBuffer without having to go to disk to query data and avoid a disk IO. The next time the record is queried, it will be read from disk, then read from ChangeBuffer and merged, and finally loaded into BufferPool.

Write buffer, which applies only to non-unique ordinary index pages

If the index setting is unique, InnoDB must do a uniqueness check when making changes, so you must query the disk and do an IO operation. The record is queried directly into the BufferPool and then modified in the buffer pool, not in the ChangeBuffer.
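The change buffer parameters can be inspected and tuned at runtime; a minimal sketch (the 30% value is only illustrative):

SHOW VARIABLES LIKE 'innodb_change_buffer%';       -- innodb_change_buffering and innodb_change_buffer_max_size
SET GLOBAL innodb_change_buffer_max_size = 30;     -- percent of the buffer pool allowed for the change buffer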

Adaptive Hash Index: adaptive hash index for optimizing queries against BP data. The InnoDB storage engine monitors the lookup of the table index, and if it is observed that the establishment of a hash index can lead to a speed increase, it builds a hash index, so it is called adaptive. The InnoDB storage engine automatically indexes certain pages based on the frequency and pattern of access.

Log Buffer: the log buffer holds data that is to be written to the log files (Redo/Undo) on disk, and its contents are flushed to the disk log files periodically. When the log buffer is full, it is automatically flushed to disk; when a large transaction with BLOB columns or multi-row updates is encountered, increasing the log buffer can save disk I/O.

LogBuffer is mainly used to record InnoDB engine logs; Redo and Undo logs are generated during DML operations.

When the LogBuffer space is full, it is automatically written to disk. The disk IO frequency can be reduced by increasing the innodb_log_buffer_size parameter.

The innodb_flush_log_at_trx_commit parameter controls the log flush behavior. The default is 1.

0: write the log file and flush every 1 second (write LogBuffer -> OS cache, then flush OS cache -> disk file); up to 1 second of data can be lost.

1: on transaction commit, write the log file and flush it to disk immediately. No data is lost, but IO operations are frequent.

2: on transaction commit, write the log file immediately, and flush it to disk every 1 second.
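Both parameters can be checked, and the flush policy changed, from a client session; a small sketch:

SHOW VARIABLES LIKE 'innodb_log_buffer_size';
SHOW VARIABLES LIKE 'innodb_flush_log_at_trx_commit';
SET GLOBAL innodb_flush_log_at_trx_commit = 1;     -- the default and safest setting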

InnoDB disk structure

The InnoDB disk structure mainly consists of Tablespaces, the InnoDB Data Dictionary, the Doublewrite Buffer, the Redo Log, and Undo Logs.

Tablespaces: used to store table structures and data. Tablespaces are divided into several types: the system tablespace, independent (file-per-table) tablespaces, general tablespaces, temporary tablespaces, and undo tablespaces.

System tablespace (The System Tablespace)

Contains the InnoDB data dictionary and the storage areas of the Doublewrite Buffer, Change Buffer and Undo Logs. By default, the system tablespace also contains table and index data created by any user in it. Because it is shared by multiple tables, it is a shared tablespace. Its data file is controlled by the parameter innodb_data_file_path, with the default value ibdata1:12M:autoextend (file name ibdata1, 12MB, auto-extending).

Independent tablespace (File-Per-Table Tablespaces)

On by default. An independent tablespace is a single tablespace created in its own data file rather than in the system tablespace. When the innodb_file_per_table option is enabled, each table is created in its own tablespace; otherwise InnoDB creates it in the system tablespace. Each such tablespace is represented by an .ibd data file, which is created in the database directory by default. Independent tablespace files support the dynamic and compressed row formats.

General tablespaces (General Tablespaces)

General tablespaces are shared tablespaces created with the CREATE TABLESPACE syntax. They can be created outside the MySQL data directory, can hold multiple tables, and support all row formats.

CREATE TABLESPACE ts1 ADD DATAFILE 'ts1.ibd' ENGINE=InnoDB; // create tablespace ts1
CREATE TABLE t1 (c1 INT PRIMARY KEY) TABLESPACE ts1; // add a table to tablespace ts1

Undo tablespaces (Undo Tablespaces)

An undo tablespace consists of one or more undo log files. Before MySQL 5.7, Undo occupied the shared area of the System Tablespace; from 5.7 onward, Undo can be separated from the System Tablespace.

The undo tablespaces used by InnoDB are controlled by the innodb_undo_tablespaces configuration option, which defaults to 0. A value of 0 means the system tablespace ibdata1 is used; a value greater than 0 means undo tablespaces undo_001, undo_002, and so on are used.

Temporary tablespace (Temporary Tablespaces)

There are two categories: session temporary tablespaces and a global temporary tablespace.

Session temporary tablespaces store user-created temporary tables and internal on-disk temporary tables.

The global temporary tablespace stores the rollback segments of user temporary tables. When the MySQL server shuts down normally or terminates abnormally, the temporary tablespace is removed and is recreated each time the server starts.
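A few quick, read-only checks of these settings from a client:

SHOW VARIABLES LIKE 'innodb_file_per_table';   -- whether new tables get their own .ibd file
SHOW VARIABLES LIKE 'innodb_undo%';            -- undo tablespace related options
SHOW VARIABLES LIKE 'innodb_data_file_path';   -- system tablespace data file definition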

Data dictionary (InnoDB Data Dictionary)

The InnoDB data dictionary consists of internal system tables that contain metadata for looking up objects such as tables, indexes, and table fields. Metadata is physically located in the InnoDB system tablespace. For historical reasons, the data dictionary metadata overlaps with the information stored in the InnoDB table metadata file (.frm file) to some extent.

Double write buffer (Doublewrite Buffer)

The doublewrite buffer is a storage area located in the system tablespace. Before pages of the buffer pool are flushed to their actual locations on disk, the data is first written to the doublewrite buffer. If the operating system, storage subsystem, or mysqld process crashes in the middle of a page write, InnoDB can find a good copy of the page in the doublewrite buffer during crash recovery. The doublewrite buffer is enabled by default in most cases; to disable it, set innodb_doublewrite to 0. When using the doublewrite buffer, it is recommended to set innodb_flush_method to O_DIRECT.

MySQL's innodb_flush_method parameter controls how the InnoDB data files and redo log are opened and written. It has three values: fdatasync (the default), O_DSYNC, and O_DIRECT. Setting O_DIRECT means data file writes tell the operating system not to cache the data and not to use read-ahead, writing directly from the InnoDB buffer to the disk file.

The default fdatasync means writing to the operating system cache first and then calling fsync() to flush the cached data of the data files and redo log.
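Both settings can be verified from a session (read-only checks):

SHOW VARIABLES LIKE 'innodb_doublewrite';      -- ON by default
SHOW VARIABLES LIKE 'innodb_flush_method';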

Redo log (Redo Log)

The redo log is a disk-based data structure used to correct data written by incomplete transactions during crash recovery. MySQL writes to the redo log files in a circular manner, recording all changes made to the Buffer Pool in InnoDB. If an instance failure (such as a power outage) prevents the modified data from reaching the data files, the database replays the redo log on restart to bring the data files up to date. Redo log is generated continuously as read-write transactions execute. By default, the redo log is physically represented on disk by two files named ib_logfile0 and ib_logfile1.

Undo log (Undo Logs)

The undo log is a backup of modified data that is saved before the transaction starts and is used to roll back the transaction in exceptional cases. The undo log is a logical log and is recorded according to each line of record. The undo log exists in the system tablespace, undo tablespace, and temporary tablespace.

Evolution of new version structure

MySQL version 5.7

The Undo log tablespace is separated from the shared tablespace ibdata file, and the file size and number can be specified by the user when installing MySQL.

Added temporary temporary tablespace, which stores data from temporary tables or temporary query result sets.

The Buffer Pool size can be changed dynamically without the need to restart the database instance.

MySQL version 8.0

The data dictionary and Undo of the InnoDB table are completely separated from the shared tablespace ibdata. In the past, it was necessary to have the data dictionary in ibdata consistent with the data dictionary in the independent tablespace ibd file, but not in version 8.0.

Temporary temporary tablespaces can also be configured with multiple physical files, all of which are InnoDB storage engines and can create indexes, which speeds up processing.

Users can set up some tablespaces like Oracle databases, each tablespace corresponds to multiple physical files, each tablespace can be used by multiple tables, but a table can only be stored in one tablespace.

Doublewrite Buffer is also separated from the shared tablespace ibdata.

InnoDB thread model

IO Thread

In InnoDB a large amount of AIO (Async IO) is used for read and write processing, which can greatly improve database performance. There are 10 IO Threads in InnoDB: 4 write threads, 4 read threads, 1 insert buffer thread and 1 log thread.

Read thread: responsible for read operations, loading data from disk into cached pages (4 threads)

Write thread: responsible for write operations, flushing dirty cache pages to disk (4 threads)

Log thread: responsible for flushing the contents of the log buffer to disk (1 thread)

Insert buffer thread: responsible for flushing the write buffer to disk (1 thread)

Purge Thread

After the transaction commits, the undo logs it uses will no longer be needed, so Purge Thread will need to recycle the allocated undo pages.

show variables like '%innodb_purge_threads%';

Page Cleaner Thread

Its function is to flush dirty data to disk; once the dirty pages have been flushed, the corresponding redo log can be overwritten, which keeps data in sync and achieves the purpose of recycling the redo log. It invokes the write threads to do the flushing.

show variables like '%innodb_page_cleaners%';

Master Thread

Master thread is the main thread of InnoDB, with the highest priority, and is responsible for scheduling the other threads. Its function is to asynchronously flush the data in the buffer pool to disk to ensure data consistency, including: dirty page flushing (page cleaner thread), undo page recycling (purge thread), redo log flushing (log thread), and merging of the write buffer. Internally there are two main loops, processed every 1 second and every 10 seconds respectively.

Operations every 1 second:

Flush the log buffer to disk

Merge the write buffer data, deciding whether to do so according to the current IO read/write pressure

Flush dirty pages to disk, but only when the proportion of dirty pages reaches 75% (controlled by innodb_max_dirty_pages_pct and innodb_io_capacity)

Operations every 10 seconds:

Flush dirty page data to disk

Merge write buffer data

Flush the log buffer

Delete useless undo pages
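The two thresholds mentioned above can be checked with read-only commands:

SHOW VARIABLES LIKE 'innodb_max_dirty_pages_pct';  -- dirty page ratio that triggers flushing
SHOW VARIABLES LIKE 'innodb_io_capacity';          -- IO capacity hint used for background flushing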

InnoDB data files

InnoDB data file storage structure

The storage hierarchy is: ibd data file -> Segment -> Extent -> Page -> Row.

Tablespace: the tablespace, which stores the ibd data files holding the records and indexes of tables. A file contains multiple segments.

Segment: used to manage multiple extents, divided into data segments (Leaf node segment), index segments (Non-leaf node segment) and rollback segments (Rollback segment). A table has at least two segments, one managing data and one managing the index; each additional index adds two more segments.

Extent: always contains 64 contiguous pages, 1MB in size. When the tablespace runs out of room and new page resources need to be allocated, InnoDB does not allocate one page at a time but allocates an entire extent.

Page: used to store multiple Row records, 16KB in size. There are many page types, such as data pages, undo pages, system pages, transaction data pages, and large BLOB object pages.

Row: contains the record's field values, the transaction ID (Trx id), the roll pointer (Roll pointer), field pointers (Field pointers) and other information.

Page is the most basic unit of a file; whatever its type, a page is composed of a page header, a page trailer and a page body.

InnoDB file storage format

File format (File-Format)

In early versions of InnoDB there was only one file format; as the InnoDB engine developed, new file formats appeared to support new features. Currently InnoDB supports two file formats: Antelope and Barracuda.

Antelope: the original, previously unnamed InnoDB file format, which supports the COMPACT and REDUNDANT row formats. MySQL 5.6 and earlier default to Antelope.

Barracuda: the newer file format. It supports all InnoDB row formats, including the newer COMPRESSED and DYNAMIC row formats.

The innodb_file_format configuration parameter sets the InnoDB file format; the default changed from Antelope to Barracuda in version 5.7.

In general, if row_format is REDUNDANT or COMPACT, the file format is Antelope; if row_format is DYNAMIC or COMPRESSED, the file format is Barracuda.

View the file format of a table through information_schema:

select * from information_schema.innodb_sys_tables;

Or view it with the SHOW TABLE STATUS command.
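For example, checking one table's row format (the table name user is carried over from the earlier example; on MySQL 8.0 the dictionary view is named innodb_tables instead of innodb_sys_tables):

SHOW TABLE STATUS LIKE 'user';   -- the Row_format column shows the row format in use
SELECT name, row_format FROM information_schema.innodb_sys_tables WHERE name LIKE '%user%';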

Row format (ROW_FORMAT)

The row format of a table determines how its rows are physically stored, which in turn affects the performance of queries and DML operations. Fitting more rows into a single page means queries and index lookups work faster, less memory is needed in the buffer pool, and less I/O is required to write out updates.

The InnoDB storage engine supports four row formats: REDUNDANT, COMPACT, DYNAMIC, and COMPRESSED.

The new DYNAMIC and COMPRESSED formats introduce features such as data compression, off-page storage of enhanced long-column data, and large index prefixes.

The data of each table is stored in several pages, and each page is stored in a B-tree structure.

If some field information is too long to be stored in the B-tree node, the space will be allocated separately, which is called the overflow page, and this field is called the off-page column.

REDUNDANT row format

With the REDUNDANT row format, the table stores the first 768 bytes of variable-length column values in the index record of the B-tree node, and the rest is stored on overflow pages. Fixed-length fields greater than or equal to 768 bytes are converted to variable-length fields so that they can be stored off-page.

COMPACT row format

Compared with the REDUNDANT row format, the COMPACT row format reduces row storage space by about 20%, at the cost of higher CPU usage for some operations. If the system load is limited by cache hit rate and disk speed, the COMPACT format is probably faster; if the load is limited by CPU speed, it may be slower.

DYNAMIC row format

With the DYNAMIC row format, InnoDB stores long variable-length column values in the table entirely off-page, and the index record contains only a 20-byte pointer to the overflow page. Fixed-length fields greater than or equal to 768 bytes are encoded as variable-length fields. The DYNAMIC row format supports large index prefixes of up to 3072 bytes, which can be controlled by the innodb_large_prefix parameter.

COMPRESSED row format

The COMPRESSED row format provides the same storage features and functions as the DYNAMIC row format, but adds support for compressing table and index data.

When tables and indexes are created, the file format is used for each InnoDB table data file (whose name matches *.ibd). The way to change the file format is to re-create the table and its indexes; the easiest way is to run the following command for each table to be modified:

ALTER TABLE table_name ROW_FORMAT=<format type>;

Undo Log

Introduction to Undo Log

Undo: means to undo or cancel, returning an operation that specifies a certain state for the purpose of undoing an operation.

Undo Log: before the database transaction starts, the modified records are stored in the Undo log. When the transaction is rolled back or the database crashes, the Undo log can be used to undo the impact of the uncommitted transaction on the database.

Undo Log generation and destruction: the Undo Log is generated before the transaction starts; when the transaction commits, the Undo Log is not deleted immediately. Instead, InnoDB puts the transaction's undo log onto a delete list, and it is later recycled by the background purge thread. The Undo Log is a logical log that records the change process: for example, when a delete is executed, the undo log records an insert; when an update is executed, the undo log records the opposite update.

Undo Log storage: the undo log is managed and recorded in segments. The InnoDB data file contains a rollback segment, which internally contains 1024 undo log segments. Undo log storage can be controlled by the following group of parameters:

show variables like '%innodb_undo%'; // related parameters

Undo Log functions

Realize the atomicity of the transaction

Undo Log is the product of realizing the atomicity of transactions. During a transaction, if an error occurs or the user executes the ROLLBACK statement, MySQL can use the backup in Undo Log to restore the data to the state it was before the transaction started.
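A tiny sketch of this in action (using the example user table assumed earlier): an uncommitted change can be undone because the pre-change row image is kept in the undo log.

BEGIN;
UPDATE user SET name = 'temp' WHERE uid = 1;   -- the old row image is written to the undo log first
ROLLBACK;                                      -- InnoDB uses the undo log to restore the row to its pre-transaction state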

Implement multi-version concurrency control (MVCC)

Undo Log is used in the MySQL InnoDB storage engine to implement multi-version concurrency control. Before the transaction is committed, Undo Log saves the pre-committed version data, and the data in Undo Log can be read by other concurrent transactions as snapshots of the old version of the data.

Transaction A manually opens the transaction and performs the update operation. First, the data hit by the update will be backed up to Undo Buffer.

Transaction B manually opens the transaction, performs the query operation, reads the Undo log data and returns, and performs snapshot reading.

Redo Log and Binlog

Redo Log log

Redo Log introduction

Redo: as the name implies, it means redo. For the purpose of restoring the operation, reproduce the operation when an accident occurs in the database.

Redo Log: when a transaction modifies any data, the latest version of that data is recorded and stored in a backup; this record is called the redo log.

Generation and release of Redo Log: Redo Log is generated as the transaction's operations execute. When the transaction commits, the generated Redo Log is written into the Log Buffer; it is not necessarily flushed to the disk file at the moment of commit. Once the dirty pages modified by the transaction have been written to disk, the mission of the Redo Log is complete and the space it occupies can be reused (overwritten).

How Redo Log works

Redo Log is the product of transaction persistence. It guards against the case where dirty pages have not yet been written to the table's IBD file at the moment a failure occurs: when the MySQL service restarts, the data is redone according to the Redo Log, which achieves the persistence property of transactions.

Redo Log write mechanism

The content of the Redo Log file is written to the file in a sequential loop, and when it is full, it goes back to the first file to overwrite.

Write pos is the location of the current record, move back while writing, and then go back to the beginning of file 0 after writing to the end of the last file.

Checkpoint is the current location to be erased, and it is also pushed back and looped. Before erasing the record, update the record to the data file.

The space between write pos and checkpoint can be used to record new operations. If write pos catches up with checkpoint, it means that it is full, and you can't perform any new updates at this time. You have to stop and erase some records and push checkpoint.

Redo Log related configuration parameters

show variables like '%innodb_log%';

Each InnoDB storage engine has at least 1 redo log filegroup (group), and each filegroup has at least 2 redo log files, default to ib_logfile0 and ib_logfile1. You can control Redo Log storage through the following set of parameters:

The policy for persistence of Redo Buffer to Redo Log can be set through Innodb_flush_log_at_trx_commit:

0: commit Redo buffer-> OS cache-> flush cache to disk per second, and transaction data within one second may be lost. The background Master thread performs the operation every 1 second.

1 (default): execute Redo Buffer-> OS cache-> flush cache to disk every transaction commit, which is the safest and worst performance way.

2: each transaction commit executes Redo Buffer-> OS cache, and then the background Master thread executes the OS cache-> flush cache to disk operation every 1 second.

A value of 2 is generally recommended: if only the MySQL process crashes no data is lost, and only if the whole server goes down can up to 1 second of committed transactions be lost.

Binlog log

Redo Log is a log unique to the InnoDB engine; MySQL Server also has its own log, the Binary log, Binlog for short. Binlog is a binary log that records all changes to database table structures and table data, but does not record operations such as SELECT and SHOW. Binlog records are stored as events and include the time each statement took to execute. There are two most important usage scenarios for enabling the Binlog:

Master-slave replication: enable Binlog on the master so that the master can pass the Binlog to the slaves; the slaves replay the Binlog to keep master and slave data consistent.

Data recovery: restore data with the mysqlbinlog tool.

Binlog record mode

The Binlog file name defaults to the format "hostname_binlog-sequence number", for example oak_binlog-000001, and a name can also be specified in the configuration file. There are three record modes: ROW, STATEMENT and MIXED, with the following meanings.

ROW (row-based replication, RBR): the log records how each row of data is changed, and the slave applies the same changes to the same rows.

Advantages: the modification details of every row are recorded clearly, so master-slave data synchronization and data recovery can be realized completely.

Disadvantages: batch operations generate a large volume of logs; in particular, alter table can make the log balloon.

STATEMENT (statement-based replication, SBR): every SQL statement that modifies data is recorded in the master's Binlog; during replication the slave parses and re-executes the same SQL that ran on the master. Abbreviated as SQL statement replication.

Advantages: small log volume, less disk IO, faster storage and recovery.

Disadvantages: in some cases it leads to inconsistency between master and slave data, for example with last_insert_id(), now(), and so on.

MIXED (mixed-based replication, MBR): a mixture of the two modes above. STATEMENT mode is generally used to save the binlog, and ROW mode is used for operations that cannot be replicated correctly in STATEMENT mode; MySQL selects the write mode according to the SQL statement being executed.
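The current record mode can be checked or changed from a session (changing it globally requires the appropriate privilege, and a configuration-file setting is still needed to make it permanent):

SHOW VARIABLES LIKE 'binlog_format';     -- ROW, STATEMENT or MIXED
SET GLOBAL binlog_format = 'ROW';        -- takes effect for new sessions only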

Binlog file structure

The binlog file records the various modification operations performed on the database; the data structure used to describe a modification operation is the Log event. Different log events correspond to different modification operations; the more commonly used log events are the Query event, Row event, Xid event and so on. The content of a binlog file is a collection of log events.


Binlog write mechanism

According to the recording mode and operation to trigger event events to generate log event (event trigger execution mechanism)

A log event write buffer is generated during the execution of a transaction, and each transaction thread has a buffer Log Event stored in a binlog_cache_mngr data structure, in which there are two buffers, one is stmt_cache, which is used to store information that does not support transactions, and the other is trx_cache, which is used to store information that supports transactions.

The transaction writes the resulting log event to an external binlog file during the commit phase.

Different transactions write log event to the binlog file in a serial manner, so the log event information contained in one transaction is contiguous in the binlog file, and the log event of other transactions is not inserted in the middle.


Binlog file operation

Binlog status view

show variables like 'log_bin'; // check whether Binlog is enabled

Enable the Binlog function

set global log_bin = mysqllogbin;
ERROR 1238 (HY000): Variable 'log_bin' is a read only variable

You need to modify the my.cnf or my.ini configuration file, add log_bin=mysql_bin_log under [mysqld], and restart the MySQL service. For example:

# log-bin=ON
# log-bin-basename=mysqlbinlog
binlog-format=ROW
log-bin=mysqlbinlog

Use the show binlog events command

show binary logs; // equivalent to show master logs;
show master status;
show binlog events;
show binlog events in 'mysqlbinlog.000001';

Use the mysqlbinlog command

mysqlbinlog "file name"
mysqlbinlog "file name" > "test.sql"

Using binlog to recover data

// restore by the specified time range
mysqlbinlog --start-datetime="2020-04-25 18:00:00" --stop-datetime="2020-04-26 00:00:00" mysqlbinlog.000002 | mysql -uroot -p1234
// restore by event position number
mysqlbinlog --start-position=154 --stop-position=957 mysqlbinlog.000002 | mysql -uroot -p1234

mysqldump: back up all database data on a regular basis; mysqlbinlog can then perform incremental backup and restore operations.

Delete Binlog files

purge binary logs to 'mysqlbinlog.000001'; // delete logs before the specified file
purge binary logs before '2020-04-28 00:00:00'; // delete logs before the specified time
reset master; // erase all binlog files

Automatic cleanup can be enabled with the expire_logs_days parameter. The default value 0 means it is not enabled; a value of 1 means binlog files are automatically deleted after 1 day.

The difference between Redo Log and Binlog

Redo Log belongs to the InnoDB engine function, while Binlog belongs to the MySQL Server built-in function and is recorded as a binary file.

Redo Log is a physical log, which records the update status content of the data page. Binlog is a logical log that records the update process.

Redo Log log is a circular write, the log space size is fixed, Binlog is an additional write, write one after the next, will not be overwritten.

Redo Log is used for automatic recovery of transaction data after abnormal server downtime, and Binlog can be used as master-slave replication and data recovery. Binlog does not have automatic crash-safe capability.

At this point, I believe you have a deeper understanding of the principles of MySQL architecture; you might as well try things out in practice. Follow us and keep learning!
