What are the uses of MySQL statements

2025-02-23 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)05/31 Report--

This article introduces how MySQL statements are executed and the machinery behind them. Many people have questions about how MySQL works internally, so the following notes collect simple, practical explanations of the main concepts. I hope they help answer those questions. Let's get started!

1. SQL statement execution process

Broadly speaking, MySQL can be divided into two parts: the Server layer and the storage engine layer.

Server layer:

Connector: after the TCP handshake with the server, the connector verifies the login user's identity. Permissions are read at connection time: if user A has created a connection and an administrator then modifies A's privileges, the existing connection keeps its old privileges; A must log in again for the change to take effect.

Query cache: the place where query results used to be cached. It was removed in MySQL 8.0, because cache entries expired so frequently that the feature cost more than it saved.

Parser: checks whether the SQL statement you enter satisfies MySQL's grammar rules.

Optimizer: when several execution strategies can produce the same result, the system automatically selects the best one to execute.

Executor: checks whether you have permission on the table and submits the final task to the storage engine.

Storage engine layer

Responsible for data storage and retrieval. Its architecture is plug-in based, supporting multiple storage engines such as InnoDB, MyISAM, and Memory. The most commonly used storage engine today is InnoDB, which has been the default since MySQL 5.5.5.

SQL execution order

2. BinLog, RedoLog, UndoLog

BinLog

BinLog is a binary log that records all changes to table structure (such as CREATE and ALTER TABLE) and to table data (INSERT, UPDATE, DELETE). BinLog files are used for master-slave synchronization. BinLog has three logging modes.

STATEMENT mode

Content: binlog records only the SQL statements that may cause data changes.

Advantages: because no actual row data is recorded, log volume and IO consumption are very low, and performance is optimal.

Disadvantages: some operations are non-deterministic. For example, the uuid() function randomly generates a unique identifier; when the binlog is replayed, the value this call generates will differ from the original data, which can cause unpredictable inconsistencies.

ROW mode

Content: in this mode, binlog records the source data and the modified target data for each operation (this is the format tools such as StreamSets require).

Advantages: replay is exact, which guarantees data safety and reliability, and replication and data recovery can run concurrently.

Disadvantages: the binlog volume becomes very large, and for operations that modify many rows or long fields the performance cost is serious. Special tooling is also needed to read the log contents.

MIXED mode

Content: a mixture of the STATEMENT and ROW modes above.

Details: for most operations STATEMENT is used to record the binlog; ROW is used only in the following cases: the table's storage engine is NDB, a non-deterministic function such as uuid() is used, an INSERT DELAYED statement is used, or a temporary table is used.
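The uuid() hazard described above can be sketched in a few lines of Python. This is a toy model, not MySQL internals: fake_uuid stands in for the non-deterministic function, and the "binlog event" is just a dict.

```python
import itertools

# Toy simulation of why STATEMENT-format binlog can diverge with
# non-deterministic functions, while ROW format cannot.

_counter = itertools.count()

def fake_uuid():
    # Deterministic stand-in for uuid(): returns a new value on every call.
    return f"uuid-{next(_counter)}"

def execute_insert(db, value_fn):
    """Run an INSERT and return both possible binlog events for it."""
    value = value_fn()
    db.append(value)
    return {"statement": value_fn,   # STATEMENT mode logs the SQL to re-run
            "row": value}            # ROW mode logs the value actually written

def replay(db, event, mode):
    if mode == "STATEMENT":
        db.append(event["statement"]())  # re-executes the uuid()-like call
    else:                                # ROW
        db.append(event["row"])          # copies the concrete value

master, replica_stmt, replica_row = [], [], []
event = execute_insert(master, fake_uuid)

replay(replica_stmt, event, "STATEMENT")
replay(replica_row, event, "ROW")

print(master == replica_row)    # True: ROW ships the real data
print(master == replica_stmt)   # False: the value was regenerated on replay
```

This is exactly why MIXED mode falls back to ROW whenever the statement contains a non-deterministic function.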

Master-slave synchronization process:

1. The master node must enable binary logging, recording every event that modifies database data.

2. The slave node starts a thread (the I/O thread) that acts as a MySQL client and requests events from the master's binary log files over the MySQL protocol.

3. The master node starts a thread (the dump thread) that checks the events in its own binary log against the position the slave requested. If no position parameter is given, the master sends events one by one, starting from the first event of the first log file.

4. The slave node receives the events sent by the master and appends them to its relay log file. It also records which position, in which of the master's binary log files, it has requested up to (the master may have multiple binary log files).

5. The slave node starts another thread (the SQL thread) that reads events from the relay log and re-executes them locally.
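The five steps above can be sketched as a toy pipeline. Class and method names here are illustrative, not MySQL code: two lists stand in for the binlog and relay log, and plain method calls stand in for the I/O and SQL threads.

```python
from collections import deque

# Minimal sketch: master binlog -> I/O thread copies events into the
# relay log -> SQL thread replays them on the slave.

class Master:
    def __init__(self):
        self.data = {}
        self.binlog = []             # step 1: events recorded on write
    def execute(self, key, value):
        self.data[key] = value
        self.binlog.append(("set", key, value))

class Slave:
    def __init__(self):
        self.data = {}
        self.relay_log = deque()
        self.pos = 0                 # position already fetched from master
    def io_thread(self, master):     # steps 2-4: request and store events
        for event in master.binlog[self.pos:]:
            self.relay_log.append(event)
        self.pos = len(master.binlog)
    def sql_thread(self):            # step 5: replay relay-log events
        while self.relay_log:
            op, key, value = self.relay_log.popleft()
            self.data[key] = value

master, slave = Master(), Slave()
master.execute("a", 1)
master.execute("b", 2)
slave.io_thread(master)
slave.sql_thread()
print(slave.data)   # {'a': 1, 'b': 2}
```

Because the fetch and the replay are separate threads reading from the relay log, the slave can pull new events while still applying old ones.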

MySQL's default replication mode is asynchronous, and it supports parallel replication. Because the master returns to the client after writing its own log, without waiting for the slave, a problem arises: if the master crashes before the slave has received the log, those transactions are lost once the slave is promoted to master. This gives rise to two concepts.

Full synchronous replication

After the master writes the binlog, the log is forcibly synchronized to the slaves, and the call returns to the client only after all slaves have executed it. Obviously this seriously hurts performance.

Semi-synchronous replication

The logic of semi-synchronous replication: after the log is successfully written to a slave, the slave returns an ACK to the master; once the master has received confirmation from at least one slave, the write is considered complete.

Related topics extend from here: master/slave delay caused by mismatched master/slave hardware, large transactions on the master, overloaded slaves, network jitter, and so on. How do you avoid it? During a master/standby switch, do you follow the reliability-first principle or the availability-first principle? How do you judge that the master has crashed? How do you avoid replication loops when two nodes are each other's master and standby? How do you recover correctly after an accidental delete? It feels like this is drifting ever deeper into DBA territory.

RedoLog

You can first understand it through the following demo:

Think of a restaurant keeping accounts: a tab can be written in the ledger or on a chalkboard. When someone buys on credit or pays off a debt, there are generally two ways to record it:

1. Open the ledger directly and add to or deduct from this customer's account.

2. Write the transaction on the chalkboard first, and reconcile it against the ledger after closing time.

When business is busy, the owner chooses the latter, because the former is too much trouble: find the customer's running total in the dense records, pull out the abacus, compute, and finally write the result back into the ledger.

Similarly in MySQL, if every update had to be written to disk immediately, meaning the disk must first find the corresponding record and then update it, the IO cost and lookup cost of the whole process would be very high. The chalkboard-plus-ledger scheme is exactly MySQL's Write-Ahead Logging (WAL): write the log first, write the disk later. Here the ledger is the data file on disk, and the chalkboard is the RedoLog.

1. When a record is updated, the InnoDB engine first writes the change to the RedoLog (the chalkboard) and updates memory; later, when the system is idle, InnoDB flushes the change to disk.

2. If there are more updates than the RedoLog can hold, part of the RedoLog must first be applied to disk, and that part of the RedoLog is then erased. The RedoLog behaves like a circular buffer.

RedoLog has a write pos and a checkpoint.

Write pos: the position of the current record; it moves forward as writes happen, and after reaching the end of file 3 it wraps around to the beginning of file 0.

Checkpoint: the position up to which records have been erased; it also moves forward and wraps around. Before a record is erased, it must be applied to the data file.

The space between write pos and checkpoint is the blank part of the chalkboard, available for recording new operations. If write pos catches up with checkpoint, the chalkboard is full and no new updates can run; MySQL must stop, apply some records to disk, and push checkpoint forward.
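The write pos / checkpoint ring described above can be sketched with a few lines of Python. The capacity and byte counts are made up for illustration; positions are tracked as ever-growing logical offsets, which is a common way to model a ring.

```python
# Sketch of the redo-log ring: the log files are treated as one circular
# buffer; write_pos advances on writes, checkpoint advances as changes
# are applied to the data files.

class RedoRing:
    def __init__(self, capacity):
        self.capacity = capacity
        self.write_pos = 0       # next byte to write (logical, ever-growing)
        self.checkpoint = 0      # everything before this is applied to disk
    def free(self):
        return self.capacity - (self.write_pos - self.checkpoint)
    def write(self, nbytes):
        if nbytes > self.free():
            return False         # ring full: must advance checkpoint first
        self.write_pos += nbytes
        return True
    def flush(self, nbytes):     # apply some records, then push checkpoint
        self.checkpoint = min(self.checkpoint + nbytes, self.write_pos)

ring = RedoRing(capacity=100)
assert ring.write(80)
assert not ring.write(30)   # write_pos would catch up with checkpoint
ring.flush(50)              # erase some records, advance checkpoint
assert ring.write(30)       # space reclaimed, updates can proceed
```

The failed second write is exactly the "chalkboard full" stall: the only way out is flushing, which is why a redo log sized too small causes periodic update freezes.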

With the redo log, InnoDB can guarantee that even if the database restarts abnormally, previously committed records are not lost. This property is called crash-safe.

Redolog two-phase commit: its purpose is to keep the binlog and redolog logically consistent. The commit process is roughly:

1. prepare stage -> 2. write binlog -> 3. commit

If a crash happens before step 2: on restart, recovery finds no commit and no binlog, so the transaction is rolled back. A backup restored from binlog also lacks the transaction. Consistent.

If a crash happens before step 3: on restart, recovery finds no commit, but the prepare record and a complete binlog both exist, so the transaction is automatically committed. A backup restored from binlog contains the transaction. Consistent.
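The two crash cases reduce to one small recovery rule, sketched here under the simplifying assumption that each transaction leaves a redo-log state ('prepare' or 'commit') and a flag saying whether its binlog write completed.

```python
# Recovery rule from the two crash cases above: a prepared redo record is
# committed on restart only if its binlog write completed.

def recover(redo_state, binlog_written):
    """redo_state: 'prepare' or 'commit'. Returns the recovery action."""
    if redo_state == "commit":
        return "keep"            # already committed before the crash
    if redo_state == "prepare" and binlog_written:
        return "commit"          # crash after step 2: binlog is complete
    return "rollback"            # crash before step 2: no binlog, undo it

assert recover("prepare", binlog_written=False) == "rollback"
assert recover("prepare", binlog_written=True) == "commit"
assert recover("commit", binlog_written=True) == "keep"
```

Either way, the recovered database agrees with what the binlog says, which is the whole point of the two-phase protocol.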

The difference between binlog and redolog:

Redo log is specific to the InnoDB engine; binlog is implemented at the Server layer of MySQL and can be used by all engines.

Redo log is a physical log that records what change was made on which data page; binlog is a logical log that records the original logic of the statement, for example "add 1 to the c field of the row with ID=2".

Redo log is written in a loop, reusing a fixed amount of space once it is used up; binlog is append-only: when a binlog file reaches a certain size it switches to the next file and never overwrites earlier logs.

UndoLog

UndoLog is generally a logical log, mainly of two types:

Insert undo log

Generated when a transaction inserts a new record; it is needed only if the transaction rolls back, and can be discarded immediately after the transaction commits.

Update undo log

Generated when a transaction updates or deletes a record; it is needed not only for transaction rollback but also for snapshot reads, so it cannot be deleted casually. Only when no snapshot read or transaction rollback still involves the log will it be cleaned up by the purge thread.

3. Index in MySQL

Common models of indexes are hash tables, ordered arrays, and search trees.

Hash table: a structure that stores data as key-value pairs; suitable only for equality queries, not range queries.

Ordered array: suitable only for static storage, because insertion requires shifting elements; compare ArrayList in Java.

Search tree: stores data as a tree; in practice not a binary tree but an N-ary tree (the B+ tree), which is widely used in the storage engine layer.
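The equality-versus-range distinction can be illustrated with Python's built-in dict and the bisect module. This is a toy index over a handful of keys, not an engine implementation.

```python
import bisect

# Toy index over (key -> row) pairs: a hash table handles only equality,
# while an ordered structure also handles ranges.

rows = {1: "a", 3: "b", 5: "c", 9: "d"}

hash_index = dict(rows)        # O(1) equality lookup only
sorted_keys = sorted(rows)     # ordered-array index

def range_query(lo, hi):
    """All keys in [lo, hi] via two binary searches plus a short scan."""
    i = bisect.bisect_left(sorted_keys, lo)
    j = bisect.bisect_right(sorted_keys, hi)
    return sorted_keys[i:j]

print(hash_index[5])        # 'c'  -- equality is easy for both structures
print(range_query(2, 8))    # [3, 5]  -- ranges need ordering
```

A B+ tree gives the same ordered behavior as the sorted array while also supporting cheap inserts, which is why it wins in the storage engine.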

The advantages of the B+ tree over the B tree are:

Non-leaf nodes of the B+ tree store only keys, so each node holds more of them; the tree is therefore shorter ("fatter") and needs fewer IOs than a B tree.

B+ tree leaf nodes are linked to their neighbors, which makes range queries convenient; and since all results live in the leaves, query cost is stable.

The B+ tree is also better for scanning all the data, avoiding the backtracking traversal a B tree requires.

Advantages of the index:

1. A unique index can guarantee the uniqueness of each row of data.

2. Improve the query speed

3. Speed up joins between tables

4. Significantly reduce the time of grouping and sorting in the query

5. By using an index, the query optimizer can choose better plans during query processing, improving system performance.

Disadvantages of the index:

1. Both creation and maintenance take time

2. When creating an index, you need to lock the table, which may affect other data operations at the same time.

3. Indexes need disk space for storage, and that space grows quickly.

4. When data in the table is inserted, updated, or deleted, index maintenance is triggered as well; maintaining the index takes time and can degrade the performance of data operations.

Index design principles. Avoid:

1. The more indexes, the better. There are too many indexes and it takes time and space to maintain them.

2. For frequently updated data, it is not suitable to build an index.

3. There is no need to establish an index for tables with small amount of data.

Prefer:

1. Generate indexes on columns with a low duplication rate. With less duplicate data the index tree is more selective; the higher the cardinality, the better.

2. For columns whose values are unique, create a unique index, guaranteeing data correctness at the database level.

3. Generate indexes on columns frequently used in GROUP BY and ORDER BY; this greatly improves grouping and sorting efficiency.

4. Generate indexes on fields frequently used in query conditions; queries through an index are faster.

Scenarios where an index fails

1. Fuzzy search: a leading or double wildcard defeats the index, such as '%a' and '%a%'. A trailing wildcard can still use the index, such as 'a%'.

2. Implicit type conversion: for example, select * from t where name = xxx, where name is a string column but the value has no quotation marks, so MySQL converts implicitly and the index is not used.

3. When the statement contains OR: for example, select * from t where name='sw' or age=14 (if age is not indexed, the whole statement cannot use the index on name).

4. Queries that do not match the leftmost prefix of a composite index: for a composite index on (a, b, c), a condition on only c, only b, or only b and c cannot use the index.
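The leftmost-prefix rule in point 4 can be sketched as a small checker. This is a simplification: it only looks at which columns appear in the WHERE clause, ignoring subtleties such as range conditions stopping index use for later columns.

```python
# Sketch of the leftmost-prefix rule for a composite index (a, b, c):
# the index is usable only for an unbroken prefix of its columns.

def usable_prefix(index_cols, where_cols):
    """Return the index columns the query can actually use."""
    used = []
    for col in index_cols:
        if col in where_cols:
            used.append(col)
        else:
            break               # prefix broken: later columns can't be used
    return used

idx = ("a", "b", "c")
assert usable_prefix(idx, {"a", "b", "c"}) == ["a", "b", "c"]
assert usable_prefix(idx, {"a", "c"}) == ["a"]       # c alone is unusable
assert usable_prefix(idx, {"b", "c"}) == []          # no leftmost match
```

The empty result for {b, c} is exactly the failure case above: without a condition on a, the index on (a, b, c) cannot be entered at all.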

Knowledge points about the index:

Primary key index: its leaf nodes store the entire row of data. In InnoDB, the primary key index is also known as the clustered index (clustered index). Auto-increment primary keys are not guaranteed to be gapless: unique key conflicts, transaction rollbacks and so on can cause discontinuities.

Unique index: an index on a column that forbids duplicate values but allows null values (NULL).

Query performance of ordinary vs unique indexes: InnoDB reads and writes data by page (16KB per page by default), so the two kinds of index perform almost identically for queries.

Change buffer: speeds up the update process for ordinary indexes. If the target page is not in memory, an update to an ordinary index can be buffered and applied later; an update to a unique index must first read the page into memory to verify it does not violate uniqueness, so it cannot use the change buffer. Prefer ordinary indexes where possible.

Non-primary-key index: its leaf nodes store the value of the primary key. In InnoDB, a non-primary-key index is also called a secondary index (secondary index).

Back to the table (table lookup): first locate the row through a secondary index, then use the row's primary key id to fetch the columns the index does not provide. In other words, a query via a non-primary-key index has to scan one extra index tree.

Covering index: if an index contains (covers) the values of all the fields a query needs, we call it a covering index, and no table lookup is required.

Composite index: compared to a single-column index, a composite index is built over a combination of multiple columns, up to 16 at a time.

Leftmost prefix principle: a composite index is built over multiple fields in order ((a,b,c) and (a,c,b) are two completely different indexes). Taking the composite index (a,b,c) as an example, building it is equivalent to having three indexes: a, (a,b) and (a,b,c). Note that a composite index is physically one index, not several, but its effect is equivalent to multiple indexes.

Index condition pushdown: introduced in MySQL 5.6, this optimization evaluates conditions on indexed columns during the index traversal itself, filtering out records that do not qualify and reducing the number of table lookups.
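Index condition pushdown can be modeled with a toy counter of table lookups. The data, column names, and counts here are illustrative: a composite index on (name, age) lets the age predicate be checked inside the index, saving lookups for rows that fail it.

```python
# Toy model of index condition pushdown (ICP) with a composite index on
# (name, age): entries are (name, age, row_id).

index = [("zhang", 10, 1), ("zhang", 20, 2), ("zhang", 30, 3)]
table = {1: "row1", 2: "row2", 3: "row3"}

def query(name, min_age, pushdown):
    lookups, results = 0, []
    for n, age, rid in index:
        if n != name:
            continue
        if pushdown and age < min_age:
            continue                 # filtered in the index, no table access
        lookups += 1                 # fetch the full row from the table
        if age >= min_age:
            results.append(table[rid])
    return lookups, results

assert query("zhang", 25, pushdown=False) == (3, ["row3"])  # 3 lookups
assert query("zhang", 25, pushdown=True) == (1, ["row3"])   # 1 lookup
```

Same result either way; the pushdown only changes how many times the table has to be touched, which is exactly what the optimization buys.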

Index maintenance: the B+ tree involves page splitting and page merging in order to maintain index order. Page space utilization should be considered when adding or deleting data.

Auto-increment primary key: it is generally recommended to use an auto-increment primary key unrelated to the business, since inserts are appended at the end and do not trigger leaf-node splits.

Delayed association: use a covering-index query to return only the required primary keys, then join back to the original table on those keys to fetch the needed data.
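Delayed association can be sketched over in-memory structures. The list and dict here are stand-ins for a covering index and the clustered index; the slice stands in for a LIMIT on the slim index.

```python
# Sketch of delayed association: paginate on a covering index first,
# keeping only primary keys, then fetch the wide rows by key.

index_on_score = [(95, 3), (90, 1), (88, 2)]   # (score, id): covering index
table = {1: {"name": "a"}, 2: {"name": "b"}, 3: {"name": "c"}}

# Step 1: take the page from the slim index (cheap), keys only.
ids = [rid for _, rid in index_on_score[:2]]   # like LIMIT 2 on the index

# Step 2: fetch the full rows only for those keys.
rows = [table[i] for i in ids]
print(rows)   # [{'name': 'c'}, {'name': 'a'}]
```

The saving comes from step 1 never touching the wide rows, so deep pagination does not drag full rows through the sort.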

InnoDB storage: the *.frm file is the table definition, describing what the table is. The *.ibd file stores the table's indexes and data; that is, all of the table's index trees and all row record data live in this file.

MyISAM storage: the *.frm file is the table definition. The *.MYD file stores all the row data of the table. The *.MYI file stores the data related to the table's indexes. Under the MyISAM engine, table data and index data are stored separately.

MyISAM query: under MyISAM, both the primary key index and secondary indexes are non-clustered. Whichever index the query uses, the leaf node yields the address of the target data, and the row is then read from the data file at that address.

PS:InnoDB supports clustered indexes, while MyISAM does not support clustered indexes.

4. SQL transaction isolation level

Four characteristics of ACID

Atomicity: multiple operations placed in a single transaction either all succeed or all fail.

Consistency: the transaction takes the data from one valid state to another; data never appears or disappears out of thin air.

Isolation (Isolation, also known as independence): multiple transactions do not interfere with each other; even when they run concurrently, they behave as two things that do not intersect. In implementation, complete isolation is not always required, and partial interference is sometimes acceptable, which is why MySQL supports four transaction isolation levels.

Durability: once an operation completes, its result persists; the operation is made permanent through log records.

PS: the difference between the C in ACID and the C in CAP theorem

The C in ACID emphasizes that, within a single database, data remains complete and correct: it never appears or disappears out of thin air. The C in CAP theory refers to the read-write consistency of multiple replicas of the same data.

Data problems that may occur in transaction operations

1. Dirty read: transaction B changes data but has not committed; transaction A reads and uses it. If B then rolls back, A has acted on wrong data.

2. Non-repeatable read (non-repeatable read): the key word is modification: reading the same data twice under the same conditions within one transaction yields different values. It can be prevented by locking only the matching records.

3. Phantom read: transaction A sets the status field of all records in a table to "processed" without committing; transaction B inserts a new "unprocessed" record and commits; A then queries again and finds an unprocessed record: a phantom. Phantom reads refer only to newly inserted rows. Phantoms cause both semantic problems and data-consistency problems.

4. Under the repeatable-read (RR) isolation level, ordinary queries are snapshot reads and do not see rows inserted by other transactions, so phantoms appear only under current reads. This problem is resolved with gap locks.

Before discussing isolation levels, know that the stricter the isolation, the lower the efficiency, so most of the time we look for a balance between the two. The SQL-standard transaction isolation levels are as follows:

Going from the loosest level to the strictest, the system's parallel performance decreases while its safety improves.

Read uncommitted: changes made by other transactions can be read in my transaction even before they commit.

Read committed (Oracle's default): changes made by other transactions can be read in my transaction only after they commit.

Repeatable read (MySQL's default): even changes committed by others during my transaction are not visible to it, guaranteeing that repeated reads return the same result.

Serializable: no one else can change the data until my transaction commits.

Standards vs implementation: the above are the standard transaction levels, but each database implements them differently. For example, MySQL InnoDB defaults to the RR level yet does not exhibit phantom reads: when transaction A updates a field of all records, A acquires locks covering every record and gap (effectively locking the whole table). Because A has not committed, the locks are not released, so when transaction B inserts a new record into the table, the insert blocks waiting for them. B can proceed only after A commits and releases the locks. So it can be said that MySQL's RR-level isolation effectively solves dirty reads, non-repeatable reads and phantom reads.

5. Locks in MySQL

Both concurrent programming in Java and concurrent database operations involve locks. Two design philosophies apply: pessimistic locking and optimistic locking.

Pessimistic lock:

Advantages: suitable for write-heavy, read-light concurrency. It cannot sustain very high performance, but it guarantees data safety in situations where optimistic locking cannot perform well.

Disadvantages: locking adds system overhead; it guarantees data safety but lowers throughput, so it is not suitable for read-heavy, write-light scenarios.

Optimistic lock:

Advantages: in read-heavy, write-light concurrency, it avoids the cost of database locks and improves DAO-layer response. Many ORM tools already ship optimistic-lock implementations, so it does not always need to be hand-rolled.

Disadvantages: in write-heavy concurrency, that is, under fierce competition for writes, CAS retries happen many times, the conflict rate is high, and the overhead can exceed pessimistic locking.

Implementation: optimistic locking at the database level follows the CAS idea and can be implemented with a data version number or a timestamp.
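The version-number scheme can be sketched over an in-memory row. This mimics the SQL pattern UPDATE t SET value=?, version=version+1 WHERE id=? AND version=?; it is a sketch of the idea, not a driver-level implementation.

```python
# Version-number optimistic locking: the update succeeds only if the
# version read earlier is still the current one.

row = {"value": 100, "version": 1}

def optimistic_update(row, new_value, expected_version):
    if row["version"] != expected_version:
        return False                 # someone else committed first: retry
    row["value"] = new_value
    row["version"] += 1
    return True

v = row["version"]                   # both clients read version 1
assert optimistic_update(row, 150, v)        # first writer wins
assert not optimistic_update(row, 200, v)    # second writer must re-read
assert row == {"value": 150, "version": 2}
```

The failed second call is the CAS retry path: the caller re-reads the row, recomputes, and tries again with the new version, which is exactly the overhead that grows under write-heavy contention.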

There are three main database concurrency scenarios:

Read-read: no problems, no concurrency control needed.

Read-write: isolation problems; may encounter dirty reads, phantom reads, non-repeatable reads.

Write-write: may lose updates, such as the first and second types of lost update.

There are two types of update loss issues:

First type of lost update: transaction A's rollback overwrites the result transaction B had already committed. Second type of lost update: transaction A's commit overwrites the result transaction B had already committed.

In order to implement the idea of lock reasonably, various locks are introduced into MySQL:

Lock classification

MySQL supports three levels of locking, which are

Table level locking

The coarsest lock granularity in MySQL; both MyISAM and InnoDB support table-level locking.

Page level locking

A lock whose granularity sits between row-level and table-level. Table-level locks are fast but conflict-prone; row-level locks conflict less but are slower. Page-level locking is the compromise: it locks an adjacent group of records at a time.

Row level locking

The finest-grained lock in MySQL: locks are applied only to the rows being operated on. Row-level locks greatly reduce conflicts in database operations. The granularity is smallest, but the overhead is largest, and row-level locks are not automatically better than table-level locks: the finer the granularity, the higher the cost. A table-level lock simply locks at the head of the table, whereas a row-level lock must scan to find the matching rows before locking them, which is relatively expensive. Table locks and row locks each have their advantages.

Locks in MyISAM

Although MySQL supports table, page, and row locking, the MyISAM storage engine supports only table locks. MyISAM's locking cost is therefore low, but so is its concurrency for data operations. However, if all writes are appended at the tail of the table, it can still support some degree of concurrent reads alongside writes.

From the locks MyISAM supports, it is clear that MyISAM handles concurrent reads but not general read-write or write-write concurrency, so it suits read-heavy, write-light applications and is rarely used in general engineering today.

Locks in InnoDB

InnoDB supports many kinds of locks, as follows:

Shared and exclusive locks (Shared and Exclusive Locks)

Intention lock (Intention Locks)

Record lock (Record Locks)

Gap lock (Gap Locks)

Next-key lock (Next-Key Locks)

Insert intention lock (Insert Intention Locks)

Primary key self-increment lock (AUTO-INC Locks)

Spatial index assertion lock (Predicate Locks for Spatial Indexes)

For example, the shared and exclusive locks among the row locks. LOCK IN SHARE MODE takes a shared read lock:

It ensures that the data you read is not being modified by other transactions, that is, you read the latest value and no one else may modify it. But you may not be able to modify the data yourself, because other transactions may hold the same LOCK IN SHARE MODE lock on it. Failing to commit or roll back promptly can also leave many transactions waiting.

FOR UPDATE takes an exclusive write lock:

To be sure the data you read is the latest and that only you are allowed to modify it, use FOR UPDATE, which is lock-wise equivalent to an UPDATE statement. In busy periods, a transaction that does not commit or roll back promptly may force other transactions to wait a long time, hurting the concurrency of the database.

Gap Lock gap lock:

1. A row lock can only lock existing rows; it cannot prevent inserts into the gaps between records, so MySQL introduced the gap lock (Gap Lock). A gap lock covers an interval open on both sides, and gap locks do not conflict with each other.

2. A gap lock plus the row lock on the following record is called a next-key lock (NextKeyLock); each next-key lock covers a left-open, right-closed interval.

The rules of gap locking (easy to forget after learning):

1. The basic unit of locking is the next-key lock, a left-open, right-closed interval.

2. Only objects actually accessed during the search are locked.

3. For an equality query on a unique index, the next-key lock degenerates to a row lock.

4. For an equality query on an index, when the scan moves right and the last value does not satisfy the equality condition, the next-key lock degenerates to a gap lock.

5. A range query on a unique index accesses (and locks) up to the first value that does not satisfy the condition.

6. MVCC

MVCC:

1. The full name is Multi-Version Concurrency Control. MVCC is a concurrency-control idea that maintains multiple versions of each piece of data so that read and write operations do not conflict.

2. In MySQL InnoDB, the main purpose of MVCC is to improve concurrency: read-write conflicts are handled in a better way, so that even when they occur, reads can proceed without locking and without blocking.

Current read and snapshot read under MySQL InnoDB

Current read

1. Operations such as SELECT ... LOCK IN SHARE MODE (shared lock), SELECT ... FOR UPDATE, UPDATE, INSERT and DELETE (exclusive locks) are all current reads: they read the latest version of the record and lock it, ensuring no other concurrent transaction can modify it.

2. The current reading can be regarded as the concrete function realization of pessimistic lock.

Snapshot read

1. An ordinary SELECT without locks is a snapshot read: a non-blocking read that takes no locks. The premise is that the isolation level is not serializable; under serializable, snapshot reads degenerate to current reads. Snapshot reads exist to improve concurrency, and they are implemented via multi-version concurrency control, that is, MVCC. MVCC can be seen as a variant of row locking that in many cases avoids locking entirely and reduces overhead. Because it is based on multiple versions, a snapshot read may return not the latest version of the data but an earlier historical version.

2. Snapshot reads are MySQL's concrete non-blocking-read implementation of MVCC. MVCC's goal is lock-free handling of read-write conflicts to improve concurrent read-write performance, and the "read" here means snapshot read.

3. In other words, snapshot reads are one of the non-blocking read capabilities MySQL implements for us under the MVCC model.

Since relying on pessimistic locks alone to resolve read-write conflicts is unsatisfying, MVCC was proposed, and two combinations are common:

MVCC + pessimistic locks: MVCC resolves read-write conflicts; pessimistic locks resolve write-write conflicts.

MVCC + optimistic locks: MVCC resolves read-write conflicts; optimistic locks resolve write-write conflicts.

The realization principle of MVCC

The implementation of MVCC rests mainly on four hidden fields in each record, the undo log, and the Consistent Read View.

Four implicit fields:

DB_TRX_ID:

6 bytes: the ID of the transaction that most recently modified (or inserted) the record, that is, the transaction that created it or last changed it.

DB_ROLL_PTR

7 bytes: the rollback pointer, pointing to the previous version of this record (stored in the rollback segment).

DB_ROW_ID

6 bytes: a hidden auto-increment row ID (hidden primary key). If the table has no primary key, InnoDB automatically uses DB_ROW_ID to build the clustered index.

FLAG

A hidden delete flag. An updated or deleted record is not physically removed right away; instead, its delete flag changes.

Each modification a transaction makes to a record adds an entry to the record's undo log, forming a version chain (a linked list): the head of the chain is the newest version, and the tail is the oldest.

Undo log: this was covered above; the part that matters for MVCC is the update undo log. The undo log is, in essence, the chain of old record versions stored in the rollback segment.

Consistent read view (Read View): the read view produced when a transaction performs a snapshot read. At the moment of the snapshot read, a snapshot of the database system is taken, recording and maintaining the IDs of the transactions currently active in the system (every InnoDB transaction has a unique transaction id, requested from InnoDB's transaction system when the transaction begins and strictly increasing in order of request). These IDs are compared with the transaction ID stored in each record version to decide which version to show. That is the general idea.

As a simplification, you can think of MVCC as adding two hidden fields to each row: the ID of the transaction that created the row version and the ID of the transaction that deleted it.
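The simplified model above can be sketched in code. This is a hypothetical illustration, not InnoDB's actual implementation: the visibility rule here compares transaction IDs directly against the reader's ID and its set of active transactions, whereas real InnoDB compares against the read view's low and high water marks. All names (`RowVersion`, `snapshot_read`, etc.) are invented for the sketch.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RowVersion:
    value: str
    trx_id: int                          # analogous to DB_TRX_ID: writer of this version
    prev: Optional["RowVersion"] = None  # analogous to DB_ROLL_PTR: older version in the undo chain

def visible(version: RowVersion, reader_trx_id: int, active_ids: set) -> bool:
    # A version is visible if the reader wrote it itself, or if its writer
    # had already committed when the read view was created.
    if version.trx_id == reader_trx_id:
        return True
    return version.trx_id < reader_trx_id and version.trx_id not in active_ids

def snapshot_read(latest: RowVersion, reader_trx_id: int, active_ids: set) -> str:
    # Walk the version chain from newest to oldest and return the first visible version.
    v = latest
    while v is not None:
        if visible(v, reader_trx_id, active_ids):
            return v.value
        v = v.prev
    raise LookupError("no visible version")

# Version chain: trx 10 wrote "A", then trx 20 (still active) updated it to "B".
v1 = RowVersion("A", trx_id=10)
v2 = RowVersion("B", trx_id=20, prev=v1)
# Reader trx 30 took its read view while trx 20 was active: it sees the old "A".
print(snapshot_read(v2, reader_trx_id=30, active_ids={20}))  # A
```

Note how the same version chain yields different answers for different readers: transaction 20 itself would see its own uncommitted "B", which is exactly the point of keeping old versions instead of blocking readers.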

When insert:

InnoDB saves the current system version number as the version number for each newly inserted row.

When select:

1. InnoDB only looks for row versions earlier than or equal to the current transaction's version (that is, row versions whose system version number is less than or equal to the transaction's system version number); this ensures the transaction reads only rows that already existed before it started, or that the transaction itself inserted or modified.

2. A row's deletion version must be undefined or greater than the current transaction's version number; this ensures the rows the transaction reads had not been deleted before the transaction started.

9. Selecting random rows (order by rand())

Mysql > select word from words order by rand() limit 3;

Running EXPLAIN on a direct order by rand() shows Using temporary and Using filesort in the Extra column, so executing the query is usually expensive. This pattern should therefore be avoided at design time.

Mysql > select count(*) into @C from t;
set @Y1 = floor(@C * rand());
set @Y2 = floor(@C * rand());
set @Y3 = floor(@C * rand());
select * from t limit @Y1, 1;
select * from t limit @Y2, 1;
select * from t limit @Y3, 1;

(Note that LIMIT does not accept user variables directly, so in practice the three LIMIT statements are assembled with CONCAT and executed as prepared statements.)

In this way, temporary tables and sorting are avoided, and the total number of scanned rows = C + (Y1+1) + (Y2+1) + (Y3+1)
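The count-plus-random-offset idea can be demonstrated outside MySQL. The sketch below uses an in-memory SQLite database (where parameters in LIMIT/OFFSET are allowed directly); the table and column names are illustrative, not from the original text.

```python
import random
import sqlite3

# Build a toy `words` table with 1000 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT)")
conn.executemany("INSERT INTO words (word) VALUES (?)",
                 [("word%d" % i,) for i in range(1000)])

def random_rows(conn, n=3):
    """Pick n random rows by computing random offsets from COUNT(*),
    avoiding ORDER BY RAND()'s temporary table and sort."""
    (count,) = conn.execute("SELECT COUNT(*) FROM words").fetchone()
    rows = []
    for _ in range(n):
        offset = random.randrange(count)  # equivalent of floor(C * rand())
        rows.append(conn.execute(
            "SELECT word FROM words LIMIT 1 OFFSET ?", (offset,)).fetchone())
    return rows

print(random_rows(conn))
```

Each LIMIT/OFFSET query still scans up to offset+1 rows, which is where the C + (Y1+1) + (Y2+1) + (Y3+1) scan count in the text comes from.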

Comparison of exist and in:

1. With in, the subquery's table is evaluated first, then the inner and outer tables form a Cartesian product that is filtered by the condition.

2. With exists, the main query runs first, and each of its rows is fed into the subquery to check for existence; the overall result is filtered by that check.

3. When the two tables are about the same size, there is little difference between in and exists; when the inner table is large, exists is more efficient, while when the inner table is small, in is more efficient.

Not exists is faster than not in.
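The two query shapes compared above can be seen side by side. This sketch runs both against an in-memory SQLite database (table and column names are invented for the example); it shows only that the two forms are logically equivalent, since optimizer behavior, and hence the performance trade-off described in the text, differs between engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER);
CREATE TABLE vip_users (id INTEGER PRIMARY KEY);
INSERT INTO vip_users VALUES (1), (3);
INSERT INTO orders (user_id) VALUES (1), (2), (3), (3);
""")

# IN form: subquery result is computed, then outer rows are filtered against it.
in_rows = conn.execute(
    "SELECT id FROM orders WHERE user_id IN (SELECT id FROM vip_users)").fetchall()

# EXISTS form: each outer row probes the subquery for a matching row.
exists_rows = conn.execute(
    "SELECT id FROM orders o WHERE EXISTS "
    "(SELECT 1 FROM vip_users v WHERE v.id = o.user_id)").fetchall()

print(in_rows == exists_rows)  # True
```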

10. MySQL optimization

SQL optimization is mainly divided into four directions: SQL statement and index, table structure, system configuration, hardware.

The general optimization idea is to maximize the use of indexes, avoid full table scans as much as possible, and reduce queries for invalid data:

1. Reduce data access: set a reasonable field type, enable compression, and reduce disk IO through index access.

2. Return less data: return only the required fields and data paging processing, reducing disk IO and network IO.

3. Reduce the number of round trips: batch DML operations, stored procedures, etc. reduce the number of database interactions.

4. Reduce server CPU overhead: reduce database sorting operations and full table queries as much as possible, and reduce CPU memory consumption.

5. Sub-table partitioning: using table partitioning can increase parallel operations and make greater use of CPU resources.

A general example of SQL statement optimization is:

1. Build covering indexes sensibly: they effectively reduce back-to-table lookups.

2. union, or, and in can all hit an index; in is recommended.

3. Negative conditions (!=, not in, not exists, not like, etc.) cannot use an index; rewrite them with in where possible.

4. Performing an operation or applying a function on an indexed column invalidates the index and causes a full table scan.

5. Beware of implicit type conversion: comparing a string column with an integer triggers a CAST on the column and defeats the index, while comparing an int column with a string still uses the index.

6. Avoid fuzzy queries with a leading %; they cannot use an ordinary index.

7. When querying multiple tables, put the small table first and the large table after: MySQL processes the table list after FROM from left to right (the opposite of Oracle), and the first table involves a full table scan.

8. Adjust the condition order in the WHERE clause. MySQL parses the WHERE clause from left to right and top to bottom, so put the conditions that filter out the most data first to shrink the result set as quickly as possible.
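Item 1's covering index can be demonstrated concretely. The sketch below uses SQLite's EXPLAIN QUERY PLAN (table and index names are invented for the example); MySQL's EXPLAIN reports the same situation as "Using index" in the Extra column. When the index contains every column the query needs, the engine answers from the index alone and never touches the table rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")
conn.execute("CREATE INDEX idx_ab ON t (a, b)")  # covers queries touching only a and b
conn.executemany("INSERT INTO t VALUES (?, ?, ?)",
                 [(i, i * 2, i * 3) for i in range(100)])

# The query filters on `a` and returns `b`: both are in idx_ab,
# so no back-to-table lookup is needed.
plan = conn.execute("EXPLAIN QUERY PLAN SELECT b FROM t WHERE a = 5").fetchall()
detail = plan[0][-1]
print(detail)  # e.g. "SEARCH t USING COVERING INDEX idx_ab (a=?)"
```

If the query instead selected `c`, the plan would drop to a plain index search plus row lookup, which is exactly the back-to-table cost the text says covering indexes avoid.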

General idea of SQL tuning:

1. Use the slow query log to locate the sql that needs to be optimized.

2. Use the explain execution plan to view index usage

3. Focus on (generally speaking, the index problem can be found based on these four columns):

Key (check to see if an index is used)

Key_len (check if the index is fully used)

Type (View Index Type)

Extra (view additional information: sorting, temporary table, where condition is false, etc.)

4. Optimize the SQL according to the index problems found in the previous step.

5. Then go back to step 2 and repeat.

Table structure optimization:

1. Prefer TINYINT, SMALLINT, and MEDIUMINT to INT for integer types where the range allows, and add UNSIGNED for non-negative values.

2. The length of VARCHAR allocates only the space that is really needed.

3. Try to use TIMESTAMP instead of DATETIME.

4. Do not have too many fields in a single table. It is recommended that it be less than 20.

5. Avoid NULL fields: they are hard to optimize in queries and take up extra index space; default strings to '' instead.

Read-write separation:

Writes go only to the master server, and reads go only to the slave servers. The corresponding database cluster is typically one master with one slave, or one master with multiple slaves. The business servers send all write operations to the master database and run read operations against the slaves; the master synchronizes data to the slaves to keep them consistent. Read-write splitting is generally implemented in one of two ways: code encapsulation or database middleware.
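The code-encapsulation approach mentioned above can be sketched minimally. This is an illustrative router, not a production library: connections are stubbed as plain strings, and the SQL classification (anything not starting with SELECT is a write) is deliberately rough.

```python
import random

class ReadWriteRouter:
    """Route writes to the primary and spread reads over the replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas

    def route(self, sql: str) -> str:
        # Rough classification: treat every non-SELECT statement as a write.
        if sql.lstrip().lower().startswith("select"):
            return random.choice(self.replicas)  # read: any replica
        return self.primary                      # write: always the primary

router = ReadWriteRouter("primary-db", ["replica-1", "replica-2"])
print(router.route("SELECT * FROM t"))   # one of the replicas
print(router.route("UPDATE t SET a=1"))  # primary-db
```

A real implementation would also route reads inside an open transaction to the primary, since a just-written row may not have replicated to the slaves yet; that replication lag is the main consistency caveat of this architecture.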

Sub-database and sub-table: splitting comes in vertical and horizontal forms; generally split vertically first, then horizontally.

1. Vertical sub-library: the application is divided into several modules, such as order module, user module, commodity module, payment module and so on. In fact, it is the concept of micro-service.

2. Vertical sub-table: generally split out the less commonly used fields and the fields holding larger data.

3. Horizontal sub-table: choose the sharding field according to the scenario. For example, with Taobao's 10 million orders per day, use userId as the sharding field; data queries support orders from the last 6 months, and older orders are archived. The 6-month volume is then 1.8 billion rows; split into 1024 tables, each table stores about 2 million rows, and hash(userId) % 1024 locates the corresponding table.

4. ID generator: distributed IDs must be globally unique across databases and numerically increasing, so that data can be stored and retrieved reliably.
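Item 3's hash-based routing can be sketched as follows. The table-name prefix and the 1024-table modulus mirror the example above; everything else is illustrative. Python's built-in hash() is salted per process, so a stable hash (MD5 here) is used to guarantee the same userId always maps to the same table.

```python
import hashlib

N_TABLES = 1024  # number of horizontal shards, as in the example above

def table_for(user_id: int) -> str:
    """Map a userId to its shard table, e.g. orders_0042."""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return "orders_%04d" % (int(digest, 16) % N_TABLES)

print(table_for(42))                   # always the same table for the same user
print(table_for(42) == table_for(42))  # True
```

A stable, deterministic hash matters operationally: if the mapping changed between application restarts, existing rows would become unreachable.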

At present, the main popular sub-library and sub-table tools are Mycat and sharding-sphere.
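Item 4 above calls for globally unique, increasing distributed IDs. One common approach (not named in the text) is a snowflake-style generator; below is a minimal sketch using the common bit layout (41-bit timestamp, 10-bit worker id, 12-bit sequence) with an arbitrary illustrative epoch.

```python
import threading
import time

class SnowflakeId:
    """Snowflake-style ID: time-ordered, unique across workers."""

    EPOCH_MS = 1_600_000_000_000  # illustrative custom epoch

    def __init__(self, worker_id: int):
        assert 0 <= worker_id < 1024  # 10-bit worker id
        self.worker_id = worker_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000) - self.EPOCH_MS
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF  # 12-bit sequence
                if self.sequence == 0:  # sequence exhausted this millisecond
                    while now <= self.last_ms:  # spin until the next millisecond
                        now = int(time.time() * 1000) - self.EPOCH_MS
            else:
                self.sequence = 0
            self.last_ms = now
            return (now << 22) | (self.worker_id << 12) | self.sequence

gen = SnowflakeId(worker_id=1)
a, b = gen.next_id(), gen.next_id()
print(a < b)  # True: IDs are strictly increasing
```

Because the high bits are a timestamp, these IDs sort by creation time, which keeps B+tree inserts append-heavy, the same reason the text asks for numerically increasing IDs.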

TiDB: an open source distributed database that combines the best features of traditional RDBMS and NoSQL. TiDB is compatible with MySQL, supports unlimited horizontal scaling, and offers strong consistency and high availability. Its goal is to provide a one-stop solution for both OLTP (Online Transactional Processing) and OLAP (Online Analytical Processing) scenarios. TiDB has the following core features:

1. Support MySQL protocol (low development cost).

2. 100% support transactions (data consistency is simple and reliable).

3. Unlimited horizontal scaling (no need to consider sub-database and sub-table) with no service downtime.

4. Supports mutual backup between TiDB and MySQL.

5. Follows JDBC conventions, so the learning cost is low; strongly relational and strongly consistent; no master-slave configuration or sub-database/sub-table to worry about; expands seamlessly and dynamically.

Suitable for:

1. When the MySQL business of the original business encounters a single machine capacity or performance bottleneck, you can consider using TiDB to seamlessly replace MySQL.

2. Under the condition of large amount of data, MySQL complex query is very slow.

3. With a large amount of data, the data grows rapidly, which is close to the limit of stand-alone processing. We do not want to divide databases and tables or use database middleware and other Sharding solutions that are intrusive and constrained to the business.

4. Under a large amount of data, there is a need for high-concurrency real-time writes, real-time queries, and real-time statistical analysis.

5. There are requirements for distributed transactions, 100% strong data consistency across multiple data centers, and high availability with auto-failover.

Not suitable:

1. TiDB is not needed in scenarios that stand-alone MySQL can meet.

2. TiDB is usually not used when there are fewer than 50 million rows of data; it is designed for large-scale data scenarios.

3. If the amount of data in your application is small (all data levels are less than 10 million lines), and there are no requirements such as high availability, strong consistency or multi-data center replication, then it is not suitable to use TiDB.

At this point, the study of "what are the uses of MySQL statements" is over. I hope it resolves your doubts. Combining theory with practice is the best way to learn, so go and try it out! If you want to continue learning more related knowledge, please continue to follow the website; the editor will keep working hard to bring you more practical articles!
