Source: Shulou (Shulou.com), 05/31 report; updated 2025-04-04.
This article explains "Analysis of MySQL Optimization ideas". The explanation is kept simple and clear so that it is easy to follow; let's work through the ideas step by step.
Perspective
So far, database technology has gone through three stages: the manual management stage, the file system stage and the database system stage.
In the early days, before software systems existed, manual bookkeeping and verbal agreements (the manual management stage) were enough to run a real-world business. This form lasted for a long time and was a relatively inefficient solution. Later, as computer technology developed, the file system stage arrived, in which Excel spreadsheets replaced manual bookkeeping and productivity improved to some extent. Then came the database system stage, with software systems that are simple to operate and efficient, and productivity improved again. Concrete real-world problems are abstracted into data, and real-world business is represented by the flow and change of that data. In a software system, data storage generally consists of one relational database combined with several non-relational databases.
The database is tightly coupled to the system's business, so the product manager needs to understand how data is stored and queried when designing a feature. It should be clear at design time what impact the feature will have on the database and whether a new technology stack needs to be introduced. For example, suppose the product manager designs a feature that statistically analyzes and summarizes data across several MySQL tables of a million rows each. Running this directly as a MySQL multi-table query is very likely to produce slow queries and can even bring the MySQL service down. In this case, the solution is either a compromise on the product side or a change of technology stack.
The system architecture and database scheme should match the capabilities of the company's team. In the early stage of a system, simple database optimization combined with spending money on hardware is the most cost-effective solution. Once money alone can no longer rescue the MySQL database, introducing dedicated software services built around the key function becomes the most cost-effective solution. Choosing the right solution when a problem comes up is where you show your value.
When a poor man marries up to a rich girl, the short-term sweetness is no match for real class inequality; the happy ending exists only in the poor boy's fantasies and in Qiong Yao's TV dramas.
How to improve data storage performance at limited cost is the central idea of this article.
Background knowledge
I believe you often come across the following topics in your daily work, so I will only summarize them briefly.
Relational database
A relational database is a data organization made up of two-dimensional tables and the relationships between them. It provides software with transactions, data consistency and data persistence. It is the core storage service of a software system, and it is the kind of database we meet most often in development work and in interviews. For some small outsourced projects, a single MySQL instance is enough to meet all business needs. Although we use it all the time, there is a lot of know-how behind it, which later sections discuss in detail.
Advantages:
Transactions
Persistence
SQL, a relatively universal language
Problems:
High demands on hard disk I/O.
Aggregate queries over large amounts of data are inefficient.
Queries can miss the index.
The leftmost-prefix matching rule of indexes makes them a poor fit for full-text search.
Improper use of transactions can cause lock blocking.
The problems that come with horizontal scaling are hard to deal with.
Non-relational database - NoSQL
As relational data storage software, MySQL has both strengths and weaknesses. When the amount of data in a software system keeps growing and business complexity keeps increasing, we cannot expect to solve every problem by strengthening MySQL alone; other storage software has to be introduced. The various types of NoSQL are used to cope with this continuous growth of data and business complexity.
NoSQL is an optimization of the relational database for different scenarios. Introducing some kind of NoSQL does not mean everything will be fine; the right approach is to fully understand the types of NoSQL on the market and how hard each is to apply, and then choose the right storage software for the right scenario.
Key-value type
In business systems, some tables are queried very frequently while most of the results rarely change. This is the use case for key-value storage software, represented by Memcached and Redis, which is widely used in the caching layer of a system. Compared with Memcached, Redis offers more data structures and supports persistence, which makes it the most widely used key-value NoSQL.
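The caching idea above is commonly implemented as a cache-aside read path. The following is only a minimal sketch under stated assumptions: a plain dict stands in for Redis or Memcached, another dict stands in for a MySQL table, and the names `get_user`, `update_user`, `load_from_db` and the `user:<id>` key format are hypothetical, not taken from the article.

```python
import time

# Hypothetical stand-ins: DB_ROWS plays the role of a MySQL table,
# and `cache` plays the role of Redis/Memcached.
DB_ROWS = {1: {"id": 1, "name": "alice"}, 2: {"id": 2, "name": "bob"}}
cache = {}                    # key -> (expires_at, value)
stats = {"db_reads": 0}       # counts how often we fall through to the "database"


def load_from_db(user_id):
    """Simulated database lookup."""
    stats["db_reads"] += 1
    return DB_ROWS.get(user_id)


def get_user(user_id, ttl_seconds=60.0):
    """Cache-aside read: try the cache first, fall back to the database."""
    key = f"user:{user_id}"
    hit = cache.get(key)
    if hit is not None and hit[0] > time.monotonic():
        return hit[1]
    value = load_from_db(user_id)
    cache[key] = (time.monotonic() + ttl_seconds, value)
    return value


def update_user(user_id, name):
    """Write to the database, then invalidate the cached copy."""
    DB_ROWS[user_id] = {"id": user_id, "name": name}
    cache.pop(f"user:{user_id}", None)
```

Repeated calls to `get_user(1)` reach the simulated database only once until the entry expires or `update_user` invalidates it. Invalidating on write rather than updating the cache in place is one common choice, not the only one.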
Search type
In the full-text search scenario, MySQL's B+ tree index cannot help: a LIKE keyword query such as LIKE '%keyword%' cannot hit the index, so each such query is a full table scan. This is tolerable for a table with tens of thousands of rows, but with more data it produces slow queries. If the business code is poorly written and the LIKE query runs inside a transaction, it can also hold read locks. Elasticsearch, built around the inverted index, fits the full-text search scenario very well. It also handles massive data well, and its documentation and ecosystem are good. Elasticsearch is the representative product of the search type.
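To illustrate why an inverted index suits keyword search, here is a toy comparison. It is a sketch only: the documents are made up, tokenization is a plain whitespace split (real Elasticsearch analyzers do far more), and `like_scan` merely imitates what a full scan has to do.

```python
from collections import defaultdict

# Made-up documents: id -> text.
docs = {
    1: "mysql index optimization",
    2: "redis cache for mysql",
    3: "elasticsearch full text search",
}


def like_scan(term):
    """Imitates LIKE '%term%': every document has to be inspected."""
    return sorted(doc_id for doc_id, text in docs.items() if term in text)


def build_inverted_index(documents):
    """Maps each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.split():
            index[word].add(doc_id)
    return index


inverted = build_inverted_index(docs)


def search(term):
    """Looks the word up directly instead of scanning all documents."""
    return sorted(inverted.get(term, set()))
```

The scan grows with the number of documents, while the inverted lookup goes straight to the posting set for the word.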
Document type
Document NoSQL refers to NoSQL that stores semi-structured data as documents, usually in JSON or XML format, so it has no fixed schema. Because there is no schema, data can be stored and read freely; document NoSQL emerged to solve the problem that the table structure of a relational database is inconvenient to extend. The author has not used it.
Column type
Enterprises of a certain size often need real-time, flexible data summarization. This kind of business is not well suited to pre-computation: even if it can be implemented with pre-computed summaries, the final step of accumulating the summary data becomes slower and slower as the number of summaries grows. Column-oriented NoSQL is the product of this scenario and one of the most representative technologies of the big data era. The most common one is HBase, but HBase is heavyweight and usually needs a whole Hadoop ecosystem to run. The author's company uses Aliyun's AnalyticDB, column storage software compatible with MySQL query statements. Summarization combined with the strong query capability of column storage software is enough to support a variety of real-time, flexible data summarization services.
Case
Taking 2021 as the reference point, most systems start from the following scenario, and I will gradually make adjustments on top of this case.
Hardware upgrades bring diminishing returns over time, but they are the fastest optimization when time and staff are tight. Software optimization pays off more and more as the system grows, but it demands a higher level of technical skill; it is the most cost-effective optimization when time and staff allow. Hardware and software optimization are not mutually exclusive, and when needed the two together can push MySQL toward its performance ceiling.
Hardware optimization - spending money
Stage one
Increase disk I/O capacity and use SSDs where possible (a qualitative improvement)
Increase memory and the query cache space
Increase the number of CPU cores to allow more execution threads
Stage two
Move from self-hosted MySQL to a service provider's MySQL service
Turn on the built-in read/write splitting function
Stage three
Replace the service provider's MySQL service with a cloud-native distributed database
Turn on the built-in read/write splitting function
Turn on the built-in table sharding function
Software optimization - queries - OLTP
OLTP is mainly used to record business events as they happen, such as user behavior: when the behavior occurs, the system records when and where the user did what. One or more rows are inserted, deleted or updated in the database, which requires high real-time performance, strong stability and assurance that each update succeeds promptly. Common business systems all belong to OLTP, and they use databases with transaction support, such as MySQL and Oracle. For OLTP, improving query speed and service stability is the core of optimization.
Slow query
Find SQL with efficiency problems through the slow query log
Directions for troubleshooting a problem SQL statement
The index design is at fault.
The SQL statement is at fault.
The database chose the wrong index.
A single table is too large.
Detailed analysis with EXPLAIN
Check how the SQL statement is executed
Check whether the index is hit (the key column)
MySQL optimizer
When the optimizer selects an index, it refers to the cardinality of the index (Cardinality)
Cardinality is maintained and estimated automatically by MySQL and may not be accurate.
When an index is missed or the wrong index is used, the optimizer is the cause.
ANALYZE TABLE recalculates index statistics, including cardinality
Forcing an index
The FORCE INDEX hint forces the use of an index, but the index then has to be named in business code
Covering index - the ideal index hit
A covering index means that the query uses one and the same index from execution through to returning results (unique, normal, composite index, etc.)
Covering indexes can overlap, and they reduce back-to-table lookups
If the query uses more than one index to fetch its data, it is not a covering-index query
A covering index can be reached by optimizing the SQL statement or by using a composite index
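The difference between a covered and a non-covered query can be observed in a query plan. The sketch below uses SQLite from Python's standard library, not MySQL, purely so that it runs without a server; the `orders` table and its data are made up. In MySQL the comparable signal is `Using index` in the Extra column of EXPLAIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, store_id INT, guide_id INT, note TEXT)"
)
conn.execute("CREATE INDEX store_id_guide_id ON orders (store_id, guide_id)")
conn.executemany(
    "INSERT INTO orders (store_id, guide_id, note) VALUES (?, ?, ?)",
    [(i % 5, i % 7, "x") for i in range(100)],
)


def plan(sql):
    """Returns SQLite's query plan as one string (last column of each plan row)."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


# Only indexed columns are requested: the index alone can answer the query.
covered = plan("SELECT guide_id FROM orders WHERE store_id = 1")
# 'note' is not in the index: matching rows must be read from the table as well.
not_covered = plan("SELECT note FROM orders WHERE store_id = 1")
```

Printing `covered` and `not_covered` side by side shows that only the first plan mentions a covering index.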
COUNT() function
COUNT(non-indexed column) - cannot use a covering index; in theory the slowest
COUNT(indexed column) - can use a covering index, but each value still has to be checked for NULL
COUNT(primary key) - same as above
COUNT(1) - only scans the index tree without parsing row data; faster in theory, but the constant 1 is still checked for NULL
COUNT(*) - specially optimized by MySQL: the row count is taken directly from the index tree.
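The NULL check mentioned in the list is also what makes the variants return different numbers. This sketch shows only the counting semantics, not the speed differences, and it uses SQLite with a made-up table `t` so that it runs anywhere; standard SQL counting rules apply in MySQL as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, nickname TEXT)")
conn.executemany(
    "INSERT INTO t (nickname) VALUES (?)", [("a",), (None,), ("c",), (None,)]
)

# COUNT(column) skips NULL values; COUNT(*) and COUNT(1) count every row.
count_star, count_one, count_pk, count_col = conn.execute(
    "SELECT COUNT(*), COUNT(1), COUNT(id), COUNT(nickname) FROM t"
).fetchone()
```

With four rows of which two have a NULL `nickname`, the first three counts are 4 and `COUNT(nickname)` is 2.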
ORDER BY
With a covering index, the intermediate result set can be skipped and the query results output directly.
The ORDER BY column must be indexed, in the same index as the WHERE condition and the output columns
Minimize extra sorting; specify WHERE conditions
The combination of the WHERE clause and the ORDER BY clause must satisfy the leftmost-prefix rule
Most efficient - a covering index (few scenarios; you rarely get the chance)
Paging query
First try to make the query use a covering index.
Find the ids of the required rows first, then go back to the table to fetch the final result set.
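The "ids first, then back to the table" approach is often written as a deferred join. The sketch below is an assumption-laden illustration: it runs on SQLite so that it is self-contained, and the `article` table, its columns and the page size are made up. The SQL shape carries over to MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article (id INTEGER PRIMARY KEY, created_at INT, body TEXT)"
)
conn.execute("CREATE INDEX idx_created ON article (created_at)")
conn.executemany(
    "INSERT INTO article (created_at, body) VALUES (?, ?)",
    [(1000 + i, f"body {i}") for i in range(500)],
)

# Plain deep paging: the engine may fetch full rows for the skipped entries too.
plain = conn.execute(
    "SELECT id, created_at, body FROM article "
    "ORDER BY created_at LIMIT 10 OFFSET 400"
).fetchall()

# Deferred join: page through the index to collect ids first,
# then go back to the table only for those 10 rows.
deferred = conn.execute(
    "SELECT a.id, a.created_at, a.body FROM article a "
    "JOIN (SELECT id FROM article ORDER BY created_at LIMIT 10 OFFSET 400) p "
    "ON a.id = p.id ORDER BY a.created_at"
).fetchall()
```

Both queries return the same page; the second keeps the expensive offset skipping inside the narrow index.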
Index push-down
KEY store_id_guide_id (store_id,guide_id) USING BTREE
Select * from table where store_id in (1, 2) and guide_id = 3
Before MySQL 5.6, the index is used only for the store_id IN condition, and every matching row is then fetched from the table to check guide_id = 3
From MySQL 5.6 on, if the condition can be evaluated from the index, rows are filtered directly in the index
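The saving from index condition pushdown can be made concrete by counting row fetches. This is a pure-Python simulation, not MySQL internals: the table contents are made up, and the "index" is simply a sorted list of (store_id, guide_id, primary key) tuples.

```python
# Made-up data: 100 rows, with a composite "index" on (store_id, guide_id).
table = {pk: {"id": pk, "store_id": pk % 4, "guide_id": pk % 5} for pk in range(1, 101)}
index = sorted((row["store_id"], row["guide_id"], pk) for pk, row in table.items())


def query_without_pushdown(store_ids, guide_id):
    """Uses the index for store_id only; guide_id is checked after the row fetch."""
    lookups, result = 0, []
    for s, _, pk in index:
        if s in store_ids:
            row = table[pk]          # back-to-table lookup for every candidate
            lookups += 1
            if row["guide_id"] == guide_id:
                result.append(pk)
    return sorted(result), lookups


def query_with_pushdown(store_ids, guide_id):
    """Checks guide_id on the index entry first; fetches only matching rows."""
    lookups, result = 0, []
    for s, g, pk in index:
        if s in store_ids and g == guide_id:
            row = table[pk]          # back-to-table lookup only for real matches
            lookups += 1
            result.append(row["id"])
    return sorted(result), lookups


rows_a, lookups_a = query_without_pushdown({1, 2}, 3)
rows_b, lookups_b = query_with_pushdown({1, 2}, 3)
```

For this data both functions return the same 10 rows, but the first performs 50 row fetches and the second only 10.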
Loose index scan
KEY store_id_guide_id (store_id,guide_id) USING BTREE
Select film_id from table where guide_id = 3
A new feature in MySQL 8.0
Loose index scan can bypass the leftmost-prefix rule and handles queries that omit the leading index column.
It is less efficient than a composite index that matches the query.
Function operation
If a function is applied to an indexed column, the optimizer gives up the index
Such cases include: time functions, string-to-number conversion, character-set conversion.
Optimization: use server-side logic instead of MySQL functions
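One way to move the function out of the query is to compute range bounds in application code, so that the indexed column is compared as-is. The sketch below assumes a hypothetical `orders` table with an indexed `created_at` column; the `%s` placeholders follow the style used by common Python MySQL drivers, and no database connection is made here.

```python
from datetime import date, datetime, timedelta


def day_range(day):
    """Returns the half-open interval [start, end) covering one calendar day."""
    start = datetime(day.year, day.month, day.day)
    return start, start + timedelta(days=1)


def orders_on_day_query(day):
    """Builds a parameterized range query; table and column names are illustrative."""
    start, end = day_range(day)
    # The column is left bare, so an index on created_at stays usable,
    # unlike WHERE DATE(created_at) = '2021-03-15'.
    sql = "SELECT id FROM orders WHERE created_at >= %s AND created_at < %s"
    return sql, (start, end)


sql, params = orders_on_day_query(date(2021, 3, 15))
```

The half-open interval avoids both missing and double-counting rows at midnight.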
A single table is too large.
Upgrade MySQL - different MySQL products can carry different single-table volumes. In my experience so far, querying with an index hit is no problem on Aliyun PolarDB Cluster Edition with 200 million rows in a single table (high priority).
Data settlement - for example, ledger-style records can be settled at a certain point in time to obtain the latest value, and the settled records moved to a backup table (priority).
Hot/cold data separation - data that cannot be settled is classified by query frequency; rarely queried data is moved to another table, and the business layer routes queries to the right table (priority).
Distributed database sharding - enable the table-sharding function of a distributed database, whose components manage inserts and queries after the split (priority)
Sharding in code - split a single table into multiple tables according to certain rules. With most ORM frameworks in PHP and Go you need to modify the ORM after splitting; ORMs in Java have native support. It is recommended to plan for this at the start of the project: the later it is done, the harder it gets (low priority).
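For sharding in code, the "certain rules" usually reduce to a routing function from a shard key to a physical table name. The following is one possible rule, given as a sketch: the hash choice, the `orders_NN` naming scheme and the function names are all assumptions for illustration.

```python
import zlib


def shard_table(base_name, shard_key, shard_count):
    """Maps a shard key to one of shard_count physical tables.

    crc32 is used instead of the built-in hash() because it gives the
    same result across processes and restarts.
    """
    bucket = zlib.crc32(str(shard_key).encode("utf-8")) % shard_count
    return f"{base_name}_{bucket:02d}"


def group_by_shard(base_name, shard_keys, shard_count):
    """Groups keys by target table so that each shard is queried once."""
    groups = {}
    for key in shard_keys:
        groups.setdefault(shard_table(base_name, key, shard_count), []).append(key)
    return groups
```

A call such as `shard_table("orders", user_id, 16)` always returns the same table for the same key. Note that a modulo rule like this makes changing `shard_count` later expensive, which is one reason the article advises deciding early.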
Software optimization - write, update, delete
Lock
By granularity, MySQL locks are divided into global locks, table-level locks and row locks.
Global lock: look it up yourself (Google/Baidu)
Table-level locks are divided into table locks (data locks) and metadata locks
Table lock: look it up yourself (Google/Baidu)
Metadata lock: look it up yourself (Google/Baidu)
Row locks lock rows of data and are divided into shared locks and exclusive locks: look it up yourself (Google/Baidu)
Resolve deadlock
Parameter configuration
Adjust the innodb_lock_wait_timeout parameter
The default is 50 seconds: if the lock has still not been acquired after waiting 50 seconds, the current statement reports an error
If the wait is too long, this parameter can be shortened appropriately.
Active deadlock detection: innodb_deadlock_detect
Enabled by default
When a deadlock is found, the transaction that is cheaper to roll back is rolled back
Do not start a transaction when it is not necessary
Keep queries outside the transaction where possible to reduce the number of locked rows
Avoid long-running transactions and do not make HTTP requests inside a transaction
Proactively view transaction status
SHOW PROCESSLIST;
SELECT * FROM information_schema.INNODB_TRX; -- long transactions
SELECT * FROM information_schema.INNODB_LOCKS; -- view locks
SELECT * FROM information_schema.INNODB_LOCK_WAITS; -- view blocked transactions
(In MySQL 8.0 the last two tables are replaced by performance_schema.data_locks and performance_schema.data_lock_waits.)
Search business
Fewer than 100,000 rows to search - let MySQL carry the load
Upgrade the CPU, I/O and memory of the MySQL host
More than 100,000 rows to search - introduce Elasticsearch
The inverted index of Elasticsearch suits full-text search, but its flexibility in combining data is poor.
Data synchronization
Synchronize to Elasticsearch when business code changes the data
Use Canal to subscribe to the MySQL log and trigger synchronization
Elasticsearch - index
A collection of documents with the same fields, analogous to a table in MySQL
Once a field type is set it cannot be modified; new fields can be added.
Look up the specific methods yourself (Google/Baidu)
Elasticsearch - document
The data documents users store in ES, analogous to rows in MySQL
Consists of metadata and a JSON object
Look up the details of the metadata and the JSON object yourself (Google/Baidu)
Elasticsearch - tokenizer
Look it up yourself (Google/Baidu)
Elasticsearch - inverted index (key point)
Look it up yourself (Google/Baidu)
Elasticsearch - aggregation analysis
Look it up yourself (Google/Baidu)
Statistical services - OLAP
In contrast to the transaction processing scenario of OLTP, OLAP is used for decision analysis of data. It is an offline data warehouse approach used in big data analysis, not a specific technology stack. If your solution embodies the idea of OLAP analysis and processing, then it is OLAP.
Early data warehouse construction mainly meant modeling and aggregating data from business databases such as ERP, CRM and SCM into a data warehouse engine according to the needs of decision analysis. Its main application was reporting, with the aim of supporting the decisions of management and business staff (medium- and long-term strategic decisions). As IT moved toward the Internet and mobile, data sources became more and more abundant. On top of the original business databases there is now unstructured data such as website logs, IoT device data and app event-tracking data, and the volume is several orders of magnitude larger than the earlier structured data.
No matter how the business served by OLAP changes, it always involves the following steps: identify the analysis domain -> synchronize business data to the computation database -> clean and model the data -> synchronize to the data warehouse -> expose the results.
The computation database is dedicated to data cleaning, so that cleaning does not affect the performance of the business database. Cleaning the data in the computation database by business and dimension improves its ease of use and reusability and yields the final real-time detail data, which is sent to the data warehouse; the data warehouse then provides the final decision analysis data.
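The flow described above (copy business data aside, clean it, load it into the warehouse) can be sketched in a few functions. Everything here is hypothetical: the row layout, the function names and the choice to aggregate by day and store are illustration only, and in-memory lists stand in for the three databases.

```python
from collections import defaultdict

# Made-up raw rows as they might be copied from the business database.
business_rows = [
    {"order_id": 1, "store": " A ", "amount": "10.50", "day": "2021-03-01"},
    {"order_id": 2, "store": "a", "amount": "4.50", "day": "2021-03-01"},
    {"order_id": 3, "store": "B", "amount": None, "day": "2021-03-01"},
    {"order_id": 4, "store": "b", "amount": "7.00", "day": "2021-03-02"},
]


def sync_to_computation_db(rows):
    """Step 1: copy rows so cleaning never touches the business database."""
    return [dict(row) for row in rows]


def clean(rows):
    """Step 2: normalize dimensions and drop rows that cannot be analyzed."""
    cleaned = []
    for row in rows:
        if row["amount"] is None:
            continue
        cleaned.append(
            {
                "day": row["day"],
                "store": row["store"].strip().upper(),
                "amount_cents": round(float(row["amount"]) * 100),
            }
        )
    return cleaned


def load_warehouse(detail_rows):
    """Step 3: aggregate the detail rows by (day, store) for reporting."""
    totals = defaultdict(int)
    for row in detail_rows:
        totals[(row["day"], row["store"])] += row["amount_cents"]
    return dict(totals)


warehouse = load_warehouse(clean(sync_to_computation_db(business_rows)))
```

Because each step only depends on the output of the previous one, any step can be swapped for different software without changing the overall shape of the pipeline.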
DEMO scheme
Production plan
Each component in the pipeline can be replaced by other software with the same function. Build it with the software your team is most confident in, and the solution is still OLAP.
Thank you for reading. That concludes "Analysis of MySQL Optimization ideas". I hope this article has given you a deeper understanding of the ideas behind MySQL optimization; how to apply them in a specific case still needs to be verified in practice.