
Optimizing MySQL to Improve the Running Efficiency of PHP

2025-02-25 Update From: SLTechnology News&Howtos


This article explains how optimizing MySQL can improve the running efficiency of PHP applications. The explanations are simple and clear and easy to follow.

Master-slave replication and read-write separation

1. The master records changes in its binary log; the slave copies the master's binary log into its own relay log and replays it against its own data, which is how the master's data gets replicated. Master-slave replication can serve database load balancing, database backup, read-write separation, and similar purposes.

2. Configure the master server: in my.ini/my.cnf, under [mysqld], set log-bin=mysql-bin (enable the binary log) and server-id=102 (unique server ID).

3. Configure the slave server likewise: log-bin=mysql-bin (enable the binary log) and server-id=226 (unique server ID).

4. On the master, authorize the slave: GRANT REPLICATION SLAVE ON *.* TO 'slavename'@'IP' IDENTIFIED BY 'root'.

5. On the slave, run CHANGE MASTER TO MASTER_HOST='masterip', MASTER_USER='masteruser', MASTER_PASSWORD='masterpasswd'.

6. Then run the START SLAVE command to begin replication. Don't forget to restart the server after each configuration change, and use SHOW MASTER STATUS / SHOW SLAVE STATUS on the master/slave to check replication status.

Read-write separation is implemented with MySQL middleware such as mysql_proxy or atlas. Configuring such middleware to split reads and writes between the master and slave servers makes the slaves bear the read load, reducing the burden on the master.
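Put together, the statements above might look like this (the host IPs are placeholders, and the slave logs in with the account granted on the master; the GRANT ... IDENTIFIED BY form follows pre-8.0 MySQL syntax, as used in this article):

```sql
-- master my.ini/my.cnf:          -- slave my.ini/my.cnf:
-- [mysqld]                       -- [mysqld]
-- log-bin=mysql-bin              -- log-bin=mysql-bin
-- server-id=102                  -- server-id=226

-- on the master: create the account the slave replicates with
GRANT REPLICATION SLAVE ON *.* TO 'slavename'@'192.168.0.2' IDENTIFIED BY 'root';

-- on the slave: point it at the master, then start replicating
CHANGE MASTER TO
    MASTER_HOST     = '192.168.0.1',
    MASTER_USER     = 'slavename',
    MASTER_PASSWORD = 'root';
START SLAVE;

-- verify on each side
SHOW MASTER STATUS;
SHOW SLAVE STATUS;
```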

Sharding of the database

When a table in the database holds a very large amount of data, the pressure on indexes and caches becomes severe. Sharding the database, so that the data is stored across multiple database servers or tables, reduces query pressure. There are three approaches: vertical sharding, horizontal sharding, and combined sharding.

Vertical sharding: when there are many tables, closely related tables (those belonging to the same module and often joined together) are split out and placed on different master and slave servers.

Horizontal sharding: when there are not many tables but a table holds a very large amount of data, a hashing algorithm (or similar) can divide the table into several tables placed on different servers to speed up queries. Horizontal sharding differs from table partitioning in where the data physically lives: the shards sit on separate servers.

Combined sharding: more often, both the number of tables and the amount of data per table are very large, so both kinds of splitting are needed, turning the database into a distributed matrix of shards.
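The routing step at the heart of horizontal sharding can be sketched in a few lines of application code (the shard count, naming scheme, and sharding key are illustrative, not prescribed by the article):

```python
# Minimal sketch of horizontal-shard routing: hash the sharding key
# to pick one of N servers/tables. Names here are illustrative.
SHARD_COUNT = 4

def shard_for(user_id: int) -> str:
    """Pick the shard (database/table suffix) that holds this user's rows."""
    return f"user_{user_id % SHARD_COUNT}"

# Every query for a given user_id goes to the same shard, so lookups
# stay single-shard while writes spread across the servers.
print(shard_for(7))  # user_3
print(shard_for(8))  # user_0
```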

Each of these optimization methods could fill an article of its own; the topic is broad and deep. If we understand and remember these approaches, we can select and apply them purposefully when needed and get the most efficiency out of the database.

Index optimization

In MySQL, indexes belong to the storage engine level, and different storage engines implement indexes differently. The following mainly discusses the index implementations of the MyISAM and InnoDB storage engines.

MyISAM index implementation

The MyISAM engine uses B+Tree as the index structure, and the data field of the leaf node stores the address of the data record. The following is a schematic diagram of the MyISAM index:

Here, the table has a total of three columns. Assuming that we have Col1 as the primary key, figure 1 shows the primary index (Primary key) of a MyISAM table. You can see that MyISAM's index file only holds the address of the data record. In MyISAM, there is no structural difference between the primary index and the secondary index (Secondary key), except that the primary index requires that the key is unique, while the key of the secondary index can be repeated. If we build a secondary index on Col2, the structure of the index is shown in the following figure:

It is also a B+Tree, with the data field of the leaf nodes storing the address of the data record. The index retrieval algorithm in MyISAM is therefore: first search the index using the B+Tree search algorithm; if the specified key exists, take the value of its data field; then read the corresponding data record using that value as the address.

MyISAM's indexing method is also called "non-clustered", a name chosen to distinguish it from InnoDB's clustered index.

InnoDB index implementation

Although InnoDB also uses B+Tree as the index structure, the implementation is quite different from that of MyISAM.

The first major difference is that InnoDB's data file itself is an index file. As you know from the above, the MyISAM index file and the data file are separate, and the index file only holds the address of the data record. In InnoDB, the table data file itself is an index structure organized by B+Tree, and the leaf node data field of this tree holds the complete data record. The key of this index is the primary key of the data table, so the InnoDB table data file itself is the primary index.

Figure 3

Figure 3 is a schematic diagram of the InnoDB primary index (which is also the data file), and you can see that the leaf nodes contain the complete data records. This kind of index is called a clustered index. Because InnoDB's data files are organized by the primary key, InnoDB requires that a table have a primary key (MyISAM can leave it unspecified). If none is explicitly specified, MySQL automatically selects a column that can uniquely identify each record as the primary key. If no such column exists, MySQL generates a hidden field for the InnoDB table to serve as the primary key; it is 6 bytes long and of long-integer type.

The second difference from the MyISAM index is that the data field of an InnoDB secondary index stores the value of the corresponding record's primary key instead of an address. In other words, all of InnoDB's secondary indexes use the primary key as their data field. For example, figure 4 shows a secondary index defined on Col3:

Figure 4

Here, the ASCII code of English characters is used as the comparison criterion. The implementation of clustered index makes the search by primary key very efficient, but the secondary index search needs to retrieve the index twice: first retrieve the secondary index to obtain the primary key, and then use the primary key to retrieve the record in the primary index.
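The two-step lookup can be sketched with plain dictionaries standing in for the two B+Trees (a toy model for illustration only, not InnoDB's actual structures; the rows and column names are made up):

```python
# Toy model of InnoDB's clustered layout: the primary index maps
# primary key -> full row; a secondary index maps column value -> primary key.
primary_index = {
    1: {"id": 1, "col2": 10, "col3": "alice"},
    2: {"id": 2, "col2": 20, "col3": "bob"},
}
secondary_on_col3 = {"alice": 1, "bob": 2}

def find_by_col3(value):
    pk = secondary_on_col3[value]  # first lookup: secondary index -> primary key
    return primary_index[pk]       # second lookup: primary index -> full row

print(find_by_col3("bob"))  # {'id': 2, 'col2': 20, 'col3': 'bob'}
```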

Understanding the index implementation of different storage engines is very helpful for the correct use and optimization of indexes. For example, after knowing the index implementation of InnoDB, it is easy to understand why it is not recommended to use overly long fields as primary keys, because all secondary indexes reference the primary index, and an overlong primary index will make the secondary index too large.

Likewise, using non-monotonic fields as primary keys in InnoDB is not a good idea: because the InnoDB data file itself is a B+Tree, non-monotonic primary keys cause frequent page splits and reorganization of the data file to maintain the B+Tree's properties when new records are inserted, which is very inefficient. An auto-increment field, by contrast, makes a good primary key.
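A table definition following that advice might look like this (the table and column names are illustrative):

```sql
-- Monotonically increasing surrogate key: new rows always append at the
-- right edge of the clustered B+Tree, avoiding page splits on insert.
CREATE TABLE orders (
    id      BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    user_id INT UNSIGNED    NOT NULL,
    note    VARCHAR(255),
    PRIMARY KEY (id),
    KEY idx_user (user_id)  -- secondary index stores only the short primary key
) ENGINE=InnoDB;
```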

Optimization of data query

In every time-consuming query you can find unnecessary extra operations: some operations are repeated many times, some run too slowly, and so on. The purpose of query optimization is to reduce or eliminate the time spent on such operations.

First: optimize data access

The most basic cause of poor query performance is accessing too much data. Inefficient queries are therefore generally analyzed in two steps:

1. Confirm whether the application is retrieving more data than it needs. This usually means too many rows were accessed, but sometimes too many columns.

2. Confirm whether the MySQL server layer is analyzing more rows of data than it needs.

1.1. Whether the application requests unneeded data from the database

When accessing the database, only the required rows and columns should be requested. Requesting extra rows and columns will consume the CPU and memory resources of the MySQL server and increase network overhead.

1. When paginating, use LIMIT so that MySQL returns only the rows you need, instead of returning all the rows and letting the application filter out the unwanted ones.

2. Avoid reflexively using SELECT * when joining multiple tables or fetching data from a single table.

3. When the same data is used many times, consider caching it to avoid querying MySQL on every use.
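Points 1 and 2 above can be illustrated briefly (the table and column names are made up for the example):

```sql
-- Point 1: fetch only the page you need, not the whole table.
SELECT id, title FROM articles ORDER BY id LIMIT 20 OFFSET 40;

-- Point 2: name the columns you need instead of SELECT *.
SELECT id, title FROM articles WHERE id = 42;
```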

1.2. Whether MySQL is scanning extra records: you should let MySQL query the data in the most appropriate way.

For MySQL, the simplest measures of query cost are three metrics: response time, number of rows scanned, and number of rows returned. The main concern here is improving the scan method, that is, the way data is accessed.

The access methods include full table scan, index scan, range scan, unique index lookup, constant reference, and so on, listed from slowest to fastest and from most rows scanned to fewest. The type column of the EXPLAIN statement shows which method a query uses.
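For example (table and column names are illustrative; the exact type values EXPLAIN reports depend on the schema and data):

```sql
-- EXPLAIN's `type` column shows the access method, roughly from worst
-- to best: ALL (full table scan) -> index -> range -> ref -> const.
EXPLAIN SELECT * FROM articles WHERE id = 42;       -- primary-key lookup: type const
EXPLAIN SELECT * FROM articles WHERE title = 'x';   -- type ALL unless title is indexed
```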

You can usually improve the way of querying data by adding appropriate indexes to reduce the number of scanned rows of data as much as possible and speed up the query.

For example, when a query scans a large number of rows but returns only a few, consider a covering index, that is, put all the needed columns into the index. The storage engine can then return results from the index alone, without going back to the table to fetch the corresponding rows.
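A covering-index sketch (the table, columns, and index name are made up for illustration):

```sql
-- The query needs only user_id and created_at, so an index over exactly
-- those columns can answer it without touching the table rows at all.
ALTER TABLE orders ADD INDEX idx_user_created (user_id, created_at);

EXPLAIN SELECT user_id, created_at
FROM orders
WHERE user_id = 42;
-- EXPLAIN's Extra column shows "Using index" when the index covers the query.
```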

Second: ways to refactor queries

When designing a query, consider whether a complex query should be split into several simple ones. I have heard one rule of thumb countless times: whatever can be done in the database should not be done in the application; the database is far more powerful than we think. An Oracle veteran told me this back when Huaxia Foundation was writing SQL on Oracle, and I later paid dearly for applying it to MySQL.

Of course, the reason is that Oracle and MySQL have different processing logic, and the costs of network communication, query parsing, and optimization are no longer as high as they used to be. Again, a rule of thumb only applies in a particular context.

Decompose complex queries:

You can split a large query into multiple small queries, each of which completes only a small portion of the entire query task and returns only a small portion of the results each time.

Deleting old data is a good example.

If you perform a large delete in a single statement, you may lock a lot of data at once, fill the transaction log, exhaust system resources, and block many small but important queries. Breaking a large delete into several smaller ones spreads the one-time pressure on the server across multiple operations, affects MySQL performance as little as possible, shortens lock waits during deletion, and reduces master-slave replication lag. I have used this method often.
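One chunk of such a batched delete might look like this (table and column names are illustrative; the loop itself would live in a script or scheduled job):

```sql
-- Remove old rows in bounded chunks instead of one huge DELETE,
-- so each statement locks little and finishes quickly.
DELETE FROM logs
WHERE created_at < '2024-01-01'
LIMIT 10000;
-- Repeat until ROW_COUNT() reports 0 affected rows, pausing briefly
-- between rounds so other queries and replication can keep up.
```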

Another example is decomposing a join: query each table involved separately, then associate the results in the application. A colleague who had spent years at Ali did exactly this when we worked together at a previous company. At the time I silently looked down on it, because I was still following that rule of thumb (do in the database whatever can be done there; the database is more powerful than we think) and insisted on solving with one SQL statement anything that could be solved with one, never two.

There is another argument behind that rule of thumb: keeping data-access logic out of the business code makes it easier to switch databases later. Is that actually the case? Not necessarily. I suffered for my ignorance at the time, and later had to decompose most of the SQL I had written because of its performance problems.

Refactoring a query by decomposing a join has the following advantages:

1. Caching becomes more efficient. Many applications can easily cache the result objects of single-table queries.

2. After decomposition, executing a single query reduces lock contention.

3. Joining at the application layer makes it easier to split the database, and easier to achieve high performance and scalability.

4. The query itself may also be more efficient, and queries returning redundant records can be reduced: a join in the application means each record is fetched only once, whereas a join in the database may access parts of the data repeatedly. In this respect the refactoring can also cut network and memory consumption.

5. Further, this amounts to implementing a hash join in the application instead of using MySQL's nested-loop join, and a hash join is much more efficient in some scenarios.
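The application-side hash join can be sketched like this (the row data stands in for the results of two single-table queries; names are illustrative):

```python
# Decomposed join: fetch each table separately, then associate in the
# application with a hash map instead of joining in the database.
posts = [
    {"id": 1, "author_id": 10, "title": "first"},
    {"id": 2, "author_id": 11, "title": "second"},
    {"id": 3, "author_id": 10, "title": "third"},
]
authors = [{"id": 10, "name": "alice"}, {"id": 11, "name": "bob"}]

# Build the hash side once (like SELECT ... WHERE id IN (...)),
authors_by_id = {a["id"]: a for a in authors}

# then probe it once per row -- a hash join in the application layer.
joined = [{**p, "author": authors_by_id[p["author_id"]]["name"]} for p in posts]

print(joined[0]["author"])  # alice
```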

Optimization of database design

1. The database design should follow the third normal form, though some data redundancy is acceptable for query convenience.

2. Prefer data types in this order: int > date/time > enum/char > varchar > blob. When choosing a type, consider substitutions; for example, an IP address can be converted with the ip2long() function and stored as an unsigned int.

3. For the char(n) type, keep n as small as the data allows.

4. Partitioning a single table when creating it can greatly improve query efficiency. MySQL supports RANGE, LIST, HASH, and KEY partition types, of which RANGE is the most commonly used. The general form is: CREATE TABLE tablename (...) ENGINE innodb/myisam CHARSET utf8 PARTITION BY RANGE/LIST (column) (PARTITION partname VALUES LESS THAN / IN (n), ...), which selects the engine and encoding, partitions by range or by a predefined list, and names and defines each partition.

5. Note the differences between InnoDB and MyISAM when choosing an engine. Storage structure: a MyISAM table is stored as three files on disk, while all InnoDB tables are saved in the same data file, generally limited to 2GB. Transactions: MyISAM provides no transaction support; InnoDB does. Locking: MyISAM supports only table-level locks; InnoDB supports transactions and row-level locks. Full-text indexing: MyISAM supports FULLTEXT full-text indexes (not suitable for Chinese, so use the sphinx full-text engine); InnoDB does not. Row count: MyISAM stores the table's total row count, so COUNT(*) is very fast; InnoDB does not store it and must recount. Foreign keys: not supported by MyISAM; supported by InnoDB.
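A concrete RANGE partition following the template in point 4 (the table, columns, and partition names are illustrative):

```sql
-- RANGE partitioning: each row is routed to a partition by YEAR(created).
CREATE TABLE logs (
    id      INT  NOT NULL,
    created DATE NOT NULL
)
ENGINE=InnoDB
DEFAULT CHARSET=utf8
PARTITION BY RANGE (YEAR(created)) (
    PARTITION p2022 VALUES LESS THAN (2023),
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION pmax  VALUES LESS THAN MAXVALUE  -- catch-all for newer rows
);
```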

A few MySQL tips

1. Keywords in SQL statements are best written in uppercase. First, it makes keywords easy to distinguish from operands; second, since MySQL converts statements to uppercase when executing them, writing them in uppercase yourself can save a little work (though the gain is very small).

2. After many inserts and deletes, auto-increment IDs can grow far beyond the actual row count. Use ALTER TABLE tablename AUTO_INCREMENT=N to make the auto-increment ID count from N.

3. Adding the ZEROFILL attribute to an int column pads values with leading zeros automatically.

4. When importing a large amount of data, it is best to drop the indexes first, insert the data, and then re-add the indexes; otherwise MySQL spends a lot of time updating the indexes.

5. When writing SQL to create a database, put it in a file with the .sql suffix in your IDE; the IDE will recognize the SQL syntax and make it easier to write. More importantly, if your database is lost, you can take this file and run /path/mysql -uusername -ppassword databasename < filename.sql in the current directory to execute the whole file (note that -u and -p are followed directly by the username and password, with no space).

Thank you for reading. That concludes "Optimizing MySQL to improve the running efficiency of PHP". After studying this article, I believe you have a deeper understanding of the topic; the specifics still need to be verified in practice.
