In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-02-25 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Servers >
Share
Shulou(Shulou.com)06/02 Report--
This article mainly introduces the use of MySQL database internal cache examples, has a certain reference value, friends in need can refer to. I hope you will learn a lot after reading this article. Next, let the editor take you to learn about it.
As the number of visitors to our Web site increases, we will encounter a lot of challenges. To solve these problems is not only to expand the capacity of the machine, but also to establish and use the appropriate caching mechanism.
At the beginning, our Web system architecture may be like this, each link, there may be only one machine.
I. use of internal cache in MySQL database
The caching mechanism of MySQL starts from within MySQL, and the following content will focus on the most common InnoDB storage engine.
1. Set up an appropriate index
The simplest thing is to set up an index, which can retrieve data quickly when the table data is large, but there is also a cost. First of all, it takes up a certain amount of disk space, of which the combined index is the most prominent and needs to be used carefully, and the index it produces will be even larger than the source data. Second, operations such as data insert/update/delete after indexing, because the original index needs to be updated, the time-consuming will increase. Of course, in fact, generally speaking, our system is dominated by select query operations, so the use of indexes can still greatly improve the performance of the system.
two。 Database connection thread pool cache
If every database operation request needs to create and destroy a connection, it is undoubtedly a huge overhead for the database. To reduce this type of overhead, you can configure thread_cache_size in MySQL to indicate how many threads are reserved for reuse. When the thread is not enough, it is created again, and when there is too much idle, it is destroyed.
In fact, there is a more radical approach, using pconnect (database persistent connection), once the thread is created for a long time. However, in the case of a large number of visits and a large number of machines, this usage is likely to lead to "database connection exhaustion", because the establishment of a connection is not recycled and eventually reaches the max_connections of the database (maximum number of connections). Therefore, the use of persistent connections usually requires the implementation of a "connection pooling" service between CGI and MySQL to control the number of connections created by CGI machines "blindly".
3. Innodb cache settings (innodb_buffer_pool_size)
Innodb_buffer_pool_size this is a memory cache used to hold indexes and data. If the machine is exclusive to MySQL, it is generally recommended that it be 80% of the machine's physical memory. In the scenario of fetching table data, it can reduce the disk IO. In general, the higher the setting, the higher the cache hit rate.
4. Sublibrary / table / partition.
MySQL database tables generally bear the amount of data in the million level, and if it goes up, the performance will be greatly degraded. Therefore, when we foresee that the amount of data will exceed this order of magnitude, it is recommended to carry out operations such as sub-database / sub-table / partition. The best way is to design the service as a storage mode of sub-database and sub-table at the beginning of the construction, so as to fundamentally eliminate the risk in the middle and later stage. However, it sacrifices some convenience, such as tabular queries, and increases the complexity of maintenance. However, when the amount of data is tens of millions or more, we will find that they are all worth it.
2. Construction of multiple services for MySQL database
A MySQL machine is actually a high-risk single point, because if it dies, our Web service will not be available. And, as the traffic to the Web system continued to increase, one day we found that one MySQL server could not hold up, and we began to need to use more MySQL machines. When multiple MySQL machines are introduced, many new problems will arise.
1. Establish MySQL master-slave and slave database as backup
This approach is purely to solve the problem of "single point of failure", in the event of a failure of the master library, switch to the slave library. However, this practice is actually a bit of a waste of resources, because the slave library is actually idle.
2. Separate MySQL read and write, write to the main library, and read from the library.
The two databases do read-write separation, the master database is responsible for writing class operations, and the slave database is responsible for reading operations. In addition, if the master database fails, it still does not affect the read operation, and you can temporarily switch all reads and writes to the slave library (you need to pay attention to the traffic, which may drag down the slave library because of too much traffic).
3. Master and master stand by each other.
The two MySQL are not only each other's slave library, but also the master library. This scheme not only achieves the pressure diversion of traffic, but also solves the problem of "single point of failure". If there is any failure, there is another set of services available.
However, this scheme can only be used in two-machine scenarios. If the business development is still very fast, you can choose to separate the business and establish multiple masters and backups.
Establish a cache between the Web server and the database
In fact, to solve the problem of large traffic, we can not only focus on the database level. According to the Law of 28, 80% of requests focus only on 20% of hot data. Therefore, we should establish a caching mechanism between the Web server and the database. This mechanism can use disk as cache or in-memory cache. Through them, most of the hot data queries are blocked in front of the database.
1. Page static
When a user visits a page of a website, most of the content on the page may not change for a long time. For example, a news report, once published, will hardly change the content. In this case, the static html pages generated by CGI are cached locally on the disk of the Web server. Except for the first time, it is obtained through a dynamic CGI query database, and then the local disk files are returned directly to the user.
When the Web system was small, this seemed perfect. But once the Web system gets bigger, for example, when I have 100 Web servers. In that case, there will be 100 of these disk files, which is a waste of resources and difficult to maintain. At this time, some people will think, you can centralize a server to save, hehe, why not take a look at the following cache way, it is done.
two。 Single memory cache
Through the example of page static, we can see that building the "cache" on the Web machine is difficult to maintain and will cause more problems (in fact, through the apc extension of PHP, you can manipulate the native memory of the Web server through Key/value). Therefore, the in-memory cache service we choose to build must also be a separate service.
The choice of memory cache is mainly redis/memcache. In terms of performance, there is little difference between the two, but in terms of feature richness, Redis is better.
3. Memory cache cluster
When we finish building a single memory cache, we will face the problem of a single point of failure, so we have to turn it into a cluster. The simple thing to do is to add a slave to him as a backup machine. However, if the number of requests is really high, we find that the hit rate of cache is not high, and we need more machine memory? Therefore, we recommend that it be configured as a cluster. For example, similar to redis cluster.
The Redis in the Redis cluster cluster is composed of multiple master and slave groups, and each node can accept requests, so it is convenient to expand the cluster. The client can send a request to any node and return the content directly if it is its "responsible" content. Otherwise, look for the actual responsible Redis node, then inform the client of the address, and the client requests again.
All of this is transparent to clients that use caching services.
There is a certain risk when the in-memory cache service is switched. In the process of switching from cluster A to cluster B, cluster B must be "preheated" in advance (the hot data in the memory of cluster B should be the same as that of cluster An as far as possible, otherwise, a large number of requests will be made at the moment of the switch. cannot be found in the memory cache of cluster B, and the traffic directly impacts the back-end database service, which is likely to lead to database downtime.
4. Reduce database "write"
The above mechanisms are all implemented to reduce the "read" operation of the database, but the write operation is also a great pressure. The operation of writing can not be reduced, but it can be reduced by merging requests. At this point, we need to establish a modification synchronization mechanism between the memory cache cluster and the database cluster.
First, the modification request is implemented in the cache to make the external query display normally, and then the sql changes are stored in a queue, which is full or merged into a request to update the database at regular intervals.
In addition to improving the write performance by changing the system architecture, MySQL itself can also adjust the write policy to disk by configuring the parameter innodb_flush_log_at_trx_commit. If the cost of the machine allows, you can solve the problem at the hardware level by choosing the older RAID (Redundant Arrays of independent Disks, disk array) or the newer SSD (Solid State Drives, solid state drive).
5. NoSQL storage
Regardless of whether the database is read or written, when the traffic rises further, it will eventually reach the scene of "manpower is poor". When the cost of adding machines is relatively high, and it may not really solve the problem. At this time, part of the core data, you can consider using NoSQL's database. Most of NoSQL storage is based on key-value. It is recommended to use Redis,Redis as a memory cache described above, and it can also be used as a storage, allowing it to directly land data to disk.
In this way, we will separate some frequently read and written data from the database and put them in our newly built Redis storage cluster, which will further reduce the pressure on the original MySQL database. At the same time, because Redis itself is a memory-level Cache, the performance of reading and writing will be greatly improved.
Many of the solutions adopted by domestic first-tier Internet companies are similar to the above, but the cache services used are not necessarily Redis. They will have more choices, and even develop their own NoSQL services according to their own business characteristics.
6. Empty node query problem
When we finished building all the services mentioned above, we thought that the Web system was already very strong. Let's say the same thing, new problems will come. Empty node queries refer to data requests that do not exist in the database. For example, if I request to query a person whose information does not exist, the system will look it up step by step from all levels of cache, and finally find the database itself, and then come to the conclusion that it cannot be found and return it to the front end. Because all levels of cache are invalid to it, this request consumes system resources very much, and if a large number of empty node queries, it can impact the system service.
Thank you for reading this article carefully. I hope it will be helpful for everyone to share the examples of the use of the internal cache in the MySQL database. At the same time, I also hope that you will support us, pay attention to the industry information channel, and find out if you encounter problems. Detailed solutions are waiting for you to learn!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.