2025-02-23 Update From: SLTechnology News&Howtos
This post covers what Session loss is and how nginx can solve the problem; I hope it helps you in practice. Load balancing touches many things, there is not much theory to it, and plenty has been written about it online, so today we will answer the question from accumulated industry experience.
This is the last post about "load balancing" in the distributed-systems series. Later we will continue with other "high availability" topics, mainly rate limiting, degradation, and circuit breaking, though the details are not settled yet. At the end of the article we attach the previously published high-availability articles for you to review.
Has the following scene ever played out in front of you?

Developer Z shouted to Y in operations: "Brother Y, the system is really sluggish right now. We just launched a promotion, so please add a few machines quickly to absorb the load."

Y replied, "No problem. I'll have it done in a minute."

Then the pressure on the database shot up, and the DBA yelled: "Brother Z, what on earth are you doing? The database is about to be crushed."

Then the customer-service inbox exploded too: more and more users reported that shortly after logging in they were logged out, logged in again, got logged out again, and could not get anything done.
All of these problems stem from a single cause: "Session loss".
1. What is Session loss
I believe most coders know about Session. It is a concept whose purpose is to let the system recognize multiple requests from the same user as coming from "the same user". It can also be used to avoid repeatedly hitting the DB or remote services for user-related information, thereby improving performance.
In a load-balanced setup, choosing a hash-based load policy produces a side effect on Session. As in the case above, once a user's requests are, for some reason, routed to server A instead of server B, problems such as "lost login state" and "cache penetration" appear.
Why does a hash strategy have this problem? First we need to understand how hashing works. A hash policy is a hash function like the one shown in the figure below: as long as the function stays the same, A always maps to 01, B to 04, and C to 08.
▲ Image from the Internet; copyright belongs to the original author.
Take nginx's ip_hash policy as an example. Because we assume that under normal circumstances a user's IP will not change within a short period of time, choosing the ip_hash policy for load balancing means we expect the same user to always reach the same server, as shown in the figure below.
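For reference, an ip_hash upstream in nginx can be configured roughly like this (the server addresses and upstream name here are hypothetical, a minimal sketch rather than a production config):

```nginx
upstream backend {
    ip_hash;                  # hash on the client IP: the same IP always hits the same server
    server 10.0.0.1:8080;     # node A (illustrative address)
    server 10.0.0.2:8080;     # node B
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
    }
}
```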
▲ The hash function in the diagram is just the simplest illustrative example.
This way, we only need to cache user-related information in-process on that one server, which improves performance very cost-effectively.
At this point the client and the server have, in effect, established a mutual trust and gotten to know each other. That trust is the "Session".
However, once we add a server, things change.
▲ The hash function in the diagram is just the simplest illustrative example.
At this point our original expectation is broken. Because the user who was mapped to node 0 is now routed to node 3, the "Session loss" problem described earlier occurs. At the same time, the in-process cache on node 0 becomes useless, while node 3 has no user-related cache at all, so a large amount of data must be fetched from the downstream DB or remote services. And once network communication is involved, performance inevitably degrades noticeably, since I/O and serialization are expensive operations. Worse, if this happens to a large number of users at the same time, the backend DB and remote services may be unable to handle the sudden surge of requests and go down. And it does not end there: if the program has no fault isolation or degradation strategy, a butterfly effect follows, slowing down the entire system. As the saying goes, "one rat dropping spoils the whole pot of porridge".
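The remapping effect is easy to demonstrate. The sketch below uses a simple MD5-modulo mapping (an illustrative stand-in for a real load balancer's hash) to show how many clients land on a different node after one node is added:

```python
import hashlib

def pick_node(user_ip: str, num_nodes: int) -> int:
    """Map a client IP to a node index with a stable hash (simplified illustration)."""
    digest = hashlib.md5(user_ip.encode()).hexdigest()
    return int(digest, 16) % num_nodes

ips = ["10.1.2.%d" % i for i in range(100)]

# With 3 nodes, each IP maps to a fixed node.
before = {ip: pick_node(ip, 3) for ip in ips}
# Add a 4th node: the modulus changes, so most IPs now map to a different node.
after = {ip: pick_node(ip, 4) for ip in ips}

moved = sum(1 for ip in ips if before[ip] != after[ip])
print(f"{moved} of {len(ips)} users were remapped after adding one node")
```

With a plain modulo mapping, roughly three quarters of users get remapped when going from 3 to 4 nodes; this is exactly why every remapped user loses their Session and in-process cache at once.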
2. How does nginx solve this problem
Since we are using nginx as the example, let's start there. The problem can be solved by introducing the nginx-sticky-module module into nginx. The overall flow of the solution is as follows.
▲ Image from the Internet; copyright belongs to the original author.
As you can see, the first time a client request reaches nginx, nginx assigns it a node and at the same time writes an MD5 hash of that node's unique ID into a cookie in the response. On subsequent requests, if this cookie value is present, the request is forwarded directly to the node the value corresponds to. This mechanism is formally called "Session persistence" (sticky sessions).
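With the third-party nginx-sticky-module compiled in, a sticky upstream can be configured roughly like this (directive options vary by module version, so treat this as a sketch under that assumption; addresses are illustrative):

```nginx
upstream backend {
    sticky;                   # issue a cookie identifying the chosen node;
                              # later requests carrying it go back to the same node
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
}
```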
Although a cookie solves the problem, cookies have a potential weakness: if the client has cookies disabled, the mechanism fails. Fortunately, mainstream browsers enable cookies by default.
A side note: nginx was released in 2004, and in the 7 years before nginx-sticky-module appeared, this was nginx's biggest shortcoming compared with its competitor HAProxy, which supports Session persistence natively.
3. Other Session maintenance schemes
Besides cookies, there are two other ways to achieve a similar effect, known as "Session replication" and "Session sharing".
01 Session replication
This is the simplest, bluntest approach. In the case from section 1, the root cause is that node 3 holds no Session for the user. So the natural fix is to copy the Session-related cache data to node 3 before it starts serving traffic, and to keep the data continuously synchronized across all nodes. In other words, every node holds every user's Session data.
There are many ways to implement this; in particular, different application servers provide hooks or even ready-made solutions, such as Tomcat's DeltaManager and BackupManager, or the Filter mechanisms in Tomcat and IIS.
The characteristics of this kind of scheme are:
Advantages: naturally highly available; losing some nodes is fine, because every node stores the session information of all logged-in users.
Disadvantages: since each machine's memory is limited, this only suits scenarios where session data is small. Also, because data must be synchronized across nodes, data consistency has to be solved, and the more nodes there are, the greater the overhead (latency, bandwidth, etc.), with a risk of broadcast storms.
02 Session sharing
We can also achieve the same effect by storing session information in a globally shared storage medium, such as a database or a remote cache. This is the centralized solution.
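A minimal sketch of centralized Session sharing, using an in-memory dict to stand in for a shared store such as Redis (in production the store would be a networked service reachable by every node; all names here are illustrative):

```python
import json
import uuid
from typing import Optional

# Stand-in for a shared store such as Redis;
# every application node would talk to this same store.
shared_store: dict[str, str] = {}

def create_session(user_id: str) -> str:
    """Create a session in the shared store and return its ID
    (which would be sent to the client as a cookie)."""
    session_id = uuid.uuid4().hex
    shared_store[session_id] = json.dumps({"user_id": user_id})
    return session_id

def load_session(session_id: str) -> Optional[dict]:
    """Any node can load the session by ID, so it survives
    being re-routed to a different server."""
    raw = shared_store.get(session_id)
    return json.loads(raw) if raw is not None else None

sid = create_session("alice")
# Simulate the next request landing on a different node:
# the session is still found, because the store is shared.
print(load_session(sid))
```

The design choice is that nodes become stateless with respect to Sessions: adding or removing a node never loses login state, at the cost of one extra store lookup per request.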
The characteristics of this kind of scheme are:
Advantage: no matter how many nodes are added or removed, sessions will never be lost.
Disadvantages: every read and write request incurs an extra call to the shared store, adding network I/O, serialization, and similar costs, so performance drops noticeably. In addition, the shared storage medium not only adds maintenance cost but must also have its single point of failure addressed to avoid systemic risk.
You can compare the pros, cons, and applicable scenarios of these schemes with the "Session persistence" solution described earlier.
To summarize the three schemes in one sentence each:
Session persistence: you stay wherever you first landed.
Session replication: the same data exists wherever you go.
Session sharing: all nodes share a single copy of the data.
The larger the system, the more it tends to converge on "Session sharing", because as long as the shared storage scales out, it can in theory support an unlimited number of users; think of Redis and the various NoSQL and NewSQL stores. A solution like this combines "large scale", "high availability", and "good results".
That covers what Session loss is and how nginx solves the problem. If there is anything else you would like to know, you can look it up in industry resources or consult a professional engineer.
© 2024 shulou.com SLNews company. All rights reserved.