
Basic theory of session persistence

2025-07-01 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

Outline of this section

What is session persistence

When session persistence is needed

Classification of session persistence

What is session persistence

Session persistence is one of the most common problems in load balancing, and also a relatively complex one. It is sometimes called sticky sessions. Session persistence is a mechanism on the load balancer that recognizes which requests between a client and a server belong together and ensures that a series of related requests is distributed to the same server.

When session persistence is needed

Before discussing this question, we need to clarify a few concepts: what a connection is, what a session is, and how the two differ. Note that when we are talking only about load balancing, sessions and connections often mean the same thing.

Put simply, if the user needs to log in, the interaction can be understood as a session; if no login is needed, it is just a connection.

For packets in the same connection, the cloud load balancer performs NAT on them and forwards them to a fixed backend server for processing. Inside the load balancer there is a special table that records the state of these connections, including [source IP:port], [destination IP:port], [server IP:port], the idle timeout, and so on. Because this table consumes the system's memory, it cannot grow without limit, and every traditional vendor imposes some restriction on it. The size of the table is generally called the maximum number of concurrent connections, that is, the number of connections the system can hold at the same time. Each entry in the connection state table carries an idle timeout parameter: when no traffic passes through the connection within the idle timeout, the load balancer automatically deletes the entry and releases the system resources.
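The connection table described above can be sketched roughly as follows. This is only a toy model, not how any particular load balancer is implemented: the names ConnEntry, ConnectionTable, lookup_or_add and the pick_server callback are all hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class ConnEntry:
    server: tuple      # (server IP, port) chosen for this connection
    last_seen: float   # time of the most recent packet on the connection


class ConnectionTable:
    """Toy connection-state table: bounded size, idle entries expire."""

    def __init__(self, max_connections, idle_timeout):
        self.max_connections = max_connections  # "maximum concurrent connections"
        self.idle_timeout = idle_timeout
        self.entries = {}                       # (src, dst) -> ConnEntry

    def expire(self, now):
        # Delete every entry that has seen no traffic within idle_timeout.
        stale = [key for key, entry in self.entries.items()
                 if now - entry.last_seen > self.idle_timeout]
        for key in stale:
            del self.entries[key]

    def lookup_or_add(self, src, dst, pick_server, now):
        """Return the backend for (src, dst), or None if the table is full."""
        self.expire(now)
        key = (src, dst)
        entry = self.entries.get(key)
        if entry is None:
            if len(self.entries) >= self.max_connections:
                return None  # no room for another concurrent connection
            # New connection: ask the distribution policy for a backend.
            entry = ConnEntry(server=pick_server(), last_seen=now)
            self.entries[key] = entry
        entry.last_seen = now  # traffic seen, so the idle timer restarts
        return entry.server
```

Packets of a known connection keep going to the recorded server. Once the entry has been idle for longer than idle_timeout it is dropped, and the next packet is treated as a new connection.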

After the connection entry is deleted, later requests from the client are no longer guaranteed to reach the same backend server; they are distributed again according to the load balancer's traffic distribution policy.

In scenarios that require login state, a session is needed between the client and the server to record information about the client. For example, in most e-commerce applications or online systems that require user authentication, a client and a server often go through several interactions to complete a transaction or a request. Because these interactions are closely related, the server often needs to know the result of the previous interaction when it handles the next one. This requires that all of the related interactions be handled by one server and not be distributed to different servers by the load balancer; otherwise abnormal scenarios may occur:

The client enters the correct user name and password, but is repeatedly redirected to the login page

The user enters the correct CAPTCHA, but is always told that the CAPTCHA is wrong

The items that the client put into the shopping cart go missing

...

Therefore, the purpose of the session persistence mechanism is to ensure that, where appropriate, requests from the same client are forwarded to the same backend server. In other words, the multiple connections that a client establishes are all sent to the same server for processing. When a load balancing device sits between the client and the servers, those connections are likely to be forwarded to different servers. If the servers have no mechanism for synchronizing session information, the other servers cannot identify the user, and the user runs into exceptions when interacting with the application.

The load balancer aims to spread the client's connections and requests across multiple backend servers so that no single server is overloaded, while session persistence requires that certain requests be forwarded to the same server. The two goals pull in opposite directions, so in a real deployment we should choose the session persistence mechanism that suits the characteristics of the application environment.

Classification of session persistence

Session persistence can be divided into three categories: session sticky, session LBcluster, and session server. Each of the three methods has its own advantages and disadvantages and suits different scenarios.

1 session sticky

Session sticky, that is, session binding, means that a client's requests are dispatched to a fixed server by some algorithm. It is implemented mainly by the scheduling algorithm of the scheduler. For example, the Nginx reverse proxy provides ip_hash, which allocates each request according to the hash of the client IP. Visits from the same IP are scheduled to the same server, which effectively solves the session sharing problem of dynamic web pages. Another option is url_hash, which distributes requests according to the hash of the requested URL so that each URL is directed to the same backend server, which can further improve the efficiency of backend cache servers. Older versions of Nginx did not support url_hash without a third-party module; since version 1.7.2 the built-in hash directive can hash on a key such as $request_uri. There is also the more powerful consistent hash algorithm. Scheduling by source address is based on layer 4 information and is very coarse-grained.
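To make the idea of hash-based scheduling concrete, here is a minimal sketch of source-IP hashing and of a consistent hash ring. It does not reproduce Nginx's actual algorithms; the function and class names, the use of MD5, and the number of virtual nodes are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(value):
    # Stable hash; Python's built-in hash() of str is randomized per process.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


def pick_by_source_ip(client_ip, servers):
    """Modulo hashing: the same IP maps to the same server
    for as long as the server list does not change."""
    return servers[_hash(client_ip) % len(servers)]


class ConsistentHashRing:
    """Consistent hashing with virtual nodes (hypothetical helper)."""

    def __init__(self, servers, vnodes=100):
        # Each server is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def pick(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

With modulo hashing, removing one server changes len(servers) and therefore remaps most clients. With the ring, only the keys that were mapped to the removed server move to another one, which is why consistent hashing is the gentler choice when backends come and go.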

A very important parameter in session binding is the timeout value. The load balancer sets a time value for each session that is being persisted. If the interval between the end of the last session activity and the next visit is less than the timeout value, the load balancer persists the new connection to the same server; if the interval is greater than the timeout value, the load balancer treats the new connection as a new session and load balances it. This kind of session persistence is simple to implement, needs only the layer 3 or layer 4 information of the packet, and is efficient.
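The timeout rule can be sketched as a small persistence table keyed by client IP. This is a minimal illustration under assumptions: the class name SourceIpPersistence is hypothetical, round robin stands in for whatever distribution policy the load balancer really uses, and an interval exactly equal to the timeout is treated as still persisted.

```python
import itertools


class SourceIpPersistence:
    """Toy source-address persistence with a timeout."""

    def __init__(self, servers, timeout):
        self._next_server = itertools.cycle(servers)  # fallback: round robin
        self.timeout = timeout
        self._table = {}  # client_ip -> (server, time of last visit)

    def route(self, client_ip, now):
        hit = self._table.get(client_ip)
        if hit is not None and now - hit[1] <= self.timeout:
            server = hit[0]                    # within the timeout: keep binding
        else:
            server = next(self._next_server)   # unknown or expired: balance anew
        self._table[client_ip] = (server, now)  # every visit restarts the timer
        return server


lb = SourceIpPersistence(["a", "b", "c"], timeout=300.0)
print(lb.route("1.1.1.1", now=0.0))    # new client, first server in rotation
print(lb.route("1.1.1.1", now=200.0))  # 200 s later: same server as before
print(lb.route("1.1.1.1", now=900.0))  # 700 s later: treated as a new session
```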

However, the problem with this approach is that when many clients reach the server through a proxy or address translation, they all share the same source address and are therefore assigned to the same server, which leads to a serious load imbalance between servers. In the opposite case, a single client generates a large number of concurrent connections that ought to be spread over several servers, but persistence keeps them all on one. Here, too, persistence based on the client source address defeats load balancing. When this happens, other session persistence methods must be considered.
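The imbalance is easy to reproduce with a few lines. The numbers below are invented purely for illustration: 900 users are assumed to share one public address behind a NAT gateway, 90 others arrive from their own addresses, and an MD5-based modulo hash stands in for the real scheduling algorithm.

```python
import hashlib
from collections import Counter


def pick_by_source_ip(client_ip, servers):
    # Stable modulo hash of the source address (illustrative only).
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]


servers = ["s1", "s2", "s3"]

# 900 users share one public address (hypothetical NAT gateway) ...
requests = ["198.51.100.1"] * 900
# ... while 90 other users each come from their own address.
requests += [f"203.0.113.{i}" for i in range(1, 91)]

load = Counter(pick_by_source_ip(ip, servers) for ip in requests)
nat_server = pick_by_source_ip("198.51.100.1", servers)
print(dict(load), "- all NAT traffic is pinned to", nat_server)
```

Whichever server the shared address hashes to receives at least 900 of the 990 requests, no matter how evenly the remaining addresses are spread.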

2 Session LBcluster

Session sticky cannot keep sessions highly available: as soon as one host goes down, every session held by that host is lost. That is a bad experience for users and a loss for the site. So people began to ask whether every backend server could hold the sessions of all servers. The solution that emerged is session clustering. As the name implies, all the servers that maintain sessions are combined into a cluster that maintains all the session information of the site, so that the failure of one host no longer causes user information to be lost.

This approach solves the problem of lost user sessions, and users no longer find that the items they put in the shopping cart are missing. However, solving one problem introduces another. Each session server has to handle front-end user requests and also synchronize sessions with the other hosts. On a site with a large number of servers, each host generates a large amount of I/O when it receives the session information of other hosts and sends the sessions it maintains to them. This puts great pressure on every server and greatly reduces its performance in handling front-end requests. Moreover, synchronization is done by multicast, and when a large number of servers synchronize their sessions to the other hosts at the same time, a lot of bandwidth is consumed.
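A rough back-of-the-envelope calculation shows why this does not scale. The sketch assumes the simplest model, in which every session write on one server is replicated to every other server; the function name and the figure of 100 writes per second are made up for the example.

```python
def replication_load(servers, writes_per_server):
    """Per-second session writes under all-to-all replication."""
    remote = (servers - 1) * writes_per_server  # updates received from peers
    per_server = writes_per_server + remote     # own writes plus replicated ones
    return {
        "per_server": per_server,
        "cluster_total": per_server * servers,
    }


for n in (2, 4, 8, 16):
    print(n, replication_load(n, writes_per_server=100))
```

Under this model the work done by each server grows linearly with the cluster size and the cluster-wide total grows quadratically, so adding servers adds synchronization work rather than removing it.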

3 Session Server

To address the new problems caused by session clustering, a group of servers is set aside specifically to manage user sessions. The backend servers only need to write their sessions to this session server. When a user's request arrives, it only needs to be checked against the session value in the session server. There are several ways for the session server to store the session information:
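Whatever the storage behind it, the session server idea can be sketched as follows. An in-process dictionary stands in for the dedicated session server, and the names SessionStore, AppServer, login and add_to_cart are hypothetical; the point is only that every application server reads and writes the same store.

```python
import secrets


class SessionStore:
    """Stand-in for a dedicated session server (an in-process dict here)."""

    def __init__(self):
        self._data = {}

    def save(self, session_id, session):
        self._data[session_id] = dict(session)

    def load(self, session_id):
        return self._data.get(session_id)


class AppServer:
    """Backend server that keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # every app server points at the same store

    def login(self, user):
        session_id = secrets.token_hex(16)
        self.store.save(session_id, {"user": user, "cart": []})
        return session_id

    def add_to_cart(self, session_id, item):
        session = self.store.load(session_id)
        if session is None:
            raise PermissionError("not logged in")
        session["cart"].append(item)
        self.store.save(session_id, session)
        return session["cart"]


store = SessionStore()
server_a, server_b = AppServer("a", store), AppServer("b", store)
sid = server_a.login("alice")               # login handled by server a
print(server_b.add_to_cart(sid, "book"))    # next request lands on server b
```

Because the session lives in the shared store, the load balancer is free to send each request to any backend.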

1) Database storage

Session information is stored in database tables so that it can be shared among different application servers. This method is suitable for websites whose database load is low.

Advantages: easy to implement.

Disadvantages: a database server is harder to scale than an application server, and its resources are more valuable. In highly concurrent web applications the biggest performance bottleneck usually appears at the database server, so if sessions are stored in a database table, the frequent database operations will affect the business.
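As a rough illustration of database-backed sessions, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table layout, the column names and the helper functions are assumptions chosen for the example; a real deployment would use a shared database server instead.

```python
import json
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions ("
    " id TEXT PRIMARY KEY,"
    " data TEXT NOT NULL,"
    " expires_at REAL NOT NULL)"
)


def save_session(session_id, data, ttl, now=None):
    """Insert or overwrite a session; it expires `ttl` seconds after `now`."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO sessions (id, data, expires_at) VALUES (?, ?, ?)",
        (session_id, json.dumps(data), now + ttl),
    )
    conn.commit()


def load_session(session_id, now=None):
    """Return the session data, or None if it is missing or expired."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT data FROM sessions WHERE id = ? AND expires_at > ?",
        (session_id, now),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

Every request costs at least one query, and every change to the session costs a write, which is where the pressure on the database described above comes from.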

2) File system storage

Sessions are shared among servers through a file system such as NFS. This method is suitable for websites with low concurrency.

Advantages: each server only needs to mount the shared disk that stores the sessions, which is relatively simple to implement.

Disadvantages: NFS does not perform well under highly concurrent reads and writes. Disk I/O performance and network bandwidth become major bottlenecks, especially for frequent reads and writes of small files such as sessions.

3) Memcached storage

Use Memcached to save session data and read it directly from memory.

Advantages: high efficiency. Reads and writes are much faster than with file system storage, and sharing sessions among multiple servers is also more convenient: the servers can all be configured to use the same set of Memcached servers, which reduces the extra work.

Disadvantages: data in memory is lost when the server goes down, although this is not a serious problem for session data. If the site has a lot of traffic and too many sessions, Memcached evicts the entries that have been used least recently, so a user who comes back after being idle for a while may find that the session can no longer be read.
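The eviction behaviour can be imitated with a small least-recently-used cache. This is only a model of the effect: the class LruSessionCache is hypothetical, it does not talk to Memcached, and Memcached's real memory management is more involved than a single LRU list.

```python
from collections import OrderedDict


class LruSessionCache:
    """Fixed-capacity cache that evicts the least recently used session."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)          # mark as most recently used
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)   # drop the least recently used

    def get(self, key):
        if key not in self._items:
            return None                       # evicted or never stored
        self._items.move_to_end(key)          # reading also counts as use
        return self._items[key]


cache = LruSessionCache(capacity=2)
cache.set("alice", {"cart": ["book"]})
cache.set("bob", {"cart": []})
cache.get("alice")                 # alice is active, bob stays idle
cache.set("carol", {"cart": []})   # cache is full, so bob's session is evicted
print(cache.get("bob"))            # bob returns later and the read fails
```

An application using such a store has to treat a missing session as a normal case, for example by asking the user to log in again.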

