Distributed system concerns: a detailed explanation of "stateless"

Updated 2025-07-13. From: SLTechnology News&Howtos (Shulou.com), 06/03 report.

The previous two parts of this series, "data consistency" and "high availability", both improve the system as a whole at the cost of added complexity.

Now we turn to something that makes the system simpler and easier to maintain: "easy to scale". The first article on this topic is about being "stateless".

First, let Brother Z introduce what "state" is.

A first look at "state"

The fourth article on "load balancing" (Distributed system concerns: can you add machines at will after doing "load balancing"?) mentioned an example. Let's dig it up again.

Brother Z, a developer, shouted to Brother Y in operations: "Brother Y, the system is lagging badly. We just launched a promotion, so quickly add a few machines to carry the load."

Brother Y replied, "No problem. I'll do it in a minute."

Then the load on the database rose rapidly, and the DBA yelled: "Brother Z, what the hell are you doing? You are about to bring down the database."

Then customer service was flooded too. More and more users complained that shortly after logging in they were logged out, then they logged in again and were logged out again: "Do you want our business or not?"

The root cause of the problem in this case is that there are a large number of "stateful" business processes in the system.

Stateful and stateless

In his 1976 book, N. Wirth summed up the definition of a program as follows: program = data structure + algorithm. (This summary is also the title of the book, Algorithms + Data Structures = Programs.)

This is an inspiring idea. Influenced by it, Brother Z believes that what a program does is essentially "move and combine data" in order to achieve the desired result. How to move and how to combine is determined by the "algorithm", so Brother Z extends it into a new definition: data + algorithm = result.

The "results" you get from a program are no different from the "results" of anything you accomplish in daily life. In every case, you process and transform the original "raw material" through a series of "actions" and finally get the "result" you expect.

For example, you turn room-temperature water into 100-degree water through "pouring it into a kettle", "switching on the heating" and so on.

As with boiling water, getting a "result" usually takes several "actions".

What if you want to reduce the total cost (e.g. time) of these "actions"?

Naturally, you extract the work that would otherwise be repeated and do it only once. In a program, this means putting part of the "data" into a "temporary storage area" (usually local memory) to be shared by the related "actions".

But this means you have to add a relationship that indicates which "temporary storage area" each "action" is associated with, because in a program the "actions" may run on multiple threads.

At this point, the "action" becomes "stateful".

Digression: the environment in which multiple "actions" share the same "temporary storage area" is often referred to as "context".
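As a rough illustration of this idea, here is a minimal Python sketch. The names `Context`, `fetch_profile`, `greet` and `farewell` are hypothetical and do not come from the original article. Two "actions" share one "temporary storage area" so that an expensive step runs only once, and that shared area is exactly what makes them "stateful".

```python
class Context:
    """Hypothetical "temporary storage area" shared by several actions."""

    def __init__(self, user_id):
        self.user_id = user_id
        self.profile = None    # cached data, filled on first use
        self.fetch_count = 0   # how many times the expensive step really ran

    def fetch_profile(self):
        # Stand-in for an expensive lookup; it runs only once per context.
        if self.profile is None:
            self.fetch_count += 1
            self.profile = {"id": self.user_id, "name": f"user-{self.user_id}"}
        return self.profile


def greet(ctx):
    # Action 1: depends on the shared context.
    return "Hello, " + ctx.fetch_profile()["name"]


def farewell(ctx):
    # Action 2: reuses whatever action 1 already stored.
    return "Goodbye, " + ctx.fetch_profile()["name"]


ctx = Context(42)
messages = [greet(ctx), farewell(ctx)]
```

Both actions only work when they are handed the same `ctx`, which is the "relationship" between an action and its storage area described above.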

Let's have an in-depth talk about state.

The "temporary storage area" is "data", so it can be understood that "having data" is equivalent to "stateful".

The scope of "data" in a program is either "local" or "global" (corresponding to local and global variables), so "state" can also be divided into two types: the local "session state" and the global "resource state".

Digression: some servers are not only responsible for computing but also expose the "data" within their own scope to the outside. This "data" is an integral part of the server and is called a "resource". In theory, a "resource" can be used by every "session", so it is global state.

In this article, "stateful" always refers to "session state".

As opposed to "stateful", "stateless" means that all the "raw materials" needed for each round of "processing" are provided from outside, and there is no "temporary storage area" inside the server. A request can be sent to any replica node on the server side and the processing result is exactly the same.

One class of methods is inherently "stateless": the "algorithms" that express how data is moved and combined. In essence, such a method does two things:

Receive the "raw material" (input parameters)

"Process" it and return the "result" (output)
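The two steps above can be sketched as a pure function. This is only an illustrative example: `total_price`, the cart layout and the use of integer cents are assumptions, not something taken from the original article.

```python
def total_price(items, discount_percent):
    """Stateless: everything needed arrives as input parameters.

    items is a list of (price_in_cents, quantity) pairs.
    """
    subtotal = sum(price * quantity for price, quantity in items)
    return subtotal * (100 - discount_percent) // 100


cart = [(1000, 2), (550, 1)]

# The same input gives the same output, no matter which replica runs it.
result_on_replica_a = total_price(cart, 10)
result_on_replica_b = total_price(cart, 10)
```

Because the function reads nothing except its parameters, it does not matter which node executes it or in what order the calls arrive.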

Why does mainstream opinion on the Internet say that more methods should be made "stateless"?

Because we are more used to writing "stateful" code, yet "stateful" code hurts the scalability and maintainability of the system.

In a distributed system, "stateful" means that a user's request must be sent to the server that holds the relevant state information, otherwise the request may not be understood. As a result, the server side cannot freely schedule user requests (for example, temporarily adding more machines during Singles' Day does not help).

It also leads to poor fault tolerance. If the server that holds a user's information goes down, the user's recent interactions cannot be transparently moved to a standby server, unless the standby server synchronizes the state information of all users with the primary server at all times.
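The scheduling problem can be reproduced in a few lines. The sketch below is a simplified simulation under stated assumptions: `StatefulServer` is a hypothetical class, and `itertools.cycle` stands in for a round-robin load balancer.

```python
import itertools


class StatefulServer:
    """Keeps login sessions in its own local memory (session state)."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}

    def login(self, user):
        self.sessions[user] = {"logged_in": True}
        return f"{self.name}: {user} logged in"

    def place_order(self, user):
        # Only the server that handled the login knows about this user.
        if user not in self.sessions:
            return f"{self.name}: {user} is not logged in"
        return f"{self.name}: order placed for {user}"


servers = [StatefulServer("A"), StatefulServer("B")]
round_robin = itertools.cycle(servers)  # naive load balancer

first = next(round_robin).login("alice")         # handled by server A
second = next(round_robin).place_order("alice")  # handled by server B
```

The second request lands on server B, which has never seen the login, so the user appears to be logged out. This matches the customer complaints in the opening example.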

Both problems were also mentioned in the fourth article on load balancing (Distributed system concerns: can you add machines at will after doing "load balancing"?).

So if you want better scalability, you need to turn "stateful" processing into "stateless" processing as far as possible.

Making processing "stateless"

Turning a "stateful" process into a "stateless" one is relatively simple in concept, and there is not much to it.

First, move the state information forward and enrich the input parameters: as far as possible, have the upstream client pass the data needed for processing in as input parameters.

Of course, the drawback of this approach is also obvious: network packets become larger.

In addition, if the client and the server go through several rounds of interaction, the data needed by later server-side steps has to be passed back and forth, so that the server does not need to store it temporarily.

▲ Figure: requests in orange, responses in green
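A multi-step interaction of this kind might look like the following sketch. The three steps and the names `choose_item`, `add_address` and `confirm` are hypothetical, chosen only to show the state travelling with each request and response.

```python
def choose_item(item_id):
    # Step 1: the server returns the state to the client instead of storing it.
    return {"item_id": item_id}


def add_address(state, address):
    # Step 2: the client sends the earlier state back along with the new data.
    return {**state, "address": address}


def confirm(state):
    # Step 3: all the "raw material" arrives with the request itself.
    return f"Ship item {state['item_id']} to {state['address']}"


# The client carries the state between calls; each call could hit a different replica.
state = choose_item("sku-1")
state = add_address(state, "1 Example Road")
summary = confirm(state)
```

The cost is the one already noted above: every request and response grows by the size of the state it carries.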

The purpose of these changes is to minimize code like the following, where i lives outside the function as hidden state:

func() {
    return i++;
}

Instead, it becomes the following, where i is passed in as an input parameter:

func(i) {
    return i + 1;
}
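For readers who want something runnable, here is one possible Python rendering of the same contrast. The names `hidden_state`, `func_stateful` and `func_stateless` are illustrative, and a dictionary stands in for the variable that lives outside the function.

```python
hidden_state = {"i": 0}  # state that lives outside the function


def func_stateful():
    # Mirrors "return i++": return the current value, then increment it.
    current = hidden_state["i"]
    hidden_state["i"] += 1
    return current


def func_stateless(i):
    # Mirrors "return i + 1": the result depends only on the input.
    return i + 1


stateful_results = [func_stateful(), func_stateful()]      # same call, different answers
stateless_results = [func_stateless(0), func_stateless(0)]  # same call, same answer
```

The stateful version gives a different answer on every call, so two replicas with separate copies of the hidden state would disagree. The stateless version can run anywhere.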

Doing this "stateless" work well depends on sensible layering in your architecture or project design.

Try to push the handling of session state up to the frontmost layer, because only that layer is in contact with the system's users. The lower layers can then treat "stateless" as a universal standard.

At the same time, because the session state is concentrated in the frontmost layer, the cost of rebuilding the state is much lower even if it is actually lost.

For example, in a three-tier architecture, making sure that the BLL (business logic layer) and the DAL (data access layer) hold no state greatly improves the maintainability of the code.

In a distributed system, make sure that the programs exposed as services hold no state. Besides improving maintainability, this also makes grayscale releases and A/B testing much easier.
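The layering described above can be sketched as follows. This is a toy example under stated assumptions: `FrontLayer`, `apply_discount`, the VIP flag and the 20% discount are all invented for illustration.

```python
# Lower layer (BLL): stateless, receives everything as parameters.
def apply_discount(price_cents, is_vip):
    return price_cents * 80 // 100 if is_vip else price_cents


# Front layer: the only place that knows about sessions.
class FrontLayer:
    def __init__(self):
        self.sessions = {}  # session token -> user info

    def login(self, token, is_vip):
        self.sessions[token] = {"is_vip": is_vip}

    def quote(self, token, price_cents):
        # Resolve the session here, then pass plain values downward.
        session = self.sessions[token]
        return apply_discount(price_cents, session["is_vip"])


front = FrontLayer()
front.login("token-1", is_vip=True)
quoted = front.quote("token-1", 1000)
```

Since `apply_discount` never touches a session, it can be tested, replaced or routed to a different version (as in a grayscale release) without caring who the caller is.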

Digression: the point of layering here is that separating IO-intensive programs from CPU-intensive programs is the real route to "stateless". Once they are separated, the CPU-intensive programs are naturally "stateless".

This also makes "elastic scaling" easier, because the common scenarios that call for "elastic scaling" are generally those where the CPU load is too high.

Finally, if none of the above is suitable, you can use shared storage as a fallback option, such as a remote cache or a database. When state is lost, it can then be recovered from the shared storage.

Therefore, the ideal place to store state is either the frontmost layer or the bottom storage tier.
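The shared-storage fallback can be sketched like this. `SharedStore` is a hypothetical stand-in backed by a plain dictionary. In a real system it would be a remote cache or database client, which this sketch does not model.

```python
class SharedStore:
    """Stand-in for a remote cache or database; a plain dict here."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


class Replica:
    def __init__(self, name, store):
        self.name = name
        self.store = store  # no session data is kept on the replica itself

    def login(self, user):
        self.store.set(f"session:{user}", {"logged_in": True})

    def place_order(self, user):
        if self.store.get(f"session:{user}") is None:
            return f"{self.name}: {user} is not logged in"
        return f"{self.name}: order placed for {user}"


store = SharedStore()
replica_a, replica_b = Replica("A", store), Replica("B", store)
replica_a.login("alice")
outcome = replica_b.place_order("alice")  # a different replica still sees the session
```

The replicas themselves stay interchangeable, and the state survives the loss of any one of them as long as the shared store is available.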

Summary

Everything has two sides. As mentioned earlier, we are not trying to make every business process "stateless", only some carefully chosen ones. In the end, it comes down to "value" and "cost-effectiveness".

For example, converting every process of a real-time chat tool, whose core is "state", to be "stateless" would cost more than it gains.

Related articles:

Distributed system concerns: a first look at "high availability"

Distributed system concerns: this one article is all you need to properly understand "load balancing"

Distributed system concerns: how to implement "load balancing"?

Distributed system concerns: can you add machines at will after doing "load balancing"? Three tricks to help you!

Distributed system concerns: "circuit breakers" and best practices that 99% of people can understand

Distributed system concerns: want to master "rate limiting"? This one article is enough

Distributed system concerns: the last resort for keeping your system "robust", "degradation"

Distributed system concerns: "compensation" and best practices that 99% of people can understand

Author: Zachary

Source: https://www.cnblogs.com/Zachary-Fan/p/stateless.html

About the author: Zhang Fan (Zachary, personal WeChat account: Zachary-ZF).