This article introduces the key knowledge points related to message queues. These are situations many people run into in real projects, so let's walk through how to handle them. I hope you read it carefully and get something out of it!
What is a message queue
Let's start with what Wikipedia says:
In computer science, message queues and mailboxes are software-engineering components typically used for inter-process communication (IPC), or for inter-thread communication within the same process. They use a queue for messaging: the passing of control or of content. Group communication systems provide similar kinds of functionality.
To summarize: a message queue is a component that uses a queue to pass messages.
That definition is accurate, but nowadays what we call a message queue usually refers to message middleware, which exists for more than just communication.
Why we need message queues
In essence, it is because the rapid growth of the Internet and the continuous expansion of business force the technical architecture to keep evolving.
We have gone from the monolithic architectures of the past to today's microservice architectures, where hundreds of services call and depend on each other, and from the early days of the Internet, when a hundred concurrent users on one server was a big deal, to WeChat with a billion daily active users. We need "something" to decouple services from each other, keep resource usage under control, buffer traffic peaks, and so on.
That something is the message queue. It is typically used for three things: asynchronous processing, service decoupling, and flow control.
Asynchronous processing
As the company grows, you may find that your project's request chain keeps getting longer. For example, an early e-commerce flow might only need to deduct inventory and place the order; over time a points service, an SMS service, and more are bolted on. If everything is called synchronously, end to end, the customer is left waiting impatiently, and this is exactly where the message queue makes its entrance.
The call chain is long and the response is slow, yet awarding points and sending SMS do not need to be as "timely" as deducting inventory and placing the order. So at the end of the order-placing flow, simply drop a message into the message queue and return the response right away; the points service and the SMS service can then consume that message in parallel.
As you can see, a message queue not only shortens the time requests spend waiting, it also lets services process work asynchronously and concurrently, improving the overall performance of the system.
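To make this concrete, here is a minimal sketch assuming a Kafka Java client; the topic name order-created, the broker address, and the helper methods are made up for illustration. The order service finishes its core steps, publishes an event, and returns immediately:

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class OrderService {
    private final KafkaProducer<String, String> producer;

    public OrderService() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        this.producer = new KafkaProducer<>(props);
    }

    public void placeOrder(String orderId) {
        deductInventory(orderId);   // core, synchronous steps
        createOrder(orderId);
        // non-critical follow-ups (points, SMS) are triggered by an event instead of direct calls
        producer.send(new ProducerRecord<>("order-created", orderId, "order " + orderId + " created"));
        // return to the customer right away; downstream services consume the event in parallel
    }

    private void deductInventory(String orderId) { /* ... */ }

    private void createOrder(String orderId) { /* ... */ }
}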
Service decoupling
We already mentioned adding a points service and an SMS service. Next a marketing service appears, then the boss wants to "do big data", so a data analysis service shows up, and so on.
The order service's downstream systems keep multiplying, and to cater to them the order service has to be modified again and again; any change to a downstream interface can ripple back into it. The order team is going crazy; a truly "core" project team indeed.
That is why message queues are commonly used to decouple systems: the order service puts order-related messages into the message queue, and each downstream system simply subscribes to that topic. The order service is finally set free!
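On the downstream side, here is a sketch of what a subscriber might look like, again assuming a Kafka Java client (the group id, topic name, and addPoints helper are placeholders). The points service consumes order events on its own, without the order service knowing it exists:

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class PointsService {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "points-service");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("order-created"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    addPoints(record.key());   // award points for this order, at our own pace
                }
            }
        }
    }

    private static void addPoints(String orderId) { /* ... */ }
}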
Flow control
Everyone has heard of "peak shaving and valley filling". Back-end services are relatively "weak" because their business logic is heavier and takes longer to process, so bursts of traffic such as a flash sale can overwhelm them. A buffering middleware is needed, and a message queue fits the role perfectly.
Requests coming through the gateway are first put into the message queue, and the back-end services consume them as fast as they are able; requests that time out can simply return an error.
Of course, some services, especially background tasks, do not need to respond immediately, and their processing is complex and long-running. For them, incoming requests are put into the message queue and the back end works through them at its own pace, which is also very nice.
These two cases correspond to producers producing too fast and consumers consuming too slowly; in both, the message queue acts as an effective buffer.
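As a toy illustration of the buffering idea (an in-process bounded queue standing in for the real message queue middleware; the capacity, timeout, and handle method are arbitrary): the "gateway" rejects requests it cannot enqueue within a deadline, while the "backend" drains the queue at its own speed.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class FlowControlDemo {
    private static final BlockingQueue<String> buffer = new ArrayBlockingQueue<>(1000);

    // gateway side: enqueue or fail fast
    static boolean accept(String request) throws InterruptedException {
        return buffer.offer(request, 100, TimeUnit.MILLISECONDS); // false => return an error to the caller
    }

    // backend side: consume at its own speed
    static void startWorker() {
        new Thread(() -> {
            while (true) {
                try {
                    String request = buffer.take();
                    handle(request);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }).start();
    }

    private static void handle(String request) { /* slow business logic */ }
}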
A note of caution
Introducing a message queue certainly brings the benefits above, but every additional middleware system lowers overall stability by a notch and raises the operational burden by a notch. So weigh the pros and cons; the architecture evolves as the system does.
Basic concepts of message queues
Message queues come in two models: the queue model and the publish/subscribe model.
Queue model
Producers send messages to a queue; one queue can store messages from multiple producers, and one queue can also have multiple consumers. The consumers, however, compete with each other: each message can be consumed by only one of them.
Publish/subscribe model
The publish/subscribe model exists to let a single message be consumed by multiple consumers. In this model, messages are sent to a Topic, and every subscriber of that Topic can consume the message.
You can think of the publish/subscribe model as a group chat: everyone who has joined the group receives the message I send. The queue model is then a one-on-one chat: the message I send you only shows up in your chat window, never in anyone else's.
At this point someone will say: if I send the same message to each person in separate one-on-one chats, then one message still ends up consumed by multiple people.
Exactly. By storing complete copies of the same message in multiple queues, i.e. by duplicating the data, one message can be consumed by multiple consumers. RabbitMQ uses the queue model and routes a message into multiple queues through its Exchange module, which is how it handles a message that needs to be consumed by several consumers.
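Here is a minimal sketch of that RabbitMQ idea, assuming the RabbitMQ Java client (the exchange and queue names are made up): a fanout Exchange copies each published message into every bound queue, so each downstream consumer gets its own copy.

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import java.nio.charset.StandardCharsets;

public class FanoutSketch {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        try (Connection connection = factory.newConnection();
             Channel channel = connection.createChannel()) {
            channel.exchangeDeclare("orders", "fanout");
            // one queue per downstream service; the exchange duplicates each message into both
            channel.queueDeclare("points-queue", true, false, false, null);
            channel.queueDeclare("sms-queue", true, false, false, null);
            channel.queueBind("points-queue", "orders", "");
            channel.queueBind("sms-queue", "orders", "");
            channel.basicPublish("orders", "", null, "order-1001 created".getBytes(StandardCharsets.UTF_8));
        }
    }
}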
You can also see that if the group chat contains only me and one other person, the publish/subscribe model and the queue model behave exactly the same.
A brief summary
In the queue model each message can be consumed by only one consumer, while the publish/subscribe model was created so that one message can be consumed by multiple consumers. The queue model can achieve the same thing by storing the message in multiple queues, but at the cost of data redundancy.
The publish/subscribe model is compatible with the queue model: with only one consumer, the two behave essentially the same.
RabbitMQ uses the queue model, while RocketMQ and Kafka use the publish/subscribe model.
The rest of this article is based on the publish/subscribe model.
Common terms
Generally speaking, we call the sender the producer (Producer), the receiver the consumer (Consumer), and the message queue server the Broker.
A message is sent from the Producer to the Broker, the Broker stores it locally, and then the Consumer either pulls the message from the Broker or has the Broker push it to them, and finally consumes it.
To increase concurrency, the publish/subscribe model usually introduces queues or partitions: a message is sent to one queue or partition under a topic. RocketMQ calls them queues and Kafka calls them partitions, but they are essentially the same thing.
For example, if a topic has five queues, the topic's concurrency rises to 5, and five consumers can consume its messages in parallel. Messages for the same topic are usually assigned to different queues by strategies such as round-robin or taking the remainder of a key's hash, as sketched below.
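A tiny sketch of the "key hash remainder" strategy (plain Java; the names are made up): messages with the same key always land in the same queue, while different keys spread across the five queues.

public class QueueSelector {
    private final int queueCount;

    public QueueSelector(int queueCount) {
        this.queueCount = queueCount;
    }

    // decide which queue/partition a message with this key goes to
    public int select(String key) {
        // mask off the sign bit so the result is always non-negative
        return (key.hashCode() & Integer.MAX_VALUE) % queueCount;
    }

    public static void main(String[] args) {
        QueueSelector selector = new QueueSelector(5);
        System.out.println(selector.select("order-1001")); // the same key maps to the same queue every time
        System.out.println(selector.select("order-1002"));
    }
}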
On the consuming side there is usually the concept of a consumer group (Consumer Group): every consumer belongs to some group, and a message is delivered to every consumer group that has subscribed to the topic.
Suppose there are two consumer groups, Group 1 and Group 2, and both subscribe to Topic-a. When a message is sent to Topic-a, both groups receive it.
The message is physically written to one particular queue under the Topic, and within each consumer group one consumer is responsible for consuming that queue.
On the Broker, replicas aside, there is only one copy of the message. Each consumer group keeps its own offset, i.e. its consumption position, to mark how far it has consumed; messages before the offset have already been consumed. The offset is kept per queue: each consumer group maintains an offset for every queue of every Topic it subscribes to.
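A toy sketch of how independent the groups are, again assuming a Kafka Java client (topic-a, the group ids, and the broker address are placeholders): both groups subscribe to the same topic, each receives every message, and each tracks its own offsets.

import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class GroupDemo {
    static KafkaConsumer<String, String> newConsumer(String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", groupId);   // offsets are tracked per group, per queue/partition
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        return new KafkaConsumer<>(props);
    }

    public static void main(String[] args) {
        KafkaConsumer<String, String> group1 = newConsumer("group-1");
        KafkaConsumer<String, String> group2 = newConsumer("group-2");
        group1.subscribe(Collections.singletonList("topic-a"));
        group2.subscribe(Collections.singletonList("topic-a"));
        while (true) {
            // each group independently sees the same messages from topic-a
            int n1 = group1.poll(Duration.ofMillis(200)).count();
            int n2 = group2.poll(Duration.ofMillis(200)).count();
            if (n1 + n2 > 0) {
                System.out.println("group-1 received " + n1 + ", group-2 received " + n2);
            }
        }
    }
}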
Now that we are familiar with the common terms and concepts of message queues, let's look at the core problems every message queue has to address.
How to ensure that messages are not lost
For the mainstream message queues on the market, as long as they are configured and used correctly, our messages will not be lost.
A message goes through three phases: it is produced, stored, and consumed. Let's go through each phase and see how to make sure the message is not lost.
Producing the message
When a producer sends a message to the Broker, it must handle the Broker's response. Whether the send is synchronous or asynchronous, wrap it (and the asynchronous callback) in proper try-catch and deal with the response: if the Broker returns an error such as a write failure, retry the send, and if it keeps failing, raise an alarm, write a log, and so on.
This ensures that messages are not lost during the production phase.
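A minimal sketch of the producer side, assuming a Kafka Java client (topic name and values are placeholders): handle the Broker's response in a callback, let the client retry transient failures, and alert when a send ultimately fails.

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class ReliableProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "all");      // wait for the broker (and in-sync replicas) to acknowledge
        props.put("retries", 5);       // client-side retries on transient failures

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            ProducerRecord<String, String> record = new ProducerRecord<>("order-created", "order-1001", "payload");
            producer.send(record, (metadata, exception) -> {
                if (exception != null) {
                    // retries exhausted: log and raise an alarm instead of silently dropping the message
                    System.err.println("send failed, alerting: " + exception.getMessage());
                } else {
                    System.out.println("stored at offset " + metadata.offset());
                }
            });
        }
    }
}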
Storing messages
In the storage phase, the Broker should respond to the producer only after the message has been flushed to disk. If it responds as soon as the message lands in the cache and the machine then suddenly loses power, the producer will wrongly believe the send succeeded.
If the Broker is deployed as a cluster with multiple replicas, the message must be written not only to the current Broker but also to its replicas, i.e. configured so that the Broker responds to the producer only after the message has been written to at least two machines. That basically guarantees reliable storage: if one dies, another still has the data (and if you're afraid both will die... add a few more).
What if every machine in one data center goes down? Emmmmmm... big companies generally run active-active deployments across multiple sites.
What if there is an earthquake at all of those sites? Emmmmmm... at that point you'd better worry about the people first.
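Back to the configuration side: a sketch of the storage guarantee described above, assuming Kafka (the topic name and numbers are illustrative): create the topic with three replicas and require at least two in-sync replicas, so that together with acks=all on the producer a write is acknowledged only once two copies exist.

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;

public class TopicSetup {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            NewTopic topic = new NewTopic("order-created", 5, (short) 3)        // 5 queues, 3 replicas
                    .configs(Map.of("min.insync.replicas", "2"));               // ack only after 2 copies exist
            admin.createTopics(Collections.singletonList(topic)).all().get();   // wait for the broker to confirm
        }
    }
}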
Consuming messages
People often get this wrong: the consumer takes the message, puts it into an in-memory queue, and immediately tells the Broker that consumption succeeded. That is wrong.
You have to consider what happens if the consumer crashes right after the message lands in memory. The acknowledgment should go to the Broker only after the consumer has actually executed the business logic; only then has the message truly been consumed.
So as long as we respond to the Broker only after the business logic for the message has completed, messages will not be lost in the consumption phase.
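A sketch of the consumer side, assuming a Kafka Java client (group id, topic, and the addPoints helper are placeholders): turn off auto-commit and acknowledge, i.e. commit the offset, only after the business logic has finished.

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class ReliableConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "points-service");
        props.put("enable.auto.commit", "false");   // do NOT ack before the work is done
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("order-created"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    addPoints(record.key());        // the real business logic runs first
                }
                if (!records.isEmpty()) {
                    consumer.commitSync();          // only now tell the Broker these messages are consumed
                }
            }
        }
    }

    private static void addPoints(String orderId) { /* ... */ }
}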
A brief summary
As you can see, keeping messages reliable requires cooperation from all three parties.
The producer must handle the Broker's response and fall back to retries, alarms, and similar measures when errors occur.
The Broker must control when it responds: on a single machine, respond only after the message has been flushed to disk; in a multi-replica cluster, respond only after the message has reached two or more replicas.
The consumer must respond to the Broker only after executing the real business logic.
Note, however, that stronger reliability means lower performance: waiting for flushes and for multi-replica synchronization before responding hurts throughput. So it depends on the business. Losing one or two log messages in transit, for example, usually doesn't matter, so there is no need to wait for a flush before responding.
How to handle duplicate messages
First, let's see whether duplicate messages can be avoided at all.
If we fire off a message and ignore the Broker's response entirely, then we will never send it to the Broker twice.
But normally that is unacceptable, because the message becomes completely unreliable. Our baseline requirement is that the message at least reaches the Broker, which means we must wait for the Broker's response. And then it can happen that the Broker has already written the message but the response is lost on the network, so the producer sends it again, and now we have a duplicate.
As you can see, duplicate messages are unavoidable in normal operation, so we can only attack the problem from a different angle.
The key word is idempotence. Since we cannot prevent duplicate messages from occurring, we can only eliminate their impact on the business.
Handling duplicate messages idempotently
Idempotence is originally a mathematical concept; for us it means that calling the same interface multiple times with the same parameters produces the same result as calling it once.
For example, take this SQL: update T1 set money = 150 where id = 1 and money = 100; no matter how many times it is executed, money ends up as 150. That is idempotence.
So we adjust the business processing logic so that the final result is unaffected even when duplicate messages arrive.
You can use a precondition check like the SQL above, i.e. only update when money = 100. A more general variant is versioning: carry a version number in the message and compare it with the version number in the database before applying the update.
Or rely on database constraints such as a unique key, for example INSERT INTO ... ON DUPLICATE KEY UPDATE ....
Or record a key, such as the order ID when processing orders: when a message arrives, first check whether that ID has already been handled, and only continue if it has not. A globally unique message ID works the same way.
Those are basically the main routines; how to apply them in practice depends on the specific business details.
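Here is a minimal sketch of the "record the key" approach, assuming JDBC against a MySQL-style database (the connection string, table, and column names are made up, and msg_id carries a unique key): insert the ID first and skip the business logic if the row already exists. In a real system the dedup insert and the business logic would usually run in the same transaction.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class IdempotentHandler {
    public void handle(String orderId) throws SQLException {
        try (Connection conn = DriverManager.getConnection("jdbc:mysql://localhost:3306/shop", "user", "pass");
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT IGNORE INTO processed_messages (msg_id) VALUES (?)")) {  // msg_id is a unique key
            ps.setString(1, orderId);
            int inserted = ps.executeUpdate();
            if (inserted == 0) {
                return;                 // duplicate message: already processed, do nothing
            }
            processOrder(orderId);      // first time we see this ID: run the real business logic
        }
    }

    private void processOrder(String orderId) { /* ... */ }
}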
How to ensure the order of messages
Ordering requirements come in two kinds: global ordering and partial ordering.
Global ordering
To guarantee global ordering, only one producer may send messages to the Topic, the Topic may contain only one queue (partition), and the consumer must consume that queue single-threaded. Only then are the messages globally ordered.
In practice, though, we rarely need global ordering; even syncing a MySQL Binlog only requires the messages for a single table to stay in order.
Partial ordering
So most ordering requirements are partial: split the Topic into however many queues you need, use a fixed strategy to send related messages to a specific queue, and have each queue consumed by a single-threaded consumer. That satisfies the partial-ordering requirement while the multiple queues still provide concurrency and throughput.
Whether there are multiple producers or just one makes no difference, as long as messages of the same kind are always sent to their designated queue.
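A sketch of partial ordering, assuming a Kafka Java client (the topic and order ID are placeholders): use the order ID as the message key so that every message for the same order is routed to the same queue/partition and keeps its relative order.

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class OrderedSender {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("enable.idempotence", "true");   // keeps retried sends from reordering messages within a partition

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            String orderId = "order-1001";
            // same key -> same partition -> these three events keep their relative order
            producer.send(new ProducerRecord<>("order-events", orderId, "created"));
            producer.send(new ProducerRecord<>("order-events", orderId, "paid"));
            producer.send(new ProducerRecord<>("order-events", orderId, "shipped"));
        }
    }
}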
How to deal with message accumulation
Messages pile up when producers produce faster than consumers consume. The cause may be consumption that keeps failing and being retried, or simply consumers with too little processing capacity, and the backlog gradually grows.
So first locate why consumption is slow. If it is a bug, fix the bug. If the consumers are simply underpowered, optimize the consumption logic: for example, instead of fetching and processing one message at a time, process them in batches; inserting rows into a database one at a time is nowhere near as efficient as a batch insert.
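A sketch of that batching idea, assuming a Kafka Java client plus a JDBC batch insert (the table name and SQL are made up, and offset auto-commit is assumed to be disabled): store a whole poll() batch in one round trip instead of one insert per message.

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.time.Duration;

public class BatchConsumer {
    static void consumeBatch(KafkaConsumer<String, String> consumer, Connection conn) throws Exception {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
        if (records.isEmpty()) {
            return;
        }
        try (PreparedStatement ps = conn.prepareStatement("INSERT INTO events (payload) VALUES (?)")) {
            for (ConsumerRecord<String, String> record : records) {
                ps.setString(1, record.value());
                ps.addBatch();              // accumulate the whole batch
            }
            ps.executeBatch();              // one round trip instead of one insert per message
        }
        consumer.commitSync();              // ack only after the batch is safely stored
    }
}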
If the logic is already optimized and it is still too slow, consider horizontal scaling: increase both the number of queues in the Topic and the number of consumers. Note that the number of queues must grow as well, otherwise the newly added consumers will have nothing to consume, because within a Topic each queue is assigned to only one consumer.
Whether each consumer is single-threaded or multi-threaded depends on the scenario. But keep in mind the message-loss issue mentioned above: if you write received messages into an in-memory queue, immediately acknowledge to the Broker, and then consume from the in-memory queue with multiple threads, the messages still sitting in memory will be lost if the consumer goes down.
That wraps up the knowledge points related to message queues. Thanks for reading, and I hope it helps you in practice!