2025-04-04 Update From: SLTechnology News&Howtos shulou
Shulou(Shulou.com)06/02 Report--
This article walks through common interview questions about Java message middleware (MQ): why to use it, what its advantages and disadvantages are, how to keep messages from being lost, and how to make it highly available.
Why use MQ?
You have probably heard the saying: a good architecture is not designed, it evolves.
The same applies to introducing MQ. There should be a concrete problem that MQ solves; it should not be adopted just because someone else uses it.
MQ is used in many scenarios, but three are at the core:
Asynchronous processing, decoupling, and peak cutting and valley filling.
Async
Consider a concrete case: system A receives a request, executes SQL to write to its own local database, and then calls the interfaces of three other systems, B, C, and D.
Suppose the local write takes 3 ms, and the calls to B, C, and D take 300 ms, 450 ms, and 200 ms respectively.
The total latency of the request is then 3 + 300 + 450 + 200 = 953 ms, close to 1 s, which users may find too slow.
At this point, the whole system looks like this:
With MQ, system A only needs to send three messages to three queues in MQ and can then return to the user right away.
Suppose sending to MQ takes 20 ms in total. The latency the user perceives is then only 20 + 3 = 23 ms, which is barely noticeable.
At this point, the whole system structure looks like this:
As you can see, making the downstream calls asynchronous through MQ greatly improves the response time of the interface.
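The arithmetic above can be written down as a small sketch. The class and method names are hypothetical, and it assumes, as the text does, that the downstream calls run one after another and that the hand-off to MQ costs 20 ms in total.

```java
// Hypothetical helper that compares the user-visible latency of the two designs.
class LatencyModel {
    /** Synchronous design: the local write and every downstream call run back to back. */
    static long synchronousLatencyMs(long localWriteMs, long... downstreamCallMs) {
        long total = localWriteMs;
        for (long call : downstreamCallMs) {
            total += call;
        }
        return total;
    }

    /** Asynchronous design: the user only waits for the local write and the hand-off to MQ. */
    static long asynchronousLatencyMs(long localWriteMs, long mqSendMs) {
        return localWriteMs + mqSendMs;
    }
}
```

With the numbers from the example, `LatencyModel.synchronousLatencyMs(3, 300, 450, 200)` gives 953 and `LatencyModel.asynchronousLatencyMs(3, 20)` gives 23.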
Decoupling
Suppose that when some operation occurs, system A needs to push the data submitted by the user to systems B and C.
The developer in charge of system A thinks: no problem. B and C each give me an HTTP or RPC interface, and I just push the data over. The developer is quite pleased with this.
As shown in the following figure:
Everything looks fine, but as the business iterates rapidly, system D wants the data too. So the developer of system A changes the code to send to D as well as to B and C.
As time goes on, though, the trouble keeps growing.
System A ends up sending not just this one piece of data to B, C, and D, but a second and a third as well. Sometimes further systems such as E and F are added because they want the data too.
And sometimes system B suddenly no longer wants the data, so system A has to change again. The developer of system A is getting a headache.
It gets more complicated still: when data is pushed to other systems through their interfaces, failure cases such as timeouts and retries also have to be handled.
Take a look at the picture below to experience this helpless scene:
This is where MQ comes in.
MQ is a good fit for decoupling here: the developer of system A only needs to publish the message to MQ, and the other systems subscribe to it as needed.
Even if a system no longer needs the data, system A does not have to change any code.
The following figure shows how much cleaner things look once MQ decouples the systems.
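To make the decoupling concrete, here is a minimal in-memory sketch of publish and subscribe. `InMemoryTopic` and `DecouplingDemo` are hypothetical names, not part of any real MQ client; the point is only that the publishing code stays the same while subscribers come and go.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// In-memory stand-in for an MQ topic (hypothetical, not a real MQ client API).
class InMemoryTopic {
    private final Map<String, Consumer<String>> subscribers = new LinkedHashMap<>();

    /** A downstream system registers interest in the data. */
    void subscribe(String systemName, Consumer<String> handler) {
        subscribers.put(systemName, handler);
    }

    /** A downstream system stops consuming; the publisher is not touched. */
    void unsubscribe(String systemName) {
        subscribers.remove(systemName);
    }

    /** System A publishes once and does not know who is listening. */
    void publish(String message) {
        for (Consumer<String> handler : new ArrayList<>(subscribers.values())) {
            handler.accept(message);
        }
    }
}

class DecouplingDemo {
    /** Returns the delivery log so the effect of subscribing and unsubscribing is visible. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        InMemoryTopic topic = new InMemoryTopic();
        topic.subscribe("B", m -> log.add("B got " + m));
        topic.subscribe("C", m -> log.add("C got " + m));
        topic.publish("order-1");                          // delivered to B and C

        topic.subscribe("D", m -> log.add("D got " + m));  // D joins later
        topic.unsubscribe("B");                            // B no longer wants the data
        topic.publish("order-2");                          // delivered to C and D
        return log;
    }
}
```

Both `publish` calls are identical: adding D and dropping B required no change on the publishing side.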
Peak cutting and valley filling
Take an order system that writes to the database whenever an order is placed. Suppose the database can only sustain about 1000 concurrent writes per second; anything higher can easily bring it down.
In off-peak periods the concurrency is only a little over 100, but at the peak it suddenly surges to more than 5000, and the database is bound to go down.
As shown in the following picture, feel the despair of the database being beaten to death:
But after using MQ, the situation changed!
MQ stores the messages, and the system consumes them at a rate it can handle, for example 1000 per second, writing them to the database gradually so that the database is not overwhelmed.
The whole process is shown in the following figure:
As for why this is called peak cutting and valley filling, take a look at this picture:
Without MQ, there is a "peak" when concurrency is highest, followed by a "valley" of low concurrency.
With MQ, messages are consumed at no more than 1000 per second, so the data produced during the peak period accumulates in MQ and the peak is "cut" off.
Because of the backlog, consumption stays at 1000 QPS for a while after the peak period, until the backlog has been consumed. This is "filling the valley".
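The effect can be simulated in a few lines. This is a rough sketch with hypothetical names; it assumes a single consumer with a fixed cap per second and an unbounded queue.

```java
// Toy simulation of peak cutting and valley filling (hypothetical names).
class PeakShavingModel {
    /**
     * arrivals[i] is the number of messages produced in second i (0 after the array ends).
     * The consumer drains at most maxPerSecond messages per second.
     * Returns how many messages reach the database in each simulated second.
     */
    static long[] consumedPerSecond(long[] arrivals, long maxPerSecond, int seconds) {
        long[] consumed = new long[seconds];
        long backlog = 0;
        for (int s = 0; s < seconds; s++) {
            long arrived = s < arrivals.length ? arrivals[s] : 0;
            backlog += arrived;                              // new messages pile up in MQ
            consumed[s] = Math.min(backlog, maxPerSecond);   // the peak is cut at the cap
            backlog -= consumed[s];                          // the rest waits and fills the valley
        }
        return consumed;
    }
}
```

For arrivals of 100, 5000, 100, 100 and a cap of 1000 over eight seconds, the database sees 100, 1000, 1000, 1000, 1000, 1000, 200, 0: the burst of 5000 never reaches it directly, and consumption stays at 1000 until the backlog is gone.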
The analysis above shows why MQ is used and what its benefits are. Knowing the reasons, you also understand why your own system uses MQ.
Then, when someone asks why you use MQ, you will not have to give an awkward answer like "our team lead told us to use MQ, so we use it".
What are the advantages and disadvantages of using MQ?
The advantages were covered above, but what about the disadvantages? At first glance there do not seem to be any.
That would be a serious mistake. When designing a system you should know not only why you use something but also what its drawbacks are, so that you can plan for them with confidence.
Next, let's discuss the disadvantages of using MQ.
Reduced system availability
Think back to the decoupling scenario. System A used to send its key data directly to systems B and C, but now MQ sits in between and B and C receive the data through MQ.
But what happens if MQ goes down? This raises the question of whether adding MQ has reduced the availability of the system.
It has, because there is one more risk factor: MQ itself may fail. If MQ goes down and data is lost, the system no longer works correctly.
Increased complexity of the system
Originally the work could be done by calling an interface directly. After adding MQ, I also have to deal with duplicate consumption, message loss, and even message ordering.
Solving these problems requires introducing a number of complex mechanisms, so the complexity of the system increases.
Data consistency problem
Originally, system A calls the interfaces of systems B and C. If B or C fails, it throws an exception back to system A, so A knows and can roll back.
With MQ, system A considers the operation successful as soon as it has sent the message. If system C then fails to write to its database, A still believes C succeeded, and the data becomes inconsistent.
So MQ has many advantages, but its drawbacks have to be compensated for with additional system design.
In the end the whole system may become several times more complex, so the decision has to weigh all of this at design time. In many cases you will find that you still need MQ.
How to ensure that MQ messages are not lost?
Once you use MQ, you also have to worry about message loss. We use RabbitMQ to explain.
The producer loses data
When a producer sends data to RabbitMQ, the data may be lost in network transmission; if RabbitMQ never receives the message, the message is lost.
RabbitMQ provides two ways to solve this problem:
Transaction mode:
Before sending the message, the producer starts a transaction with `channel.txSelect` and then sends the message.
If RabbitMQ does not receive the message successfully, the producer gets an exception; it can then roll back the transaction with `channel.txRollback` and resend. If RabbitMQ does receive the message, the producer commits the transaction with `channel.txCommit`.
However, this lowers the producer's throughput and performance considerably, so it is not commonly done today.
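The transaction flow can be sketched as follows. `TxChannel` is a hypothetical stand-in that only mirrors the three method names mentioned above, and `FlakyChannel` is a fake used to exercise the retry path; the real RabbitMQ Java client methods take more parameters and throw `IOException`, whereas failures here are modeled as unchecked exceptions to keep the sketch short.

```java
// Hypothetical stand-in for a channel; it mirrors the three method names used above.
interface TxChannel {
    void txSelect();
    void publish(String message);
    void txCommit();
    void txRollback();
}

class TransactionalPublisher {
    /**
     * Sends one message inside a transaction and resends after a rollback.
     * Returns the number of attempts used; rethrows the last failure if every attempt fails.
     */
    static int send(TxChannel channel, String message, int maxAttempts) {
        RuntimeException lastFailure = new IllegalArgumentException("maxAttempts must be >= 1");
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                channel.txSelect();      // start the transaction
                channel.publish(message);
                channel.txCommit();      // synchronous: the producer waits here
                return attempt;
            } catch (RuntimeException e) {
                lastFailure = e;
                channel.txRollback();    // undo the failed attempt before resending
            }
        }
        throw lastFailure;
    }
}

// Fake channel for the demo: the first publish fails, later ones succeed.
class FlakyChannel implements TxChannel {
    int publishes, commits, rollbacks;

    public void txSelect() {}
    public void publish(String message) {
        if (++publishes == 1) throw new IllegalStateException("message not received");
    }
    public void txCommit() { commits++; }
    public void txRollback() { rollbacks++; }
}
```

Calling `TransactionalPublisher.send(new FlakyChannel(), "order-1", 3)` returns 2: the first attempt is rolled back and the second is committed. Every send waits for its commit, which is where the throughput cost comes from.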
The other way is the confirm mechanism:
Confirm mode is enabled on the producer side. Each message written is assigned a unique id, and once RabbitMQ has received it, RabbitMQ sends back an ack to tell the producer that the message is OK.
If RabbitMQ fails to process the message, it calls back the producer's nack handler, and the producer can resend.
The difference between the transaction mechanism and the confirm mechanism is that the transaction mechanism is synchronous: the producer blocks when it commits a transaction.
The confirm mechanism is asynchronous. After sending one message you can send the next right away; when RabbitMQ has received a message, it calls you back asynchronously to tell you so.
For this reason the confirm mechanism is generally used to avoid losing data on the producer side.
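The producer-side bookkeeping this implies can be sketched as below. `ConfirmTracker` is a hypothetical class, not RabbitMQ code; in the RabbitMQ Java client, confirm mode is switched on with `Channel.confirmSelect()` and the ack and nack callbacks are registered with `Channel.addConfirmListener`, and handlers like these would be called from those callbacks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Producer-side bookkeeping for the confirm mechanism (hypothetical names).
class ConfirmTracker {
    private final ConcurrentNavigableMap<Long, String> outstanding = new ConcurrentSkipListMap<>();
    private long nextId = 1;

    /** Assigns a unique id to the message and remembers it until the broker answers. */
    synchronized long track(String message) {
        long id = nextId++;
        outstanding.put(id, message);
        return id;
    }

    /** Broker acked: the message is safe, so forget it. */
    void onAck(long id) {
        outstanding.remove(id);
    }

    /** Broker nacked: hand the message back so the caller can resend it. */
    String onNack(long id) {
        return outstanding.remove(id);
    }

    /** Messages that were sent but have not been acked or nacked yet. */
    List<String> unconfirmed() {
        return new ArrayList<>(outstanding.values());
    }
}
```

The producer never blocks: it calls `track` and keeps sending, and the map shrinks as callbacks arrive. Whatever is still in `unconfirmed()` after a timeout is a candidate for resending.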
RabbitMQ loses data
A RabbitMQ cluster can also lose messages, as the official tutorial points out: by default, a message sent to RabbitMQ is not written to disk, so if RabbitMQ goes down the message is lost.
To solve this, RabbitMQ provides a persistence mechanism that writes messages to disk after they are received.
Then, even after an outage, the stored data is recovered automatically on restart, so the message is not lost.
Setting up persistence takes two steps:
The first is to declare the queue as durable when creating it. This makes RabbitMQ persist the queue's metadata, but not the messages in the queue.
The second is to set the message's deliveryMode to 2 when sending it, which marks the message as persistent so that RabbitMQ writes it to disk.
Someone might then ask: what if RabbitMQ goes down after receiving the message but before persisting it to disk?
This case is covered by combining persistence with the confirm mechanism above: the ack is sent to the producer only after the message has been persisted to disk.
If that extreme case does occur, the producer notices the missing ack and can retry, sending the message to another RabbitMQ node.
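Why both steps are needed can be shown with a toy model. `ToyBroker` is hypothetical and keeps its "disk" in a map; it only encodes the rule described above. In the RabbitMQ Java client, the two settings correspond to the `durable` argument of `Channel.queueDeclare` and to a delivery mode of 2 in the message properties (for example the predefined `MessageProperties.PERSISTENT_TEXT_PLAIN`).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of what survives a broker restart (hypothetical, not RabbitMQ code).
class ToyBroker {
    static final int PERSISTENT = 2; // the deliveryMode value named in the text

    private final Map<String, Boolean> durable = new HashMap<>();
    private final Map<String, List<String>> memory = new HashMap<>();
    private final Map<String, List<String>> disk = new HashMap<>();

    /** Step one: a durable queue keeps its definition across restarts. */
    void declareQueue(String queue, boolean isDurable) {
        durable.put(queue, isDurable);
        memory.put(queue, new ArrayList<>());
        if (isDurable) disk.put(queue, new ArrayList<>());
    }

    /**
     * Step two: only persistent messages in a durable queue are written to disk.
     * The queue must have been declared first. Returns true if the message was
     * persisted, which is the point at which an ack would be safe to send.
     */
    boolean publish(String queue, String body, int deliveryMode) {
        memory.get(queue).add(body);
        boolean persisted = durable.get(queue) && deliveryMode == PERSISTENT;
        if (persisted) disk.get(queue).add(body);
        return persisted;
    }

    /** Simulates an outage: memory is wiped and only what is on disk comes back. */
    void crashAndRestart() {
        memory.clear();
        durable.values().removeIf(isDurable -> !isDurable);
        for (Map.Entry<String, List<String>> entry : disk.entrySet()) {
            memory.put(entry.getKey(), new ArrayList<>(entry.getValue()));
        }
    }

    List<String> messages(String queue) {
        return memory.getOrDefault(queue, Collections.emptyList());
    }
}
```

In this model a message survives `crashAndRestart()` only when the queue is durable and the message was published with delivery mode 2; dropping either step loses it.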
The consumer loses data
A RabbitMQ consumer loses data like this: the consumer has just received a message when its process dies. RabbitMQ believes the message was consumed successfully, so the data is lost.
To see how to solve this, first look at how RabbitMQ consumption works: when a consumer receives a message, it sends an ack to RabbitMQ to say the message has been consumed, and RabbitMQ then deletes the message.
In automatic acknowledgement mode, though, the ack is sent as soon as the consumer receives the message, before any processing has happened, which is what causes the loss.
The solution is to turn off automatic ack on the consumer and send the ack manually once the consumer has finished processing the message.
Then, if the consumer dies mid-processing, RabbitMQ does not delete the message and redelivers it once your program has restarted.
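The difference between the two acknowledgement modes can be sketched with a toy queue. `AckingQueue` is hypothetical and models a consumer crash as an exception thrown by the handler; with the RabbitMQ Java client, the corresponding setup is `Channel.basicConsume` with `autoAck` set to `false`, followed by `Channel.basicAck` once processing has finished.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

// Toy queue that models automatic versus manual ack (hypothetical, not RabbitMQ code).
class AckingQueue {
    private final Deque<String> ready = new ArrayDeque<>();

    void enqueue(String message) {
        ready.addLast(message);
    }

    int size() {
        return ready.size();
    }

    /**
     * Delivers the head message to the handler.
     * autoAck = true: the message is deleted as soon as it is handed over.
     * autoAck = false: the message is deleted only after the handler returns normally.
     * Returns true if the handler finished without throwing.
     */
    boolean deliver(Consumer<String> handler, boolean autoAck) {
        String message = ready.peekFirst();
        if (message == null) {
            return false;
        }
        if (autoAck) {
            ready.pollFirst();           // acked on receipt, before any processing
        }
        try {
            handler.accept(message);
        } catch (RuntimeException crash) {
            return false;                // consumer died; with manual ack the message is still queued
        }
        if (!autoAck) {
            ready.pollFirst();           // manual ack after successful processing
        }
        return true;
    }
}
```

If the handler throws under `autoAck = true`, the queue is already empty and the message is gone. Under `autoAck = false` the message stays at the head of the queue and the next `deliver` call hands it out again.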
How to ensure the high availability of MQ?
Once MQ is in use, we certainly want it to be highly available; being unable to send or receive messages because one machine is down is unacceptable.
We again use RabbitMQ, a classic MQ, to explain:
RabbitMQ is representative because its high availability is based on a master-slave design, so we take it as the example of how to make MQ highly available.
RabbitMQ has three modes: stand-alone mode, ordinary cluster mode, and mirror cluster mode.
Stand-alone mode
Stand-alone mode is demo-level: RabbitMQ is deployed on a single machine.
This is a single point of failure. If the machine goes down, everything stops, so there is no high availability to speak of. You would typically only run it locally to experiment; nobody uses stand-alone mode in production.
Ordinary cluster mode
In this mode, multiple RabbitMQ instances are started on multiple machines, somewhat like a master-slave setup.
However, a queue you create lives on only one RabbitMQ instance (its master). The other instances synchronize only the queue's metadata, not its messages.
When consuming, if the instance you connect to is not the one that stores the queue's data, that instance pulls the data from the instance that does and returns it to the client.
This approach is rather awkward and is not truly distributed. If consumers connect to an instance that does not store the queue's data, every pull incurs extra overhead between instances. If they always connect to the instance that stores the queue, that single instance becomes a performance bottleneck.
If the instance holding the queue goes down, the other instances cannot pull its data and the cluster cannot consume those messages, so this is not real high availability.
So this mode offers no real high availability. Its main purpose is to improve throughput, by letting multiple nodes in the cluster serve the reads and writes of a queue.
Mirror cluster mode
Mirror cluster mode is RabbitMQ's real high availability mode. Unlike ordinary cluster mode, both the metadata and the messages of a queue you create exist on multiple instances.
Every time you write a message to the queue, it is automatically synchronized to the queue's copies on the other instances.
If any one machine goes down, the other instances can keep serving, which makes the setup truly highly available.
But there are also disadvantages:
High performance overhead: every message has to be synchronized to all machines, which puts heavy pressure on network bandwidth.
Low scalability: when a queue holds a particularly large amount of data, the queue cannot be scaled out linearly.
Even if you add a machine, that machine also holds all of the queue's data, so the data is not distributed.
The usual way to make RabbitMQ highly available is to enable mirror cluster mode, which at least achieves high availability: when one node goes down, the other nodes continue to provide service.
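The trade-off between the two cluster modes can be summarized in a toy model. `ClusterModel` is hypothetical and tracks only which nodes hold a queue's messages; it assumes a mirrored queue is copied to every node, as described above.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of where a queue's messages live in each cluster mode (hypothetical).
class ClusterModel {
    private final Set<String> liveNodes = new HashSet<>();
    private final Set<String> nodesHoldingQueueData = new HashSet<>();

    /** Ordinary cluster: only ownerNode stores the messages. Mirror cluster: every node does. */
    ClusterModel(Set<String> nodes, String ownerNode, boolean mirrored) {
        liveNodes.addAll(nodes);
        if (mirrored) {
            nodesHoldingQueueData.addAll(nodes);
        } else {
            nodesHoldingQueueData.add(ownerNode);
        }
    }

    void nodeDown(String node) {
        liveNodes.remove(node);
    }

    /** The queue can be served while at least one live node holds its messages. */
    boolean queueAvailable() {
        for (String node : nodesHoldingQueueData) {
            if (liveNodes.contains(node)) {
                return true;
            }
        }
        return false;
    }

    /** Total message copies stored across the cluster: the cost of mirroring. */
    long storedCopies(long messages) {
        return messages * nodesHoldingQueueData.size();
    }
}
```

With three nodes, losing the owner makes the queue unavailable in ordinary cluster mode, while the mirrored queue stays available. The price is that 1000 messages occupy 3000 stored copies instead of 1000, and adding a fourth node raises that to 4000 rather than spreading the data out.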
By now you should have a deeper understanding of these Java message middleware interview questions. Try them out in practice.