How to do it, RabbitMQ?

2026-02-11 Update. From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article describes how vivo runs RabbitMQ in practice: the problems found in production, the overall architecture built around the open-source broker, and the high-availability deployment.

I. Background

vivo introduced RabbitMQ in 2016 and extended the open-source version to provide message middleware services to its business teams.

From 2016 to 2018, all businesses shared a single cluster. As the business grew, the cluster load became heavier and cluster failures occurred frequently.

In 2019, RabbitMQ entered a high-availability build-out phase, which delivered the high-availability component MQ name service (MQ-NameServer) and same-city active-active RabbitMQ clusters.

At the same time, the clusters used by different businesses were physically separated, and clusters were assigned and dynamically adjusted strictly according to cluster load and business traffic.

Since the high-availability build-out in 2019, business traffic has grown tenfold and the clusters have had no serious failures.

RabbitMQ is open-source message broker software that implements AMQP, a protocol that originated in the financial industry.

It has a rich set of features:

Message reliability is guaranteed. RabbitMQ ensures reliable delivery to the broker through publisher confirms, reliability inside the cluster through clustering, message persistence and mirrored queues, and reliable consumption through consumer acknowledgements.

RabbitMQ provides clients in multiple languages.

Several exchange types are provided. Messages are sent to the cluster and routed to specific queues through an exchange.

RabbitMQ provides a full-featured management console and management API, so it can be integrated quickly with a self-built monitoring system through that API.

Problems found when using RabbitMQ in practice:

To ensure high business availability, multiple clusters are used for physical isolation, but there is no unified platform for managing them.

Native RabbitMQ clients connect using cluster addresses. With multiple clusters, each business has to keep track of cluster addresses, which causes confusion.

Native RabbitMQ offers only simple user name/password authentication and does not authenticate the business application itself. Different businesses can easily mix up exchange/queue information, causing application errors.

Many business applications use RabbitMQ, but no platform maintains information about which applications send and consume each message. After several version iterations, the counterparty can no longer be identified.

Clients have no rate limiting, so sudden abnormal business traffic can impact or even bring down a cluster.

The client has no policy for resending messages after an exception; users must implement it themselves.

When a cluster blocks because its memory limit is exceeded, traffic cannot be moved quickly and automatically to another available cluster.

With mirrored queues, the master of each queue lives on one specific node. When the cluster has many queues, node load easily becomes unbalanced.

RabbitMQ cannot rebalance queues automatically, so uneven node load is likely when there are many queues.

II. Overall architecture

1. MQ-Portal: supports application usage requests

Previously, when a business team requested RabbitMQ, details such as the expected traffic and the connecting applications were recorded in offline forms. These records were scattered and not updated in time, so it was impossible to know accurately how the business was really using RabbitMQ. A visual, platform-based request process was therefore introduced to build up metadata about application usage.

Through the MQ-Portal request process (as shown in the figure above), the sending application, the consuming application, the exchange/queue to use, the sending traffic and other details are specified and then submitted to vivo's internal ticket workflow for approval.

After the ticket is approved, a callback from the ticket API assigns the cluster the application will use and creates the exchange/queue and their binding on that cluster.

Because the production environment uses multiple physically isolated clusters to ensure high availability, the cluster in use cannot be located simply from the name of an exchange/queue.

Each exchange/queue is associated with its cluster through a unique pair of rmq.topic.key and rmq.secret.key, so the specific cluster can be located while the SDK starts up.

rmq.topic.key and rmq.secret.key are assigned in the ticket callback API.

2. Overview of client SDK capabilities

The client SDK is built on spring-message and spring-rabbit and adds application authentication, cluster addressing, client-side rate limiting, production and consumption reset, blocking transfer and other capabilities.

1) Application usage authentication

Open-source RabbitMQ decides whether a client may connect to the cluster based only on user name and password; it does not verify whether the application is allowed to use a given exchange/queue.

To prevent different services from mixing up exchanges/queues, application usage must be authenticated.

Application authentication is performed jointly by the SDK and MQ-NameServer.

When the application starts, the SDK first reports the application's configured rmq.topic.key to MQ-NameServer, which checks whether the application matches the one named in the approved request. The check is repeated whenever the SDK sends a message.

    /**
     * Pre-send check that returns the producer factory that should really send,
     * so that a business application can declare multiple producers but send
     * every message through any one of the beans without causing an exception.
     *
     * @param exchange the exchange to check
     * @return the producer factory to send with
     */
    public AbstractMessageProducerFactory beforeSend(String exchange) {
        if (closed || stopped) {
            // The context has been closed: throw to prevent further sending
            // and to reduce data sent in a critical state.
            throw new RmqRuntimeException(String.format(
                    "producer sending message to exchange %s has closed, can't send message",
                    this.getExchange()));
        }
        if (exchange.equals(this.exchange)) {
            return this;
        }
        if (!VIVO_RMQ_AUTH.isAuth(exchange)) {
            throw new VivoRmqUnAuthException(
                    "send topic check exception, do not send data to unauthorized exchange %s, send failed",
                    exchange);
        }
        // Return the bean that really owns this exchange to avoid sending to the wrong one.
        return PRODUCERS.get(exchange);
    }

2) Cluster addressing

As mentioned earlier, clusters are allocated to applications strictly according to cluster load and business traffic, so the different exchanges/queues used by one application may be allocated to different clusters.

To improve development efficiency, the business must be shielded from the existence of multiple clusters, so the SDK automatically addresses the cluster based on the rmq.topic.key configured by the application.
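The lookup described above can be pictured with a small in-memory sketch. It is not the real SDK or MQ-NameServer code: the class ClusterAddressingSketch, its methods and the sample keys and addresses are all hypothetical, and the real service presumably also validates rmq.secret.key.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the lookup a name service performs; not the real SDK.
class ClusterAddressingSketch {
    // rmq.topic.key -> broker addresses of the cluster hosting that exchange/queue
    private final Map<String, String> clusterByTopicKey = new ConcurrentHashMap<>();

    // Records which cluster a key was assigned to (done when the request is approved).
    void register(String topicKey, String clusterAddresses) {
        clusterByTopicKey.put(topicKey, clusterAddresses);
    }

    // Resolves the cluster for a configured key; empty if the key is unknown.
    Optional<String> resolve(String topicKey) {
        return Optional.ofNullable(clusterByTopicKey.get(topicKey));
    }

    public static void main(String[] args) {
        ClusterAddressingSketch nameServer = new ClusterAddressingSketch();
        nameServer.register("topic-key-a", "10.0.0.1:5672,10.0.0.2:5672");
        nameServer.register("topic-key-b", "10.0.1.1:5672,10.0.1.2:5672");
        // The application only configures its key; the address is looked up at startup.
        System.out.println(nameServer.resolve("topic-key-a").orElse("unknown key"));
    }
}
```

The point of the design is that business code never contains a broker address, so an exchange/queue can be moved to another cluster by changing only the mapping.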

3) Client-side rate limiting

The native client does not limit send traffic, so an application that keeps sending messages at a high rate can overwhelm the MQ cluster. Because a cluster is usually shared by several applications, damage caused by one application affects every application that uses that cluster.

The SDK therefore needs client-side rate limiting. When necessary, an application's sending to the cluster can be restricted to keep the cluster stable.
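One common way to implement such a limit is a token bucket. The article does not say which algorithm the SDK uses, so the following is only a minimal sketch under that assumption; the class name SendRateLimiterSketch and its parameters are hypothetical. The current time is passed in explicitly so the behaviour is easy to test.

```java
// Minimal token-bucket limiter; an assumed approach, not the SDK's actual algorithm.
class SendRateLimiterSketch {
    private final long capacity;        // maximum burst size, in messages
    private final double refillPerNano; // tokens added per nanosecond
    private double tokens;
    private long lastRefillNanos;

    SendRateLimiterSketch(long permitsPerSecond, long nowNanos) {
        if (permitsPerSecond <= 0) {
            throw new IllegalArgumentException("permitsPerSecond must be positive");
        }
        this.capacity = permitsPerSecond;
        this.refillPerNano = permitsPerSecond / 1_000_000_000.0;
        this.tokens = permitsPerSecond; // start with a full bucket
        this.lastRefillNanos = nowNanos;
    }

    // Returns true if one message may be sent at time nowNanos, false otherwise.
    synchronized boolean tryAcquire(long nowNanos) {
        if (nowNanos > lastRefillNanos) {
            // Refill in proportion to the elapsed time, never above the capacity.
            tokens = Math.min(capacity, tokens + (nowNanos - lastRefillNanos) * refillPerNano);
            lastRefillNanos = nowNanos;
        }
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

A producer would call limiter.tryAcquire(System.nanoTime()) before each send and reject, delay or queue the message when it returns false.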

4) Production and consumption reset

① As the business grows, cluster load keeps increasing, so the businesses on a cluster need to be split across clusters. To avoid restarting business applications during the split, a production and consumption reset function is needed.

② A cluster exception may cause consumers to go offline. Resetting production and consumption quickly brings business consumption back up.

Resetting production and consumption involves the following steps:

Reset the connection parameters of the connection factory.

Reset the connection.

Create a new connection.

Restart production and consumption.

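    // Point a connection factory at the new cluster address.
    CachingConnectionFactory connectionFactory = new CachingConnectionFactory();
    connectionFactory.setAddresses(address);
    // Close the cached connection so that the next use opens a new one.
    connectionFactory.resetConnection();
    // Rebuild the admin and template on top of the reset factory.
    rabbitAdmin = new RabbitAdmin(connectionFactory);
    rabbitTemplate = new RabbitTemplate(connectionFactory);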

MQ-SDK also has a policy for resending messages after an exception, which prevents message delivery failures caused by a production reset.
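The article does not describe the resend policy itself. As one plausible shape for it, here is a minimal sketch of a bounded retry with exponential backoff; ResendPolicySketch, sendWithRetry and the parameters are hypothetical names, and the real MQ-SDK policy may differ.

```java
import java.util.function.Supplier;

// Bounded retry with exponential backoff; an assumption, not the MQ-SDK's real policy.
class ResendPolicySketch {
    // Calls send until it succeeds or maxAttempts is reached, then rethrows the last failure.
    static <T> T sendWithRetry(Supplier<T> send, int maxAttempts, long initialBackoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long backoff = initialBackoffMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return send.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break;
                }
                try {
                    Thread.sleep(backoff); // wait before resending
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to resend", ie);
                }
                backoff *= 2; // double the wait after every failed attempt
            }
        }
        throw last;
    }
}
```

During a production reset the first attempts may fail while connections are rebuilt; a retry of this kind lets the send succeed once the new connection is ready.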

5) Blocking transfer

RabbitMQ blocks message sending when memory usage exceeds the high watermark of 40%, or when free disk space falls below the configured limit.

Because the vivo middleware team has built same-city active-active RabbitMQ clusters, when sending to a cluster is blocked, production and consumption can be reset to the peer active-active cluster to move traffic away quickly.

6) Multi-cluster scheduling

As an application grows, a single cluster can no longer meet its traffic needs. Because all queues in the cluster are mirrored queues, a single cluster cannot be scaled horizontally for more traffic simply by adding nodes.

The SDK therefore needs to support multi-cluster scheduling, which meets high-traffic needs by distributing traffic across multiple clusters.
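How traffic is split between clusters is not specified in the article. A simple option is weighted round-robin, sketched below; MultiClusterSchedulerSketch and its methods are hypothetical names, and the weights are assumed to be configured once before sending starts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Weighted round-robin over clusters; an assumed strategy, not the SDK's documented one.
class MultiClusterSchedulerSketch {
    // Each cluster appears once per unit of weight. Fill this before calling next().
    private final List<String> slots = new ArrayList<>();
    private final AtomicLong counter = new AtomicLong();

    // Adds a cluster that should receive `weight` shares of the traffic.
    void addCluster(String clusterName, int weight) {
        for (int i = 0; i < weight; i++) {
            slots.add(clusterName);
        }
    }

    // Picks the cluster that should receive the next message.
    String next() {
        if (slots.isEmpty()) {
            throw new IllegalStateException("no clusters configured");
        }
        int index = (int) (counter.getAndIncrement() % slots.size());
        return slots.get(index);
    }
}
```

With weights 2 and 1, two thirds of the messages go to the first cluster and one third to the second.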

3. MQ-NameServer: supports fast failover for MQ-SDK

MQ-NameServer is a stateless service that ensures its high availability through cluster deployment. It is mainly used to solve the following problems:

Authenticating MQ-SDK at startup and locating the cluster an application uses.

Handling the periodic metrics reports from MQ-SDK (number of messages sent, number of messages consumed) and returning the currently available cluster address, so that the SDK reconnects to the correct address when a cluster is abnormal.

Controlling MQ-SDK to reset production and consumption.
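On the client side, the heartbeat response can drive the reset. The sketch below assumes the response carries a single address string and that a reset is triggered whenever it differs from the address in use; HeartbeatHandlerSketch and its methods are hypothetical, and the real protocol between MQ-SDK and MQ-NameServer is not shown in the article.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical client-side handling of a name service heartbeat response.
class HeartbeatHandlerSketch {
    private final AtomicReference<String> currentAddress;
    private final Consumer<String> resetAction; // e.g. rebuild connections on the new address

    HeartbeatHandlerSketch(String initialAddress, Consumer<String> resetAction) {
        this.currentAddress = new AtomicReference<>(Objects.requireNonNull(initialAddress));
        this.resetAction = Objects.requireNonNull(resetAction);
    }

    // Called with the available cluster address returned by each heartbeat.
    // Returns true if the address changed and a reset was triggered.
    boolean onHeartbeat(String availableAddress) {
        String previous = currentAddress.getAndSet(availableAddress);
        if (previous.equals(availableAddress)) {
            return false;
        }
        resetAction.accept(availableAddress);
        return true;
    }

    String currentAddress() {
        return currentAddress.get();
    }
}
```

The reset action would typically run the production and consumption reset against the new address.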

4. MQ-Server high availability deployment practice

RabbitMQ clusters use a same-city active-active deployment architecture and rely on the cluster addressing and fast failover capabilities provided by MQ-SDK and MQ-NameServer to ensure cluster availability.

1) Handling cluster split-brain

RabbitMQ officially offers three strategies for recovering from cluster split-brain (network partitions).

① ignore

Split-brain is ignored and not handled; manual intervention is needed to recover when it occurs. Because of the manual intervention, some messages may be lost. This mode is suitable when the network is very reliable.

② pause_minority

When a node finds that it can reach no more than half of the cluster's nodes (counting itself), it pauses automatically until it can again communicate with a majority. In extreme cases, all nodes in the cluster are paused, making the cluster unavailable.

③ autoheal

The nodes in the minority partition restart automatically. This strategy favors service availability over data reliability, because messages on the restarted nodes are lost.

Because all RabbitMQ clusters are deployed active-active in the same city, business traffic can be migrated automatically to the cluster in the peer data center even if a single cluster becomes abnormal, so the pause_minority strategy was chosen to avoid split-brain.

In 2018, network jitter caused cluster split-brain many times. After the split-brain recovery strategy was changed, the problem no longer occurred.
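The pause_minority decision described above reduces to a majority test. The following is a simplified model of that rule for a single node, not RabbitMQ's implementation; PauseMinoritySketch and shouldPause are hypothetical names.

```java
// Models the pause_minority decision for one node; a simplification of RabbitMQ's behaviour.
class PauseMinoritySketch {
    // reachableNodes counts the node itself plus every peer it can still contact.
    static boolean shouldPause(int reachableNodes, int clusterSize) {
        // A node keeps running only if its side of the partition is a strict majority.
        return reachableNodes * 2 <= clusterSize;
    }

    public static void main(String[] args) {
        System.out.println("1 of 3 reachable, pause: " + shouldPause(1, 3));
        System.out.println("2 of 3 reachable, pause: " + shouldPause(2, 3));
        System.out.println("2 of 4 reachable, pause: " + shouldPause(2, 4));
    }
}
```

The model shows two consequences: in a two-node cluster any partition pauses both nodes, and an even split of a four-node cluster pauses every node. This is why odd cluster sizes of at least three nodes suit this mode.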

2) Cluster high availability scheme

RabbitMQ is deployed as clusters, and because the split-brain recovery strategy is pause_minority, each cluster requires at least 3 nodes.

Deploying highly available clusters with 5 or 7 nodes and limiting the number of queues per cluster is recommended.

All queues in a cluster are mirrored queues, so messages are replicated and are not lost when a node fails.

Exchanges, queues and messages are all set to persistent so that messages are not lost when a node restarts abnormally.

Queues are set to lazy queues to reduce fluctuation in node memory usage.

3) Same-city active-active construction

Equivalent clusters are deployed in two data centers, and the two clusters are joined into a federated cluster through the Federation plugin.

Application machines preferentially connect to the MQ cluster in their own data center, to avoid application errors caused by jitter on the dedicated line between data centers.

The SDK obtains the latest available cluster information through the MQ-NameServer heartbeat and reconnects to the peer active-active cluster when an exception occurs, so that application functions recover quickly.
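The local-first rule with fallback to the peer data center can be sketched as follows. The Cluster type, its fields and the selector are hypothetical; availability is assumed to come from the latest name service heartbeat.

```java
import java.util.List;
import java.util.Optional;

// Local-first cluster choice with cross-data-center fallback; all names are hypothetical.
class LocalFirstSelectorSketch {
    static final class Cluster {
        final String address;
        final String dataCenter;
        final boolean available; // as last reported by the name service heartbeat

        Cluster(String address, String dataCenter, boolean available) {
            this.address = address;
            this.dataCenter = dataCenter;
            this.available = available;
        }
    }

    // Prefers an available cluster in the caller's data center, else any available peer.
    static Optional<Cluster> select(List<Cluster> clusters, String localDataCenter) {
        Cluster fallback = null;
        for (Cluster c : clusters) {
            if (!c.available) {
                continue;
            }
            if (c.dataCenter.equals(localDataCenter)) {
                return Optional.of(c);
            }
            if (fallback == null) {
                fallback = c;
            }
        }
        return Optional.ofNullable(fallback);
    }
}
```

An empty result means no cluster is available at all, which the caller has to treat as an outage rather than a failover.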

III. Challenges and prospects for the future

At present, the enhancements to RabbitMQ are mainly in MQ-SDK and MQ-NameServer, and the SDK implementation is fairly complex. Later, we hope to build a proxy layer for the message middleware, which would simplify the SDK and allow finer-grained management of business traffic.

