This article introduces RabbitMQ high-availability clusters and walks through how to build one: an ordinary cluster, a mirror cluster, HAProxy load balancing, and KeepAlived failover.
Contents
I. Cluster architecture overview
II. Building an ordinary cluster
2.1 Install RabbitMQ on each node
2.2 Add nodes to the cluster
2.3 Demonstrating the problem with an ordinary cluster in code
III. Mirror cluster
IV. Building the HAProxy environment
V. Building the KeepAlived environment
I. Cluster architecture overview
When the message-processing capacity of a single RabbitMQ server reaches a bottleneck, it can be scaled out with a RabbitMQ cluster to improve throughput. A RabbitMQ cluster is a logical grouping of one or more nodes. Each node in the cluster is a peer, and all nodes share users, virtual hosts, queues, exchanges, bindings, runtime parameters and other distributed state. A highly available, load-balanced RabbitMQ cluster architecture looks roughly like the following figure:
Notes on the architecture:
The bottom layer is the RabbitMQ cluster itself, which is an ordinary cluster when no HA mirroring is configured. The drawback of an ordinary cluster is that if one machine goes down, the queues hosted on that machine become unavailable (each queue's data lives on only one node), so high availability is not achieved.
The queues are therefore turned into mirrored queues, so that every node holds a complete copy of the data and the loss of some nodes does not affect usage; this achieves high availability. However, the RabbitMQ cluster itself does not do load balancing, which means that in a three-node cluster the load on each node may differ.
The HAProxy layer provides load balancing for the RabbitMQ cluster. A single HAProxy node is obviously not highly available, so two HAProxy instances are used; but two instances alone still cannot fail over automatically, i.e. if HAProxy1 goes down, clients have to be pointed at HAProxy2's IP manually.
So KeepAlived is added. It usually runs as one master and one backup; only the master provides service at any given time, and it exposes a virtual IP address (Virtual Internet Protocol Address, VIP), which also avoids exposing the real IPs. If the master fails, the backup automatically takes over the VIP and becomes the new master until the original master recovers.
The production architecture would be:
Machine 1: RabbitMQ1; machine 2: RabbitMQ2; machine 3: RabbitMQ3; machine 4: HAProxy + KeepAlived (master); machine 5: HAProxy + KeepAlived (backup).
Due to limited resources there are only three machines here, so the architecture is as follows:
172.16.2.84 (rabbit1): RabbitMQ1, HAProxy + KeepAlived (master)
172.16.2.85 (rabbit2): RabbitMQ2, HAProxy + KeepAlived (backup)
172.16.2.86 (rabbit3): RabbitMQ3
II. Building an ordinary cluster
2.1 Install RabbitMQ on each node
In the previous chapter RabbitMQ was installed with Docker. The cluster here does not use Docker, because a Docker installation involves a lot of port and volume mappings that are easy to get wrong. If you already have a RabbitMQ running in Docker, stop it first.
There are three machines: 172.16.2.84 (rabbit1), 172.16.2.85 (rabbit2) and 172.16.2.86 (rabbit3). To avoid confusion they are referred to as rabbit1, rabbit2 and rabbit3 below.
Perform the following installation on rabbit1, rabbit2 and rabbit3.
1) Modify the hostname before installing, because RabbitMQ cluster communication relies on hostnames.
On rabbit1:
hostnamectl set-hostname rabbit1 --static
On rabbit2:
hostnamectl set-hostname rabbit2 --static
On rabbit3:
hostnamectl set-hostname rabbit3 --static
Then do the following on rabbit1, rabbit2 and rabbit3 alike.
# modify hosts, because RabbitMQ nodes communicate by hostname
vi /etc/hosts
2) Append the IP-to-hostname mappings at the end of the file:
172.16.2.84 rabbit1
172.16.2.85 rabbit2
172.16.2.86 rabbit3
# restart the network
systemctl restart network
# or restart the machine
init 6
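Optionally, as a quick sanity check (a sketch, not part of the original steps), confirm each node can resolve the others by hostname:
# run from rabbit1, for example
ping -c 1 rabbit2
ping -c 1 rabbit3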
3) Install Erlang
# step 1: run the Erlang installation script provided by packagecloud
curl -s https://packagecloud.io/install/repositories/rabbitmq/erlang/script.rpm.sh | sudo bash
# step 2: install Erlang
yum install erlang
# step 3: check the Erlang version by entering erl directly on the command line
erl
4) Install RabbitMQ
# step 1: import the two signing keys first
rpm --import https://packagecloud.io/rabbitmq/rabbitmq-server/gpgkey
rpm --import https://packagecloud.io/gpg.key
# step 2: run the RabbitMQ installation script provided by packagecloud
curl -s https://packagecloud.io/install/repositories/rabbitmq/rabbitmq-server/script.rpm.sh | sudo bash
# step 3: download the RabbitMQ installation package
wget https://github.com/rabbitmq/rabbitmq-server/releases/download/v3.9.5/rabbitmq-server-3.9.5-1.el8.noarch.rpm
# step 4: import the RabbitMQ release signing key
rpm --import https://www.rabbitmq.com/rabbitmq-release-signing-key.asc
# step 5: install the RabbitMQ dependencies
yum -y install epel-release
yum -y install socat
# step 6: install
rpm -ivh rabbitmq-server-3.9.5-1.el8.noarch.rpm
# step 7: enable the management plugin; once enabled, RabbitMQ can be managed visually
rabbitmq-plugins enable rabbitmq_management
# step 8: start the service
systemctl start rabbitmq-server
Other versions of the RabbitMQ installation package can be found at https://github.com/rabbitmq/rabbitmq-server/releases/.
5) Set access permissions
# create an administrator account
rabbitmqctl add_user admin 123456
# tag the account as an administrator
rabbitmqctl set_user_tags admin administrator
# authorize the account for remote access
rabbitmqctl set_permissions -p / admin ".*" ".*" ".*"
# restart the service
systemctl restart rabbitmq-server
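To verify the account and its permissions (optional; standard rabbitmqctl commands, not part of the original steps):
# list users and their tags
rabbitmqctl list_users
# list permissions on the default vhost
rabbitmqctl list_permissions -p /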
Note that the firewall is off here. If the firewall is turned on, port 15672 needs to be opened.
# check firewall status
systemctl status firewalld
# turn off the firewall
systemctl stop firewalld
At this point, RabbitMQ has been installed on all three machines.
2.2 Add nodes to the cluster
1) Stop the service on rabbit2 and rabbit3
# stop the service
systemctl stop rabbitmq-server
2) Copy the cookie
Copy the .erlang.cookie file from rabbit1 to the other two hosts. The cookie file acts as a secret token: RabbitMQ nodes in a cluster exchange this token to authenticate each other, so all nodes in the same cluster must hold the same token, otherwise an Authentication Fail error occurs during setup. It is enough to make sure the key string inside .erlang.cookie is identical on all three machines; here the rabbit1 cookie is copied to rabbit2 and rabbit3.
On all three machines, restrict .erlang.cookie to owner-only permissions (600):
# give the file 600 permissions
chmod 600 /var/lib/rabbitmq/.erlang.cookie
On the rabbit1 machine:
# edit the file
vi /var/lib/rabbitmq/.erlang.cookie
Copy its contents out and set the same value in the rabbit2 and rabbit3 files.
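Alternatively, here is a sketch of copying the cookie directly (assuming root SSH access between the machines and the default rabbitmq user and file location):
# run on rabbit1
scp /var/lib/rabbitmq/.erlang.cookie root@rabbit2:/var/lib/rabbitmq/
scp /var/lib/rabbitmq/.erlang.cookie root@rabbit3:/var/lib/rabbitmq/
# then, on rabbit2 and rabbit3, restore ownership and permissions
chown rabbitmq:rabbitmq /var/lib/rabbitmq/.erlang.cookie
chmod 600 /var/lib/rabbitmq/.erlang.cookie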
Then start the service:
# start the service
systemctl start rabbitmq-server
3) Build the cluster
To build a RabbitMQ cluster, choose any node as the base node and join the other nodes to it one by one. Here rabbit1 is the base node, and rabbit2 and rabbit3 are added to the cluster. Execute the following commands on rabbit2 and rabbit3:
# 1. stop the RabbitMQ application
rabbitmqctl stop_app
# 2. reset the node state (needed when changing the node type or when the node previously joined a cluster as a disk node; not required the first time)
rabbitmqctl reset
# 3. join the cluster (use the --ram form to join as a RAM node)
# rabbitmqctl join_cluster --ram rabbit@rabbit1
rabbitmqctl join_cluster rabbit@rabbit1
# 4. start the application
rabbitmqctl start_app
The join_cluster command has an optional --ram parameter, which makes the newly added node a RAM (memory) node; the default is a disk node. On a RAM node, all metadata for queues, exchanges, bindings, users, permissions and vhosts is kept in memory; on a disk node it is stored on disk. RAM nodes can offer higher performance, but they lose all configuration information on restart, so RabbitMQ requires at least one disk node in the cluster, while the other nodes may be RAM nodes. In most cases the performance of the default disk nodes is sufficient.
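If an existing node later needs to be switched between disk and RAM mode, a sketch of the usual procedure (standard rabbitmqctl commands, run on the node being changed) is:
# stop the application on that node
rabbitmqctl stop_app
# change the node type ("ram" or "disc")
rabbitmqctl change_cluster_node_type ram
# start the application again
rabbitmqctl start_app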
In addition, if a node joins as a disk node and previously belonged to a cluster, it must first be reset with the reset command before joining the new cluster; resetting deletes all resources and data stored on that node. The reset step can be skipped when joining as a RAM node, because RAM-node data is not persisted anyway.
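Once the nodes have joined, cluster membership can be checked from any node:
# shows the nodes in the cluster, their disk/RAM types and which nodes are running
rabbitmqctl cluster_status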
After the steps above, an ordinary cluster has been built. Open the RabbitMQ management UI (any node shows the same information).
2.3 Demonstrating the problem with an ordinary cluster in code
In an ordinary cluster, when a queue is first created, one node is chosen as its home (root) node. That node stores both the queue metadata (exchange, routing, queue name, etc.) and the queue's messages. The other two nodes only synchronize the metadata (exchange, routing key, queue name); they do not store the message data, and they use the metadata to forward reads and writes to the home node.
For example, if the cluster chooses rabbit2 as the home node of a queue, rabbit1 and rabbit3 store no message data for it, so if rabbit2 goes down, the messages in that queue become unavailable.
Code demonstration: connecting to the cluster from .NET Core 5.0.
/// <summary>
/// Get a cluster connection object
/// </summary>
public static IConnection GetClusterConnection()
{
    var factory = new ConnectionFactory
    {
        UserName = "admin",    // account
        Password = "123456",   // password
        VirtualHost = "/"      // virtual host
    };
    List<AmqpTcpEndpoint> list = new List<AmqpTcpEndpoint>()
    {
        new AmqpTcpEndpoint(){ HostName = "172.16.2.84", Port = 5672 },
        new AmqpTcpEndpoint(){ HostName = "172.16.2.85", Port = 5672 },
        new AmqpTcpEndpoint(){ HostName = "172.16.2.86", Port = 5672 }
    };
    return factory.CreateConnection(list);
}
Producer Code:
/// <summary>
/// Work queue mode: send messages
/// </summary>
public static void WorkerSendMsg()
{
    string queueName = "worker_order"; // queue name
    // create the connection
    using (var connection = RabbitMQHelper.GetClusterConnection())
    {
        // create the channel
        using (var channel = connection.CreateModel())
        {
            // declare the queue
            channel.QueueDeclare(queueName, durable: true, exclusive: false, autoDelete: false, arguments: null);
            IBasicProperties properties = channel.CreateBasicProperties();
            properties.Persistent = true; // message persistence
            for (var i = 0; i < 10; i++)
            {
                var body = Encoding.UTF8.GetBytes($"message {i}");
                // publish to the default exchange, routed to the queue by its name
                channel.BasicPublish(exchange: "", routingKey: queueName, basicProperties: properties, body: body);
            }
        }
    }
}
Consumer code:
/// <summary>
/// Work queue mode: consume messages
/// </summary>
public static void WorkerConsumer(string queueName, int index)
{
    // create the connection
    var connection = RabbitMQHelper.GetClusterConnection();
    // create the channel
    var channel = connection.CreateModel();
    // declare the queue
    channel.QueueDeclare(queueName, durable: true, exclusive: false, autoDelete: false, arguments: null);
    var consumer = new EventingBasicConsumer(channel);
    int i = 0;
    consumer.Received += (model, ea) =>
    {
        // process the business logic
        var message = Encoding.UTF8.GetString(ea.Body.ToArray());
        Console.WriteLine($"{++i}, consumer: {index}, queue {queueName} consumed message length: {message.Length}");
        Thread.Sleep(1000);
        // message ack: tell MQ the message has been processed and can be removed from the queue
        channel.BasicAck(ea.DeliveryTag, false);
    };
    channel.BasicConsume(queueName, autoAck: false, consumer);
}
If the home node goes down while data keeps being sent to the cluster, RabbitMQ chooses one of the remaining nodes as the home node for newly created queues, so different nodes may end up holding data for different queues.
This shows that an ordinary cluster is not highly available: when a queue's home node is down, the queues on that node are unavailable. When all three nodes are healthy, however, concurrency and queue load can be spread across them.
To achieve high availability, you need to use a mirror cluster.
III. Mirror cluster
A mirror cluster synchronizes a mirrored copy of the data to every node, so each node holds a complete copy; even if some nodes go down, RabbitMQ can continue to provide service normally.
Converting to a mirror cluster is very simple: on top of the ordinary cluster above, execute the following on any node.
# turn the cluster into a mirror cluster
rabbitmqctl set_policy ha-all "^" '{"ha-mode":"all"}'
After it runs, look at the queues again in the management UI and you will see the ha policy flag on them.
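You can also confirm the policy from the command line (standard rabbitmqctl commands; the queue names shown depend on what has been declared):
# list the configured policies
rabbitmqctl list_policies
# show which policy is attached to each queue
rabbitmqctl list_queues name policy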
IV. Building the HAProxy environment
1) Download
The official download address for HAProxy is https://www.haproxy.org/. If that site is not accessible, it can also be downloaded from https://src.fedoraproject.org/repo/pkgs/haproxy/.
# download haproxy
wget https://www.haproxy.org/download/2.4/src/haproxy-2.4.3.tar.gz
Extract it:
# extract
tar xf haproxy-2.4.3.tar.gz
2) Compile
Install gcc:
yum install gcc
Go into the extracted directory and run the following compilation commands:
# run inside the haproxy-2.4.3 directory
make TARGET=linux-glibc PREFIX=/usr/app/haproxy-2.4.3
make install PREFIX=/usr/app/haproxy-2.4.3
3) Configure environment variables
# edit the file
vi /etc/profile
# add these two lines
export HAPROXY_HOME=/usr/app/haproxy-2.4.3
export PATH=$PATH:$HAPROXY_HOME/sbin
Make the environment variables take effect immediately:
source /etc/profile
4) Load balancing configuration
Create a new configuration file haproxy.cfg. Here it is created at /etc/haproxy/haproxy.cfg, with the following contents:
global
    # log output configuration; all logs are recorded locally and output through local0
    log 127.0.0.1 local0 info
    # maximum number of connections
    maxconn 4096
    # change the current working directory
    chroot /usr/app/haproxy-2.4.3
    # run the haproxy process with the specified UID
    uid 99
    # run the haproxy process with the specified GID
    gid 99
    # run as a daemon
    daemon
    # location of the pid file of the current process
    pidfile /usr/app/haproxy-2.4.3/haproxy.pid

# default configuration
defaults
    # apply the global log configuration
    log global
    # use layer-4 proxy mode; layer-7 proxy mode would be "http"
    mode tcp
    # log format
    option tcplog
    # do not log health-check traffic
    option dontlognull
    # after 3 failed retries the server is considered unavailable
    retries 3
    # maximum number of connections per process
    maxconn 2000
    # connection timeout
    timeout connect 5s
    # client timeout
    timeout client 120s
    # server timeout
    timeout server 120s

# listener configuration
listen rabbitmq_cluster
    bind :5671
    # TCP mode
    mode tcp
    # weighted round-robin load balancing
    balance roundrobin
    # RabbitMQ cluster node configuration
    server rabbit1 rabbit1:5672 check inter 5000 rise 2 fall 3 weight 1
    server rabbit2 rabbit2:5672 check inter 5000 rise 2 fall 3 weight 1
    server rabbit3 rabbit3:5672 check inter 5000 rise 2 fall 3 weight 1

# monitoring page configuration
listen monitor
    bind :8100
    mode http
    option httplog
    stats enable
    stats uri /stats
    stats refresh 5s
After uploading the file, open it and check whether the line endings contain stray carriage-return characters (for example ^M from Windows line endings); delete them if present, otherwise HAProxy will report an error.
Load balancing is mainly configured under listen rabbitmq_cluster. Weighted round-robin is specified as the balancing strategy, and a health check mechanism is defined:
server rabbit1 rabbit1:5672 check inter 5000 rise 2 fall 3 weight 1
This line means: check the rabbit1 node at address rabbit1:5672 every 5 seconds; if two consecutive checks succeed, the node is considered available and client requests can be routed to it; if three consecutive checks fail, the node is considered unavailable. weight specifies the node's weight during round-robin.
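Before starting HAProxy, the configuration can optionally be checked for syntax errors:
# validate the configuration file without starting the proxy
haproxy -c -f /etc/haproxy/haproxy.cfg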
5) Start the service
haproxy -f /etc/haproxy/haproxy.cfg
After startup you can open the monitoring page on port 8100; the full address is http://172.16.2.84:8100/stats. The page looks like this:
All nodes are shown in green, which means they are healthy. This proves that HAProxy has been set up successfully and is monitoring the RabbitMQ cluster.
RabbitMQ load balancing is now in place. How does code connect to the RabbitMQ cluster through HAProxy? HAProxy exposes port 5671, so clients connect to the HAProxy IP on port 5671.
Connection code (connecting through either HAProxy instance works the same way):
public static IConnection GetConnection()
{
    ConnectionFactory factory = new ConnectionFactory
    {
        HostName = "172.16.2.84", // haproxy ip
        Port = 5671,              // haproxy port
        UserName = "admin",       // account
        Password = "123456",      // password
        VirtualHost = "/"         // virtual host
    };
    return factory.CreateConnection();
}
V. Building the KeepAlived environment
Next, Keepalived is set up to solve HAProxy failover. Here KeepAlived is installed on rabbit1 and rabbit2. The setup steps are exactly the same on both hosts; only some configuration values differ, as shown below.
Official website: https://www.keepalived.org
1) Installation
yum install -y keepalived
2) Modify the configuration file
After keepalived is installed, its configuration file is generated at /etc/keepalived/keepalived.conf.
First modify keepalived.conf on keepalived1. The complete contents are as follows:
global_defs {
    # router id; the master and backup nodes must not use the same value
    router_id node1
}

# custom monitoring script
vrrp_script chk_haproxy {
    # script location
    script "/etc/keepalived/haproxy_check.sh"
    # run the script every 5 seconds
    interval 5
    weight 10
}

vrrp_instance VI_1 {
    # role of this Keepalived node: MASTER is the master node, BACKUP is the backup node
    state MASTER
    # network interface to monitor; use ip addr to find it
    interface ens33
    # virtual route id; must be the same on master and backup
    virtual_router_id 1
    # priority; must be higher on the master than on the backup (the backup below uses 50)
    priority 100
    # advertisement interval between master and backup, in seconds
    advert_int 1
    # authentication type and password
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    # call the custom monitoring script defined above
    track_script {
        chk_haproxy
    }
    virtual_ipaddress {
        # virtual IP address; multiple addresses can be set
        172.16.2.200
    }
}
The configuration above defines the Keepalived node on keepalived1 as the MASTER node and sets 172.16.2.200 as the virtual IP for external service. Most importantly, it monitors HAProxy through haproxy_check.sh, a script we need to create ourselves, as follows:
#!/bin/bash
# check whether haproxy is running; if not, try to start it
if [ `ps -C haproxy --no-header | wc -l` -eq 0 ]; then
    haproxy -f /etc/haproxy/haproxy.cfg
fi
# sleep for 3 seconds so that haproxy has time to start fully
sleep 3
# if haproxy is still not running, stop the local keepalived service
# so that the VIP can drift to the other haproxy node
if [ `ps -C haproxy --no-header | wc -l` -eq 0 ]; then
    systemctl stop keepalived
fi
Give it execute permission after creating it:
chmod +x /etc/keepalived/haproxy_check.sh
This script checks whether the HAProxy service is healthy. If HAProxy is down and cannot be restarted, the local Keepalived is stopped so that the virtual IP drifts to the backup node.
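A rough way to test the script by hand (a sketch; it briefly interrupts HAProxy, so only do this on a test setup):
# kill haproxy
pkill haproxy
# run the check script manually (keepalived normally runs it every 5 seconds)
/etc/keepalived/haproxy_check.sh
# confirm that haproxy was started again
ps -C haproxy --no-header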
The configuration of the backup node (keepalived2) is basically the same as the master's, except that its state must be BACKUP and its priority must be lower than the master's. The complete configuration is as follows:
global_defs {
    # router id; must differ from the master node
    router_id node2
}

vrrp_script chk_haproxy {
    script "/etc/keepalived/haproxy_check.sh"
    interval 5
    weight 10
}

vrrp_instance VI_1 {
    # BACKUP marks this as the backup node
    state BACKUP
    interface ens33
    virtual_router_id 1
    # priority; lower than the master node
    priority 50
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 123456
    }
    track_script {
        chk_haproxy
    }
    virtual_ipaddress {
        172.16.2.200
    }
}
The haproxy_check.sh file is the same as on keepalived1.
3) Start the service
Start the KeepAlived service on keepalived1 and keepalived2 with the following command:
systemctl start keepalived
After startup, keepalived1 is the master node, and running the ip a command on keepalived1 shows the virtual IP:
At this point the virtual IP exists only on keepalived1, not on keepalived2.
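A quick way to check whether the VIP is bound on a node (assuming the ens33 interface from the configuration above):
# list the addresses on the interface and look for the VIP
ip a show ens33 | grep 172.16.2.200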
4) Verify failover
Now verify failover. According to the check script above, if HAProxy has stopped and cannot be restarted, the Keepalived service stops itself. Here we simply stop the Keepalived service on keepalived1 with the following command:
systemctl stop keepalived
Running ip a on each node now shows that the VIP has drifted from keepalived1 to keepalived2, as shown below:
The VIP used for external service is still available, which shows that failover succeeded. The cluster is now fully built. Any client that needs to send or receive messages only needs to connect to the VIP. Example:
public static IConnection GetConnection()
{
    ConnectionFactory factory = new ConnectionFactory()
    {
        HostName = "172.16.2.200", // vip
        Port = 5671,               // haproxy port
        UserName = "admin",        // account
        Password = "123456",       // password
        VirtualHost = "/"          // virtual host
    };
    return factory.CreateConnection();
}
That concludes this introduction to RabbitMQ high-availability clusters and how to build one. Thanks for reading.