How to upgrade a large-scale RocketMQ cluster in a production environment without downtime

2025-01-17 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article describes how to upgrade a large-scale RocketMQ cluster in a production environment without downtime. Many operators are unsure how to do this in day-to-day operations, so the sections below walk through a simple, practical procedure and the verification behind it.

1. The urgency of the version upgrade

Embarrassingly for an active evangelist in the RocketMQ community, my company's RocketMQ servers were still running version 4.1.0. RocketMQ does not support ACL (access control) before version 4.4.0, so any machine in the production environment can subscribe to any topic, and any production application server can install rocketmq-console and thereby control the whole cluster, including the authority to delete topics and consumer groups. That is a chilling thought.

2. Upgrade scheme

2.1 Determine the target version

According to the RocketMQ release notes, the ACL mechanism was officially introduced in version 4.4.0, so the target version must be at least 4.4.0. There is an unwritten rule in the industry for adopting open source software: do not use the latest version, and do not act as the guinea pig.

But RocketMQ can be regarded as a special case.

A careful read of the RocketMQ change log shows very few changes to the RocketMQ client; that is, the message sending and message consumption code that users depend on is very stable, and in theory there should be essentially no compatibility problems. Each release also fixes some major bugs and brings noticeable performance improvements, so the author decided to "risk universal condemnation" and upgrade to the latest version, 4.8.0.

As a brief aside, here are several milestone versions of RocketMQ.

RocketMQ 4.3.0 formally introduced transaction messages; if you want to use transaction messages, the recommended minimum version is 4.6.1.

RocketMQ 4.4.0 introduced ACL and message tracing; if you need these features, the recommended minimum version is 4.7.0.

RocketMQ 4.5.0 introduced multiple replicas (master-slave switching); the recommended version for this feature is 4.7.0.

RocketMQ 4.6.0 introduced the request-response model.
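
As a rough illustration of the version reasoning above, the following shell sketch checks whether a candidate target version meets the minimum versions recommended in this list. The helper names `version_ge` and `recommended_min` and the feature keywords are invented for this example; they are not part of RocketMQ.

```shell
#!/bin/sh
# Sketch only: helper names and feature keywords are hypothetical.

# version_ge A B: succeeds when dotted version A is >= B (fields compared numerically).
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        len = (n > m) ? n : m
        for (i = 1; i <= len; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

# recommended_min FEATURE: prints the minimum version recommended in the list above.
recommended_min() {
    case "$1" in
        transaction) echo 4.6.1 ;;
        acl|trace)   echo 4.7.0 ;;
        dledger)     echo 4.7.0 ;;   # multiple replicas (master-slave switching)
        *)           return 1 ;;
    esac
}

target="4.8.0"
for feature in transaction acl dledger; do
    if version_ge "$target" "$(recommended_min "$feature")"; then
        echo "$target meets the recommended minimum for $feature"
    else
        echo "$target is below the recommended minimum for $feature"
    fi
done
```

Comparing field by field rather than as strings matters once a minor version reaches two digits, where a plain string comparison would order 4.10.0 before 4.9.0.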

2.2 Upgrade approach

The basic requirement of the upgrade: the business must not experience downtime; that is, the upgrade must be transparent to the business.

If enough spare machines are available, the best migration approach is to expand the cluster first and then shrink it. The example figure is as follows:

The main idea is to expand the Broker tier by adding two higher-version Broker servers to the cluster, then turn off write permission on the lower-version Brokers, remove the lower-version Brokers once their messages have expired, and finally upgrade NameServer, completing the online migration without downtime.

This upgrade had to cover every node of the RocketMQ cluster within about half a month, and there were not that many cold standby machines, so the expand-then-shrink approach could not meet the need. The upgrade had to be carried out on the existing machines.

Could the Broker software be upgraded in place, with the higher-version Broker directly using the lower-version Broker's storage directory? The example figure is as follows:

The core idea is to stop the old-version Broker and then start the new-version Broker, still using the old configuration file.

With the idea in place, the next step is to verify its feasibility.

2.3 Scheme verification

As a rule, any change to the production environment must be preceded by sufficient testing and verification, and for a version upgrade the focus is on compatibility.

2.3.1 Server version compatibility verification

Build a mixed-version MQ cluster as described above. The key points to verify are as follows:

Can a higher version of Broker register a route with a lower version of NameServer?

Can a lower version of Broker register a route with a higher version of NameServer?

Multiple topics were created through rocketmq-console and their routing information was checked; both cases were verified to behave as expected.
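
The article performed this check through rocketmq-console. As a command-line alternative, the same check might be sketched with `mqadmin` as below. The NameServer address, cluster name, and topic names are placeholders, and the `MQADMIN` variable is only a convenience added here so the commands can be previewed with `MQADMIN=echo`.

```shell
#!/bin/sh
# Sketch only: addresses, cluster name and topic names are placeholders.
NAMESRV="${NAMESRV:-192.168.xx.xx:9876}"   # NameServer under test
CLUSTER="${CLUSTER:-DefaultCluster}"       # assumed cluster name
MQADMIN="${MQADMIN:-sh ./mqadmin}"         # set MQADMIN=echo to preview the commands

check_routes() {
    # Show which Brokers have registered with this NameServer.
    $MQADMIN clusterList -n "$NAMESRV"
    for topic in "$@"; do
        # Create (or update) the topic on the cluster, then print its route data.
        $MQADMIN updateTopic -n "$NAMESRV" -c "$CLUSTER" -t "$topic"
        $MQADMIN topicRoute -n "$NAMESRV" -t "$topic"
    done
}

# Example: check_routes compat_test_a compat_test_b
```

Running it once against the lower-version NameServer and once against the higher-version one covers both questions: each route should list the Brokers of both versions.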

2.3.2 Client-server compatibility verification

The RocketMQ client API is relatively simple: it amounts to message sending, batch sending, and message consumption. Since version 4.1 does not support transaction messages, this upgrade does not even need to verify transaction messages.

Can a lower-version client send messages to and consume messages from a higher-version Broker?

Can a higher-version client send messages to and consume messages from a lower-version Broker?

There is no need to write test cases from scratch; the official demo can be used directly. The code screenshot is as follows:
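
The screenshot itself is not available in this text. As a hedged sketch of how the official quickstart demo might be run from the command line, the function below launches the example Producer or Consumer that ships with a RocketMQ distribution via `bin/tools.sh`, with the NameServer address passed through the `NAMESRV_ADDR` environment variable. The distribution paths and addresses are placeholders, and `run_demo` and `RUN` are names invented for this example.

```shell
#!/bin/sh
# Sketch only: distribution paths and addresses are placeholders.
RUN="${RUN:-}"   # set RUN=echo to print the command instead of running it

run_demo() {
    # $1: unpacked RocketMQ distribution whose client version is under test
    # $2: NameServer address of the cluster to test against
    # $3: Producer or Consumer (the quickstart example classes)
    NAMESRV_ADDR="$2" $RUN sh "$1/bin/tools.sh" "org.apache.rocketmq.example.quickstart.$3"
}

# Example: old client against the new cluster (the Consumer keeps running until stopped).
# run_demo /opt/rocketmq-4.1.0 192.168.xx.xx:9876 Producer
# run_demo /opt/rocketmq-4.1.0 192.168.xx.xx:9876 Consumer
```

Swapping the distribution path and the cluster address gives the opposite direction, a higher-version client against a lower-version Broker.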

In practice, client verification is much more complex than server-to-server verification. Each project team uses a different client version, and some teams even use non-Java clients such as C++ and Python, so accurately finding the connection information (client version, language type) of every client in the cluster is very important.

The open source version is relatively friendly about consumer group connection information. A script can first query all consumer groups in the cluster and then traverse them, querying the IP address, client version, language, and other information of each group's clients. It is not friendly to producers, however: there is no interface for listing all senders.
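
The article does not show its script, so the following is only one possible sketch of the traversal. It assumes that every consumer group has a retry topic named `%RETRY%<group>` and derives the group list from `mqadmin topicList`, then calls `mqadmin consumerConnection` for each group. The NameServer address is a placeholder, and the function names are invented here.

```shell
#!/bin/sh
# Sketch only: the NameServer address is a placeholder.
NAMESRV="${NAMESRV:-192.168.xx.xx:9876}"
MQADMIN="${MQADMIN:-sh ./mqadmin}"

# Reads topic names on stdin and prints the group part of every "%RETRY%<group>" topic.
groups_from_topics() {
    sed -n 's/^%RETRY%\([^[:space:]]*\).*$/\1/p'
}

list_consumer_clients() {
    $MQADMIN topicList -n "$NAMESRV" | groups_from_topics | while read -r group; do
        echo "== $group"
        # Prints the group's client connections; groups with no online
        # consumer may report an error instead.
        $MQADMIN consumerConnection -n "$NAMESRV" -g "$group" < /dev/null
    done
}

# Example: list_consumer_clients > consumer-clients.txt
```

The output can then be scanned for client versions and languages that were not covered by the compatibility tests.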

The connection information of a consumer group's consumers is shown in the following figure:

Therefore, our approach relies mainly on the client types found in the consumer groups. During this upgrade, I also made some customized changes to RocketMQ to make it easier to obtain the connection information of all senders; these will be submitted upstream as a PR later.

2.3.3 Storage format verification on the Broker side

As there are no spare resources, this upgrade replaces the software in place, so the new and old versions share the same storage directory. RocketMQ's message storage protocol has not changed since version 4.0.0. The key points to verify are as follows:

Can version 4.8.0 directly use the storage files generated by 4.1.0 (commitlog, etc.)?

Can version 4.1.0 directly use the storage files generated by 4.8.0?

Why verify that version 4.1.0 can use files written by 4.8.0? Because if the upgrade fails, it must be rolled back, and if version 4.1.0 cannot read what 4.8.0 wrote, there is no way back, which is absolutely not acceptable in architecture design.

After verification, it is found that the storage files are compatible with each other.
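
Because the in-place upgrade depends on both versions reading the same directory, it can help to confirm which directory the shared configuration file actually points at before switching versions. The sketch below reads `storePathRootDir` from a Broker properties file and lists the main data directories. The configuration path is a placeholder, the function names are invented here, and the fallback to `$HOME/store` reflects the Broker default as far as I know.

```shell
#!/bin/sh
# Sketch only: the configuration file path is a placeholder.

# Prints the storePathRootDir value from a Broker properties file,
# or $HOME/store (assumed to be the Broker default) when the key is absent.
store_root() {
    value=$(sed -n 's/^[[:space:]]*storePathRootDir[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1)
    echo "${value:-$HOME/store}"
}

show_store() {
    root=$(store_root "$1")
    echo "store root: $root"
    # commitlog, consumequeue and index hold the message data shared by both versions.
    ls -ld "$root/commitlog" "$root/consumequeue" "$root/index"
}

# Example: show_store /opt/rocketmq/conf/broker.conf
```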

2.3.4 Test environment verification

After the three verification steps above, the upgrade can proceed. Before touching production, however, first upgrade the test environment to the following architecture and let it run stably for one day:

That is, a mixed deployment of different versions is exercised by all application servers in the test environment; if no problems appear there, the production environment can be upgraded.

2.4 Implementation plan

With the upgrade scheme fully verified, it can be implemented in the production environment. Before execution, the theoretical design must be turned into a concrete, executable implementation plan, and that plan must include a rollback procedure. The rollback must be relatively easy to carry out; otherwise the plan is not reliable.

The following covers the key steps of the implementation. The whole upgrade is a rolling upgrade, that is, nodes are upgraded one at a time.

1. Disable the write permission of a Broker

Turning off write permission on the Broker lets applications smoothly shift traffic to other nodes, which avoids impact on the business when the machine is restarted.

sh ./mqadmin updateBrokerConfig -b 192.168.x.x:10911 -n 192.168.xx.xx:9876 -k brokerPermission -v 4

2. Stop the Broker once its write and consume TPS are close to 0

ps -ef | grep java
kill pid

3. Start Broker with the new version

Note that this step uses the old configuration file, so write permission is still disabled after startup, and starting the Broker does not affect client message writes.

4. Enable write permission

After the new version starts successfully, write permission can be enabled.

sh ./mqadmin updateBrokerConfig -b 192.168.xx.xx:10911 -n 192.168.xx.xx:9876 -k brokerPermission -v 6

Observe the flow.

Repeat the above steps to complete the upgrade of Broker.
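
The steps above can be sketched as a small script that prints the commands for each Broker in order, so an operator can review and run them one node at a time. It deliberately executes nothing. The addresses and paths are placeholders, the function names are invented here, and the `mqbroker -c` start command is an assumption, since the article does not show how the new version is started. The permission values follow from RocketMQ's permission bits: read is 4 and write is 2, so 4 means read-only and 6 means read and write.

```shell
#!/bin/sh
# Sketch only: addresses, paths and the Broker list are placeholders.
NAMESRV="${NAMESRV:-192.168.xx.xx:9876}"
NEW_HOME="${NEW_HOME:-/opt/rocketmq-4.8.0}"                    # hypothetical install path
BROKER_CONF="${BROKER_CONF:-/opt/rocketmq/conf/broker.conf}"   # the old configuration file

PERM_WRITE=2
PERM_READ=4

# Prints the command that sets brokerPermission on one Broker.
perm_cmd() {
    echo "sh ./mqadmin updateBrokerConfig -b $1 -n $NAMESRV -k brokerPermission -v $2"
}

# Prints the upgrade steps for one Broker in the order described above.
print_plan() {
    perm_cmd "$1" "$PERM_READ"
    echo "# wait until write and consume TPS on $1 are close to 0"
    echo "ps -ef | grep java    # find the Broker pid, then: kill <pid>"
    echo "nohup sh $NEW_HOME/bin/mqbroker -c $BROKER_CONF &"
    perm_cmd "$1" "$((PERM_READ | PERM_WRITE))"
    echo "# observe traffic before moving on to the next Broker"
}

# Rolling upgrade: one Broker at a time, in the order given on the command line.
for broker in "$@"; do
    print_plan "$broker"
    echo
done
```

Invoked with the Broker addresses as arguments, the script emits one block of steps per Broker. Rolling back a node follows the same sequence with the old version's `mqbroker` in the start step.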

Upgrading NameServer is even simpler and is also done as a rolling upgrade: kill the old-version NameServer and start the new-version NameServer on the same machine.
