How etcd enables a high-availability architecture with primary/standby failover within seconds

Updated: 2025-01-28. From: SLTechnology News&Howtos (shulou)

Shulou (Shulou.com), 06/02

This article explains how to use etcd to build a high-availability architecture in which a standby service takes over from a failed primary within seconds.

What is etcd?

etcd is a strongly consistent, distributed key-value store that provides a reliable way to store data that needs to be accessed by a distributed system or a cluster of machines. It gracefully handles leader elections during network partitions and can tolerate machine failure, even of the leader node. Applications of any complexity, from a simple web app to Kubernetes, can read data from and write data to etcd. That is the official description. Given these properties, etcd is commonly used for distributed configuration, distributed locks, service coordination, and service registration. ZooKeeper is a functionally similar project, but etcd is the more modern of the two. etcd is written in Go and compiles to a native executable, which makes it easy to deploy across platforms and to maintain. It also exposes its API over the network (gRPC in v3, with an HTTP/JSON gateway), so it is convenient to write a client SDK in any language, which gives it an edge in ease of use. The rest of this article uses jetcd, the Java client, to coordinate primary and standby services.

Etcd project address: https://github.com/etcd-io/etcd

Etcd official website: https://etcd.io

Jetcd address: https://github.com/etcd-io/jetcd

The primary/standby service scenario

To make a service highly available, it is common to run several standby instances alongside the working primary, so that a standby can take over immediately when the primary fails. The defining feature of this scenario is that there can be only one primary at any time. A familiar example is MySQL primary/replica switching, where only one MySQL instance may accept writes at a time. In our scenario, a binlog parsing service parses the MySQL binlog in real time and publishes the parsed data to Kafka. On the consumer side, a Flink job consumes the parsed data. Eventually the data lands in the data center, which supplies basic business data to the central platform systems. Many online services query data derived from the binlog, so the binlog parsing service must not be a single point of failure. Architecturally, it has to run in a one-primary, multiple-standby mode: when the primary fails, a standby takes over in real time. At the same time, the binlog must never be parsed by more than one instance at once. etcd is a good fit for building this kind of primary/standby architecture.

Implementation with jetcd

First, add the jetcd dependency:

    <dependency>
        <groupId>io.etcd</groupId>
        <artifactId>jetcd-core</artifactId>
        <version>0.3.0</version>
    </dependency>

Initialize the client:

    Client client = Client.builder()
            .endpoints("http://127.0.0.1:2379", "http://127.0.0.1:3379", "http://127.0.0.1:4379")
            .build();

Key APIs:

    Lock lock = client.getLockClient();
    Lease lease = client.getLeaseClient();

Lease provides methods for granting, revoking, and keeping leases alive. The two key methods are grant(long ttl) and keepAlive(). grant creates a lease; its argument is the lease duration in seconds, and the response carries the lease ID. Any key created with that lease attached is deleted automatically once the lease expires, that is, ttl seconds after the grant unless the lease is renewed. keepAlive keeps the lease valid: it renews the lease before it expires, extending it by another ttl each time.
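The lease behavior described above can be sketched with a small in-memory model. This is only an illustration using the Java standard library, not the jetcd API: the class LeaseModel and its methods are hypothetical, and time is passed in explicitly (in milliseconds) so the behavior is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of lease semantics; it does not talk to etcd. */
class LeaseModel {
    private final Map<Long, Long> ttlMillis = new HashMap<>();    // leaseId -> TTL
    private final Map<Long, Long> deadline = new HashMap<>();     // leaseId -> expiry time
    private final Map<String, Long> keyToLease = new HashMap<>(); // key -> leaseId
    private long nextId = 1;

    /** Grants a lease of ttlSeconds starting at nowMillis and returns its ID. */
    long grant(long ttlSeconds, long nowMillis) {
        long id = nextId++;
        ttlMillis.put(id, ttlSeconds * 1000);
        deadline.put(id, nowMillis + ttlSeconds * 1000);
        return id;
    }

    /** Renews the lease for another full TTL; returns false if it has already expired. */
    boolean keepAlive(long leaseId, long nowMillis) {
        expire(nowMillis);
        if (!deadline.containsKey(leaseId)) {
            return false;
        }
        deadline.put(leaseId, nowMillis + ttlMillis.get(leaseId));
        return true;
    }

    /** Attaches a key to a lease; the key disappears when the lease expires. */
    void put(String key, long leaseId) {
        keyToLease.put(key, leaseId);
    }

    boolean exists(String key, long nowMillis) {
        expire(nowMillis);
        return keyToLease.containsKey(key);
    }

    /** Drops expired leases and every key attached to them. */
    private void expire(long nowMillis) {
        deadline.entrySet().removeIf(e -> e.getValue() <= nowMillis);
        keyToLease.values().removeIf(id -> !deadline.containsKey(id));
        ttlMillis.keySet().retainAll(deadline.keySet());
    }

    public static void main(String[] args) {
        LeaseModel model = new LeaseModel();
        long id = model.grant(1, 0);          // 1-second lease granted at t = 0 ms
        model.put("/root/lock", id);
        model.keepAlive(id, 600);             // renewed at 600 ms, now valid until 1600 ms
        System.out.println(model.exists("/root/lock", 1500)); // true
        System.out.println(model.exists("/root/lock", 1600)); // false
    }
}
```

The point of the model is that a key lives exactly as long as its lease keeps being renewed, which is what lets a lock disappear on its own when its holder dies.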

Lock has two methods, lock(ByteSequence name, long leaseId) and unlock(ByteSequence lockKey), which together implement a distributed lock. The leaseId argument to lock is the ID of the lease the lock is bound to, which determines how long the lock is held.

With Lease and Lock, switching between primary and standby services is easy to implement. The key code is as follows:

    ByteSequence lockKey = ByteSequence.from("/root/lock", StandardCharsets.UTF_8);
    Lock lock = client.getLockClient();
    Lease lease = client.getLeaseClient();
    // Grant a lease and read its ID from the response
    long leaseId = lease.grant(lockTTl).get().getID();
    // Keep the lease alive for as long as this process runs
    lease.keepAlive(leaseId, new StreamObserver<LeaseKeepAliveResponse>() {
        @Override
        public void onNext(LeaseKeepAliveResponse value) {
            System.err.println("LeaseKeepAliveResponse value:" + value.getTTL());
        }

        @Override
        public void onError(Throwable t) {
            t.printStackTrace();
        }

        @Override
        public void onCompleted() {
        }
    });
    // Blocks until this process holds the lock
    lock.lock(lockKey, leaseId).get().getKey();

First, grant a lease to obtain leaseId. Here lockTTl is 1; the unit is seconds, because etcd lease TTLs are expressed in seconds. Choose the TTL with care: it determines how quickly a standby service notices that the primary has failed and takes over. Because etcd leases have one-second granularity, the fastest possible setting is 1 second.
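As a rough worked example of how the TTL bounds the takeover time, the sketch below computes the gap between a crash and lease expiry. It is an idealized model, not a measurement: it assumes the lease expires exactly ttl after the last successful renewal and ignores network latency and server-side timing. The class and method names are hypothetical.

```java
/** Hypothetical, idealized timing model; not part of etcd or jetcd. */
class FailoverWindow {
    /**
     * Time between the primary crashing and its lease expiring, assuming the
     * lease runs out exactly ttlMillis after the last successful renewal.
     */
    static long takeoverDelayMillis(long ttlMillis, long lastRenewalAtMillis, long crashAtMillis) {
        if (crashAtMillis < lastRenewalAtMillis) {
            throw new IllegalArgumentException("crash must not precede the last renewal");
        }
        long leaseExpiresAt = lastRenewalAtMillis + ttlMillis;
        return Math.max(0L, leaseExpiresAt - crashAtMillis);
    }

    public static void main(String[] args) {
        // Crash immediately after a renewal: the standby waits the full TTL.
        System.out.println(takeoverDelayMillis(1000, 0, 0));   // 1000
        // Crash 300 ms after the last renewal: 700 ms remain on the lease.
        System.out.println(takeoverDelayMillis(1000, 0, 300)); // 700
    }
}
```

In this model the wait never exceeds the TTL, so a smaller TTL means faster takeover, at the cost of more frequent renewals.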

Next, call keepAlive to keep the granted lease alive, so that the lease is renewed automatically for as long as the application is running.

Finally, call lock, passing in leaseId. Only the first service to start acquires the lock, and its lease is renewed continuously while it runs. A standby service that reaches this call blocks. This guarantees that, however many services are running, only one is actually doing the work. If the primary service holding the lock fails, its lease expires within 1 second, and a standby service immediately acquires the lock and starts executing the work code.
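The blocking-and-takeover flow can be simulated without an etcd cluster. The following sketch uses only the Java standard library: a Semaphore stands in for the etcd lock, and releasing it stands in for the primary's lease expiring. All names here are hypothetical, and this is an assumption-laden model of the behavior described above, not jetcd code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;

/** Hypothetical simulation of primary/standby takeover; it does not use etcd. */
class FailoverDemo {
    static List<String> run() {
        Semaphore lock = new Semaphore(1, true);            // stand-in for the etcd lock
        List<String> events = new CopyOnWriteArrayList<>();
        CountDownLatch primaryActive = new CountDownLatch(1);
        CountDownLatch primaryCrash = new CountDownLatch(1);

        Thread primary = new Thread(() -> {
            lock.acquireUninterruptibly();                  // first to start gets the lock
            events.add("primary active");
            primaryActive.countDown();
            await(primaryCrash);                            // "works" until told to crash
            events.add("primary down");
            lock.release();                                 // stand-in for lease expiry
        });
        Thread standby = new Thread(() -> {
            await(primaryActive);                           // start after the primary holds the lock
            lock.acquireUninterruptibly();                  // blocks while the primary is alive
            events.add("standby active");
            lock.release();
        });

        primary.start();
        standby.start();
        await(primaryActive);
        primaryCrash.countDown();                           // simulate the primary failing
        join(primary);
        join(standby);
        return new ArrayList<>(events);
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void join(Thread thread) {
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // [primary active, primary down, standby active]
    }
}
```

The standby can only record its event after the lock is released, which mirrors how a standby service stays parked in the lock call until the primary's lease is gone.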

Complete test case

    import io.etcd.jetcd.ByteSequence;
    import io.etcd.jetcd.Client;
    import io.etcd.jetcd.Lease;
    import io.etcd.jetcd.Lock;
    import io.etcd.jetcd.lease.LeaseKeepAliveResponse;
    import io.grpc.stub.StreamObserver;
    import org.junit.Before;
    import org.junit.Test;

    import java.nio.charset.StandardCharsets;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    /**
     * @author: kl @kailing.pub
     * @date: 2019-7-22
     */
    public class JEtcdTest {
        private Client client;
        private Lock lock;
        private Lease lease;
        // Unit: seconds
        private long lockTTl = 1;
        private ByteSequence lockKey = ByteSequence.from("/root/lock", StandardCharsets.UTF_8);
        private ScheduledExecutorService scheduledThreadPool = Executors.newScheduledThreadPool(2);

        @Before
        public void setUp() {
            client = Client.builder()
                    .endpoints("http://127.0.0.1:2379", "http://127.0.0.1:3379", "http://127.0.0.1:4379")
                    .build();
            lock = client.getLockClient();
            lease = client.getLeaseClient();
        }

        @Test
        public void lockTest1toMaster() throws InterruptedException, ExecutionException {
            long leaseId = lease.grant(lockTTl).get().getID();
            lease.keepAlive(leaseId, new StreamObserver<LeaseKeepAliveResponse>() {
                @Override
                public void onNext(LeaseKeepAliveResponse value) {
                    System.err.println("LeaseKeepAliveResponse value:" + value.getTTL());
                }

                @Override
                public void onError(Throwable t) {
                    t.printStackTrace();
                }

                @Override
                public void onCompleted() {
                }
            });
            // Blocks until this process holds the lock
            lock.lock(lockKey, leaseId).get().getKey();
            scheduledThreadPool.submit(() -> {
                while (true) {
                    System.err.println("I am the main service starting to work");
                    TimeUnit.SECONDS.sleep(1);
                }
            });
            TimeUnit.DAYS.sleep(1);
        }

        @Test
        public void lockTest2toStandby() throws InterruptedException, ExecutionException {
            long leaseId = lease.grant(lockTTl).get().getID();
            lease.keepAlive(leaseId, new StreamObserver<LeaseKeepAliveResponse>() {
                @Override
                public void onNext(LeaseKeepAliveResponse value) {
                    System.err.println("LeaseKeepAliveResponse value:" + value.getTTL());
                }

                @Override
                public void onError(Throwable t) {
                    t.printStackTrace();
                }

                @Override
                public void onCompleted() {
                }
            });
            lock.lock(lockKey, leaseId).get().getKey();
            scheduledThreadPool.submit(() -> {
                while (true) {
                    System.err.println("I'm a standby service, I'm starting to work, and it's estimated that the primary service is down");
                    TimeUnit.SECONDS.sleep(1);
                }
            });
            TimeUnit.DAYS.sleep(1);
        }

        @Test
        public void lockTest3toStandby() throws InterruptedException, ExecutionException {
            long leaseId = lease.grant(lockTTl).get().getID();
            lease.keepAlive(leaseId, new StreamObserver<LeaseKeepAliveResponse>() {
                @Override
                public void onNext(LeaseKeepAliveResponse value) {
                    System.err.println("LeaseKeepAliveResponse value:" + value.getTTL());
                }

                @Override
                public void onError(Throwable t) {
                    t.printStackTrace();
                }

                @Override
                public void onCompleted() {
                }
            });
            lock.lock(lockKey, leaseId).get().getKey();
            scheduledThreadPool.submit(() -> {
                while (true) {
                    System.err.println("I'm a standby service, I'm starting to work, and it's estimated that the primary service is down");
                    TimeUnit.SECONDS.sleep(1);
                }
            });
            TimeUnit.DAYS.sleep(1);
        }
    }

The test case above simulates a high-availability setup with one primary and two standbys. Start lockTest1toMaster(), lockTest2toStandby(), and lockTest3toStandby() in turn, and you will find that only one of them prints its work message. Stop that service manually, and one of the standby services starts printing immediately. Stop that standby as well, and the remaining standby takes over. This simulates primary/standby failover well.

That covers how etcd can be used to implement a high-availability architecture with primary/standby failover within seconds. How it behaves in your own environment still needs to be verified in practice.
