
What is the design concept and customized development idea of Canal Instance?

2025-02-28 Update From: SLTechnology News&Howtos


This article explains the design concepts behind Canal Instance and offers some ideas for customized development on top of it.

Instance is the core of Canal data synchronization. Within a Canal Server, data is synchronized only once an Instance has been started. So what exactly is an Instance? This article tries to unravel it by reading the source code.

1. Canal Instance class hierarchy

The important classes in the Canal Instance inheritance hierarchy are described below.

CanalInstance

The Canal Instance interface. It defines the basic characteristics of an Instance, mainly through the following methods:

String getDestination()

The destination name of the instance. In Canal, a destination names one data source and corresponds to the connection information of one MySQL instance, such as 192.168.1.3:3306.

CanalEventParser getEventParser()

The event parser, that is, the binlog parser. It is responsible for parsing the MySQL binlog.

CanalEventSink getEventSink()

The connector between EventParser and EventStore. It handles filtering, processing and distribution of data, that is, it is the entry point for "processing" the raw binlog data. EventStore then stores the data that EventSink has processed.

CanalEventStore getEventStore()

The event store. A Canal Instance acts as a "slave" of the MySQL server, so it has to store the data it synchronizes; the Canal client ultimately fetches its data from EventStore. At present Canal only implements a memory-based EventStore, so how it avoids running out of memory and how it avoids losing data will be the focus of follow-up research.

CanalMetaManager getMetaManager()

The Canal metadata manager. For example, it records how far each consumer has progressed in consuming data from the Canal EventStore.

CanalAlarmHandler getAlarmHandler()

The alarm service.

AbstractCanalInstance

The abstract implementation class of CanalInstance.

CanalInstanceWithManager

A CanalInstance built programmatically: the instance is created by hand through the API. It can be compared to Spring's programmatic (API-based) transaction management.

CanalInstanceWithSpring

A CanalInstance built through Spring.

CanalInstanceGenerator

The factory abstraction for CanalInstance. CanalInstance objects are created through the methods it provides, and it has implementations for both the Spring-based and the programmatic approach.
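The hierarchy described above can be sketched roughly as follows. This is a simplified illustration, not Canal's source: the component interfaces are empty stand-ins, `ProgrammaticInstance` is a hypothetical name playing the role of the programmatic variant, and the single `generate(String)` method on the generator is an assumption made for the sketch.

```java
// Empty stand-ins for the component types; the real interfaces carry many methods.
interface CanalEventParser {}
interface CanalEventSink {}
interface CanalEventStore {}
interface CanalMetaManager {}
interface CanalAlarmHandler {}

// The accessor methods listed in the article.
interface CanalInstance {
    String getDestination();
    CanalEventParser getEventParser();
    CanalEventSink getEventSink();
    CanalEventStore getEventStore();
    CanalMetaManager getMetaManager();
    CanalAlarmHandler getAlarmHandler();
}

// Shared state and accessors live here; subclasses only decide how components get wired.
abstract class AbstractCanalInstance implements CanalInstance {
    protected String destination;
    protected CanalEventParser eventParser;
    protected CanalEventSink eventSink;
    protected CanalEventStore eventStore;
    protected CanalMetaManager metaManager;
    protected CanalAlarmHandler alarmHandler;

    public String getDestination() { return destination; }
    public CanalEventParser getEventParser() { return eventParser; }
    public CanalEventSink getEventSink() { return eventSink; }
    public CanalEventStore getEventStore() { return eventStore; }
    public CanalMetaManager getMetaManager() { return metaManager; }
    public CanalAlarmHandler getAlarmHandler() { return alarmHandler; }
}

// Hypothetical programmatic variant: every component is created in code.
class ProgrammaticInstance extends AbstractCanalInstance {
    ProgrammaticInstance(String destination) {
        this.destination = destination;
        this.eventParser = new CanalEventParser() {};
        this.eventSink = new CanalEventSink() {};
        this.eventStore = new CanalEventStore() {};
        this.metaManager = new CanalMetaManager() {};
        this.alarmHandler = new CanalAlarmHandler() {};
    }
}

// Factory abstraction: callers ask for an instance by destination name (assumed signature).
interface CanalInstanceGenerator {
    CanalInstance generate(String destination);
}

class InstanceHierarchyDemo {
    public static void main(String[] args) {
        CanalInstanceGenerator generator = ProgrammaticInstance::new;
        CanalInstance instance = generator.generate("order_db_1");
        System.out.println(instance.getDestination());
    }
}
```

A Spring-based variant would extend the same abstract class but receive its components through injected properties instead of creating them in the constructor.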

2. Four core components of CanalInstance

Understanding Canal Instance purely at the class level is not very intuitive, so let's take a usage scenario and combine it with the architecture diagram to deepen the understanding.

Suppose a company's order system uses sharded databases and tables. The data is deployed on MySQL instances such as 192.168.1.166:3306, and several schemas are created on each instance, for example order_db and user_db. To offer multi-dimensional queries and statistics on orders, the architecture group proposes subscribing to the database binlog and transforming the order data in the two order databases, that is, synchronizing the data in order_db to Elasticsearch. Canal was designed to solve exactly this kind of problem, so we can reason backwards from this scenario to the design concept of Canal Instance.

The architecture of Canal Instance is shown in the following figure:

Data synchronization in Canal is the responsibility of the CanalInstance component, and multiple CanalInstance objects can be created within one Canal Server.

Each CanalInstance can be regarded as corresponding to one MySQL instance. In the scenario above, two database instances need to be synchronized, so two CanalInstance objects have to be created. This is easy to understand, because MySQL stores its binlog per instance. A Canal Instance contains four core components: EventParser, EventSink, EventStore and CanalMetaManager. They are outlined here and will be described in detail in subsequent articles to better guide practice.

EventParser component

Responsible for parsing the binlog: it extracts the valid data according to the binlog storage format. This module is also a good way to learn more about the binlog storage format.

EventSink component

In the data synchronization scenario, several schemas usually exist on one database instance, but not all of them need to be synchronized. If everything parsed by EventParser were passed straight into EventStore, EventStore would pay an unnecessary performance cost. In addition, this scenario synchronizes data from several databases into a single target, which may involve strategies such as merging. These and similar requirements are the problem domain that EventSink has to solve.
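The filtering role of the sink can be illustrated with a minimal sketch. `RowChange` and `SchemaFilterSink` are hypothetical names invented for this example, and the list inside the sink merely stands in for EventStore; Canal's own sink and filter classes are considerably richer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified change event: which schema and table a row change belongs to.
class RowChange {
    final String schema;
    final String table;

    RowChange(String schema, String table) {
        this.schema = schema;
        this.table = table;
    }
}

// Hypothetical sink: only events from subscribed schemas reach the store.
class SchemaFilterSink {
    private final Set<String> subscribedSchemas;
    private final List<RowChange> store = new ArrayList<>(); // stand-in for EventStore

    SchemaFilterSink(String... schemas) {
        this.subscribedSchemas = new HashSet<>(Arrays.asList(schemas));
    }

    // Returns true when the event was accepted and stored, false when it was filtered out.
    boolean sink(RowChange change) {
        if (!subscribedSchemas.contains(change.schema)) {
            return false;
        }
        store.add(change);
        return true;
    }

    int storedCount() {
        return store.size();
    }
}

class SchemaFilterSinkDemo {
    public static void main(String[] args) {
        SchemaFilterSink sink = new SchemaFilterSink("order_db");
        sink.sink(new RowChange("order_db", "t_order"));
        sink.sink(new RowChange("user_db", "t_user")); // dropped: schema not subscribed
        System.out.println("stored events: " + sink.storedCount());
    }
}
```

Filtering before the store means unsubscribed schemas never occupy store capacity in the first place.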

EventStore component

Stores the data that Canal has converted and that the Canal client consumes. At present Canal only provides a memory-based storage implementation. It is worth thinking about how a memory-based store can avoid running out of memory; its implementation will be analyzed in detail in later articles.
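One common answer to the memory question is a fixed-capacity ring buffer whose slots are reused only after the consumer has acknowledged them. The sketch below illustrates that general idea under my own assumptions; `BoundedEventStore` and its method names are hypothetical and this is not Canal's implementation.

```java
// Hypothetical fixed-capacity store with three cursors: put, get and ack.
class BoundedEventStore {
    private final String[] slots;
    private long putSeq = 0; // next sequence number to write
    private long getSeq = 0; // next sequence number to hand to a client
    private long ackSeq = 0; // everything below this has been confirmed and may be overwritten

    BoundedEventStore(int capacity) {
        this.slots = new String[capacity];
    }

    // Refuses the write while every slot still holds an unacknowledged event,
    // so memory use never grows beyond the configured capacity.
    synchronized boolean tryPut(String event) {
        if (putSeq - ackSeq >= slots.length) {
            return false;
        }
        slots[(int) (putSeq % slots.length)] = event;
        putSeq++;
        return true;
    }

    // Returns the next unread event, or null when nothing new is available.
    synchronized String get() {
        if (getSeq >= putSeq) {
            return null;
        }
        String event = slots[(int) (getSeq % slots.length)];
        getSeq++;
        return event;
    }

    // Confirms everything read so far, which frees those slots for new writes.
    synchronized void ack() {
        ackSeq = getSeq;
    }

    // Forgets unconfirmed reads so they are delivered again.
    synchronized void rollback() {
        getSeq = ackSeq;
    }
}

class BoundedEventStoreDemo {
    public static void main(String[] args) {
        BoundedEventStore store = new BoundedEventStore(2);
        store.tryPut("insert#1");
        store.tryPut("insert#2");
        System.out.println("third put accepted: " + store.tryPut("insert#3"));
        store.get();
        store.ack();
        System.out.println("put after ack accepted: " + store.tryPut("insert#3"));
    }
}
```

Because a slot is released only on acknowledgement, a slow consumer applies back-pressure to the producer instead of letting the store grow, and a rollback allows redelivery of data that was read but never confirmed.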

CanalMetaManager component

The metadata store manager. The most basic metadata in Canal must include at least the parse position of the EventParser component and the consume position of each consumer. After a Canal Server restart, synchronization must be able to resume from the last position that was not yet synchronized, otherwise data will be lost. In the example of synchronizing database data to Elasticsearch, the Canal client fetches data from Canal Server (that is, from EventStore), writes it to Elasticsearch and reports its write progress; recording that progress is the job of the CanalMetaManager component.

Recent versions of Canal also support sending the parsed data directly to MQ, so CanalInstance holds one more component: CanalMQConfig. It carries the MQ configuration and provides several strategies for automatically mapping schemas and tables to MQ topics, which makes life easier for Canal users. This part will be introduced separately in subsequent articles and is not discussed here.
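The restart requirement can be illustrated with a small sketch that persists the last confirmed position and reads it back. Everything here is an assumption for illustration: `FilePositionStore`, the properties-file format and the key names are hypothetical and do not describe how Canal stores its metadata.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical stand-in for a metadata manager: one properties file keeps the last
// confirmed position per key, for example "order_db.parse" and "order_db.consume".
class FilePositionStore {
    private final Path file;

    FilePositionStore(Path file) {
        this.file = file;
    }

    // Helper for demos and tests: creates an empty temporary file.
    static Path newTempFile() {
        try {
            return Files.createTempFile("positions", ".properties");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the position to disk so that it survives a restart.
    void ack(String key, long position) {
        Properties props = load();
        props.setProperty(key, Long.toString(position));
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "last confirmed positions");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the last confirmed position, or 0 when nothing was recorded yet.
    long lastAcked(String key) {
        return Long.parseLong(load().getProperty(key, "0"));
    }

    private Properties load() {
        Properties props = new Properties();
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return props;
    }
}

class PositionStoreDemo {
    public static void main(String[] args) {
        Path file = FilePositionStore.newTempFile();
        new FilePositionStore(file).ack("order_db.consume", 1024L);
        // A fresh object over the same file plays the role of a restarted server.
        long resumeFrom = new FilePositionStore(file).lastAcked("order_db.consume");
        System.out.println("resume from " + resumeFrom);
    }
}
```

The point of the sketch is the ordering: a position counts as confirmed only after it has been written durably, so a restart can replay some data but never skips any.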

With this overview you should have a fairly complete picture of CanalInstance. Next, let's look at how a CanalInstance is constructed, which matters a great deal for the practical work that follows.

3. CanalInstance construction mode

There are two ways to initialize an Instance in Canal: through Spring, or programmatically. The core of a CanalInstance is the four components described above, so the responsibility of the CanalInstanceWithManager class is to manage them, that is, to load, start, stop and coordinate these components, as its name suggests and as its constructor shows.

Creating a CanalInstance programmatically is simple: set the parameters and construct a CanalInstanceWithManager, as the sample code does.
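The coordinating role of the instance (starting and stopping its components together) can be sketched as follows. `Lifecycle`, `NamedComponent` and `CoordinatingInstance` are hypothetical names, and the start order used here, consumers of data before producers, is my assumption for the illustration rather than a statement about Canal's source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Minimal lifecycle contract shared by the instance and its components (hypothetical).
interface Lifecycle {
    void start();
    void stop();
}

// Stand-in component that only records what happened to it.
class NamedComponent implements Lifecycle {
    private final String name;
    private final List<String> log;

    NamedComponent(String name, List<String> log) {
        this.name = name;
        this.log = log;
    }

    public void start() { log.add("start " + name); }
    public void stop() { log.add("stop " + name); }
}

// The instance starts its components in the given order and stops them in reverse.
class CoordinatingInstance implements Lifecycle {
    private final List<Lifecycle> startOrder;

    CoordinatingInstance(Lifecycle... componentsInStartOrder) {
        this.startOrder = new ArrayList<>(Arrays.asList(componentsInStartOrder));
    }

    public void start() {
        for (Lifecycle component : startOrder) {
            component.start();
        }
    }

    public void stop() {
        List<Lifecycle> reversed = new ArrayList<>(startOrder);
        Collections.reverse(reversed);
        for (Lifecycle component : reversed) {
            component.stop();
        }
    }
}

class CoordinatingInstanceDemo {
    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        // Assumed order: the store and sink are ready before the parser starts producing.
        CoordinatingInstance instance = new CoordinatingInstance(
                new NamedComponent("metaManager", log),
                new NamedComponent("eventStore", log),
                new NamedComponent("eventSink", log),
                new NamedComponent("eventParser", log));
        instance.start();
        instance.stop();
        for (String line : log) {
            System.out.println(line);
        }
    }
}
```

Stopping in reverse order means the parser stops producing before the components it feeds are shut down.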

In addition, Canal integrates with Spring, so that the core components of a Canal Instance are managed by the Spring container. The corresponding class is CanalInstanceWithSpring, and the components are wired together in a Spring configuration file.

Tip: a few thoughts on secondary development based on Canal. Out of the box, Canal packages Canal Server behind a startup script. Instances are defined through instance configuration files: the server loads the configuration, starts, parses the binlog and then works according to that predefined configuration. In production this typically means building a few Canal clusters and handing their manual maintenance to the operations staff; whenever data synchronization is needed, someone adds the corresponding instance file. This is really only the entry-level way of using Canal. A better approach is to build on top of Canal and define data synchronization tasks through a visual interface, for example "synchronize the binlog of the specified schema on the specified database instance to the specified topic of the specified message cluster", with the ability to start, stop and restart a task at any time. Canal maintainers then no longer need to care about the underlying details; they simply configure tasks on a page.
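The backend of such a management page could start from a small task registry like the one below. This is only a sketch of the idea: `SyncTask`, `SyncTaskRegistry` and all field names are hypothetical, and the state changes merely mark where a real implementation would build, start or stop the corresponding Canal Instance.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical description of one synchronization task defined through the UI.
class SyncTask {
    enum State { STOPPED, RUNNING }

    final String name;
    final String sourceAddress; // database instance, for example host:port
    final String schema;        // schema whose binlog events should be forwarded
    final String topic;         // target topic on the message cluster
    State state = State.STOPPED;

    SyncTask(String name, String sourceAddress, String schema, String topic) {
        this.name = name;
        this.sourceAddress = sourceAddress;
        this.schema = schema;
        this.topic = topic;
    }
}

// Hypothetical registry behind the management page.
class SyncTaskRegistry {
    private final Map<String, SyncTask> tasks = new LinkedHashMap<>();

    SyncTask define(String name, String sourceAddress, String schema, String topic) {
        if (tasks.containsKey(name)) {
            throw new IllegalArgumentException("task already defined: " + name);
        }
        SyncTask task = new SyncTask(name, sourceAddress, schema, topic);
        tasks.put(name, task);
        return task;
    }

    // A real implementation would create and start an instance for the task here.
    void start(String name) {
        require(name).state = SyncTask.State.RUNNING;
    }

    // A real implementation would stop the instance and keep its metadata here.
    void stop(String name) {
        require(name).state = SyncTask.State.STOPPED;
    }

    void restart(String name) {
        stop(name);
        start(name);
    }

    SyncTask.State stateOf(String name) {
        return require(name).state;
    }

    private SyncTask require(String name) {
        SyncTask task = tasks.get(name);
        if (task == null) {
            throw new IllegalArgumentException("unknown task: " + name);
        }
        return task;
    }
}

class SyncTaskRegistryDemo {
    public static void main(String[] args) {
        SyncTaskRegistry registry = new SyncTaskRegistry();
        registry.define("orders-to-mq", "192.168.1.166:3306", "order_db", "order_topic");
        registry.start("orders-to-mq");
        System.out.println(registry.stateOf("orders-to-mq"));
    }
}
```

Keeping the task definition separate from the running instance is what makes start, stop and restart possible without editing configuration files by hand.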

After the first article in this Canal source code series, several readers said that they are studying Canal as well. Since the author can only manage roughly one article per week, here are two suggestions if you want to speed up your own research:

1. Study the four core components in depth, and study them with questions in mind. For example, when looking at metadata management, ask how Canal ensures that no data is lost and how it locates the position to resume from after a restart.

2. If you want to study Canal as a whole, then besides reading the official Canal design documentation, take a close look at the class CanalParameter. It contains all the configuration properties that Canal supports, each with a comment. Almost everything about Canal leaves a clue there, and you can then pick whatever interests you for deeper study.

That concludes this look at the design concepts of Canal Instance and ideas for customized development. I hope it is of some help to you.
