Shulou (Shulou.com) 06/01 report, updated 2025-01-19
Many newcomers are unsure how to read the source code of canal's instance module. This article walks through it in detail, and hopefully you will get something out of it.
1. Basic structure
The instance module is divided into three sub-modules: core, manager and spring.
Of these, core holds the core logic of an instance.
manager and spring are just two different ways of reading the instance configuration: manager reads the configuration from admin through HTTP requests, while spring reads it from a configuration file.
The main control logic we mentioned in the deployer module source code analysis is in the instanceGenerator of the CanalController class, and the configuration parameter is canal.instance.global.mode.
A config is created from the destination, and the generator is chosen by mode:
If canal.instance.global.mode = manager, PlainCanalInstanceGenerator is used.
If canal.instance.global.mode = spring, SpringCanalInstanceGenerator is used.
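The original listing is not reproduced here. As a rough illustration of the selection logic only, the sketch below picks a generator from canal.instance.global.mode. InstanceGenerator and ModeSelectionSketch are hypothetical stand-ins, not canal's real types, and the returned strings merely name the generator that would be chosen.

```java
import java.util.Properties;

// Hypothetical stand-in for a generator: maps a destination to "an instance".
// Here the instance is just a descriptive string.
interface InstanceGenerator {
    String generate(String destination);
}

class ModeSelectionSketch {
    // Chooses a generator by mode; unknown or missing modes are rejected.
    static InstanceGenerator forMode(String mode) {
        if ("manager".equalsIgnoreCase(mode)) {
            return destination -> "PlainCanalInstanceGenerator:" + destination;
        } else if ("spring".equalsIgnoreCase(mode)) {
            return destination -> "SpringCanalInstanceGenerator:" + destination;
        }
        throw new IllegalArgumentException("unsupported canal.instance.global.mode: " + mode);
    }

    // Reads the mode from the given properties and delegates to the chosen generator.
    static String generate(Properties config, String destination) {
        String mode = config.getProperty("canal.instance.global.mode");
        return forMode(mode).generate(destination);
    }
}
```

Keeping the choice behind a single interface means the caller never needs to know which configuration source produced the instance.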
2. core sub-module
There is not much code, just two interfaces and two classes.
2.1 CanalInstanceGenerator interface
This interface has only one method, generate.
Its two implementations are the ones mentioned at the beginning, PlainCanalInstanceGenerator and SpringCanalInstanceGenerator, which live in the manager sub-module and the spring sub-module respectively.
Which one is used depends on canal.instance.global.mode, as evaluated in CanalController (see section 1).
2.2 CanalInstance interface
Start from the architecture diagram in the official documentation, which was analyzed in the previous article.
A server represents one running canal-server instance and corresponds to one JVM. A server can contain multiple instances.
There are four main components within Instance:
eventParser: data source access; simulates the slave protocol to interact with the master, and parses the protocol
eventSink: connector between parser and store; handles data filtering, processing and distribution
eventStore: data storage
metaManager: manager of incremental subscription and consumption information
The interface defines getters for these four components, as well as for the mqProducer configuration information added in newer versions (mqProducer was introduced in the server module analysis, which you can look back at).
Let's take a brief look at the various implementation classes of the four component interfaces.
CanalEventParser API implementation class (parser module):
MysqlEventParser: disguises itself as a slave of a single mysql instance and parses its binlog
GroupEventParser: disguises itself as a slave of multiple mysql instances and parses their binlogs. It maintains multiple CanalEventParser internally and combines them for merging; the group itself only acts as a delegate. The main application scenario is sharded databases and tables. For example, a large table is split across four databases on different mysql instances. Normally we would need to configure four CanalInstance, and the business side would have to start four clients, each linked to one instance. To make this easier, a CanalInstance can reference a GroupEventParser, which maintains four MysqlEventParser internally to pull binlog from the four mysql instances and merge the results. The business side then only needs to start one client and link it to this CanalInstance.
LocalBinlogEventParser: parses local mysql binlog files, for example binlog files copied from the mysql machine to the canal machine
RdsLocalBinlogEventParser: based on Aliyun RDS binlog backup files; downloads them and parses the binlog locally
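The delegation idea behind GroupEventParser can be sketched as follows. This is a simplified illustration, not canal's code: SketchParser, SingleSourceParser and GroupParserSketch are hypothetical names, and only the lifecycle is modelled (no binlog is read or merged).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal parser lifecycle; the real parser interface is richer.
interface SketchParser {
    void start();
    void stop();
    boolean isRunning();
}

// Stands in for a parser bound to one mysql instance.
class SingleSourceParser implements SketchParser {
    private final String source;
    private boolean running;

    SingleSourceParser(String source) { this.source = source; }

    public void start() { running = true; }
    public void stop() { running = false; }
    public boolean isRunning() { return running; }
    String source() { return source; }
}

// The group does no parsing itself; it only forwards calls to its members.
class GroupParserSketch implements SketchParser {
    private final List<SketchParser> delegates = new ArrayList<>();

    GroupParserSketch(List<? extends SketchParser> parsers) {
        delegates.addAll(parsers);
    }

    public void start() {
        for (SketchParser p : delegates) p.start();
    }

    public void stop() {
        for (SketchParser p : delegates) p.stop();
    }

    // Running only if there is at least one member and all members are running.
    public boolean isRunning() {
        if (delegates.isEmpty()) return false;
        for (SketchParser p : delegates) {
            if (!p.isRunning()) return false;
        }
        return true;
    }
}
```

With four SingleSourceParser members, a single start() on the group brings up all four, which is what lets the business side work with one instance and one client.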
CanalEventSink API implementation class (sink module):
EntryEventSink: ordinary sink operation of a single parser for data filtering, processing, and distribution
GroupEventSink: used for the scenario of sub-database and sub-table, corresponding to the data parsing of GroupEventParser, and then implementing sink processing based on merge and sorting
CanalEventStore API implementation class (store module):
MemoryEventStoreWithBuffer: memory-based store
CanalMetaManager API implementation class (meta module):
ZooKeeperMetaManager: storing metadata in zk
MemoryMetaManager: storing metadata in memory
MixedMetaManager: the usage mode of combining memory and zookeeper
PeriodMixedMetaManager: mixed implementation based on timing refresh strategy
FileMixedMetaManager: write memory first, and then refresh the data to File regularly
The specific details of these implementations are explained in the source code analysis of the corresponding modules. For now, all you need to know is that some components have multiple implementations, so combinations work in a variety of ways.
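To make one of these variants concrete, here is a minimal sketch of the "write memory first, flush later" idea described for FileMixedMetaManager. It is an assumption-laden illustration: WriteBehindMetaSketch is a hypothetical class, a second map stands in for the file, and flush() is called explicitly rather than by a timer.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class WriteBehindMetaSketch {
    private final Map<String, Long> memory = new HashMap<>();
    private final Set<String> dirty = new HashSet<>();
    // Stands in for the file; a real implementation would write to disk.
    private final Map<String, Long> durable = new HashMap<>();

    // Updates go to memory only and are marked dirty.
    synchronized void updateCursor(String clientId, long position) {
        memory.put(clientId, position);
        dirty.add(clientId);
    }

    // Reads are always served from memory.
    synchronized Long getCursor(String clientId) {
        return memory.get(clientId);
    }

    // Copies dirty entries to the durable store; returns how many were flushed.
    synchronized int flush() {
        int flushed = dirty.size();
        for (String clientId : dirty) {
            durable.put(clientId, memory.get(clientId));
        }
        dirty.clear();
        return flushed;
    }

    synchronized Long durableCursor(String clientId) {
        return durable.get(clientId);
    }
}
```

In a real deployment flush() would be driven by a scheduled task, trading a small window of possible loss for cheaper writes.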
2.3 AbstractCanalInstance Class
AbstractCanalInstance is an abstract implementation of CanalInstance that holds references to the related components.
It also implements the subscribeChange method of the CanalInstance interface.
This abstract class has two implementations, CanalInstanceWithManager and CanalInstanceWithSpring.
The initialization of AbstractCanalInstance is done in the implementation class.
In admin control mode it is done in CanalInstanceWithManager; in spring mode it is done in CanalInstanceWithSpring.
Here is a small discovery:
Looking closely at the actual call sites, CanalInstanceWithManager belongs to ManagerCanalInstanceGenerator, and that generator is not actually used. In admin mode, as we saw at the beginning of this article, PlainCanalInstanceGenerator is used instead. Its generate method is actually similar to that of SpringCanalInstanceGenerator: it pulls the configuration from the remote admin, replaces the system variables, and then builds the concrete instance from Spring's beanFactory.
2.3.1 subscribeChange() method
The AbstractCanalInstance class implements the subscribeChange method of the CanalInstance interface.
When the subscription relationship changes, the method reacts to it; in the code, the main thing it does is update the filter.
The filter specifies which databases and tables you want to subscribe to.
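A minimal sketch of "a subscription change replaces the filter" might look like this. The classes are hypothetical, and the filter format (comma-separated regular expressions matched against schema.table) is an assumption made for the example, not a statement about canal's filter implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical filter: accepts a name if any of its patterns matches fully.
class TableFilterSketch {
    private final List<Pattern> patterns = new ArrayList<>();

    TableFilterSketch(String expression) {
        for (String part : expression.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                patterns.add(Pattern.compile(trimmed));
            }
        }
    }

    boolean accepts(String schemaDotTable) {
        for (Pattern p : patterns) {
            if (p.matcher(schemaDotTable).matches()) return true;
        }
        return false;
    }
}

class SubscriptionSketch {
    // Default: accept every schema.table name.
    private volatile TableFilterSketch filter = new TableFilterSketch(".*\\..*");

    // On a subscription change, swap in a new filter if one was supplied.
    void subscribeChange(String newFilterExpression) {
        if (newFilterExpression != null && !newFilterExpression.trim().isEmpty()) {
            filter = new TableFilterSketch(newFilterExpression);
        }
    }

    boolean accepts(String schemaDotTable) {
        return filter.accepts(schemaDotTable);
    }
}
```

Swapping the whole filter object, rather than mutating it, keeps concurrent readers on a consistent set of patterns.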
2.3.2 start() method
start has no special logic; it simply starts each component in sequence.
The order is metaManager -> alarmHandler -> eventStore -> eventSink -> eventParser.
The startup order follows the dependencies: meta-information management is involved in everything, so metaManager starts first, and the others start one by one according to how they depend on each other.
Note that starting eventParser gets special handling, namely beforeStartEventParser and afterStartEventParser. This is discussed in 2.3.4.
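The ordering rule can be sketched with a stack: components are started in dependency order and stopped in the reverse order. LifecycleOrderSketch is a hypothetical class that only records names; it does not start real components.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class LifecycleOrderSketch {
    // Start order as listed in the text above.
    private static final String[] START_ORDER = {
        "metaManager", "alarmHandler", "eventStore", "eventSink", "eventParser"
    };

    private final List<String> log = new ArrayList<>();
    private final Deque<String> started = new ArrayDeque<>();

    // Starts components in order and remembers them on a stack.
    void start() {
        for (String component : START_ORDER) {
            log.add("start " + component);
            started.push(component);
        }
    }

    // Pops the stack, so the most recently started component stops first.
    void stop() {
        while (!started.isEmpty()) {
            log.add("stop " + started.pop());
        }
    }

    List<String> log() {
        return log;
    }
}
```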
2.3.3 stop() method
stop is nothing special either: it shuts down each component in turn.
The shutdown order is the reverse of the start order.
No code will be posted here.
2.3.4 Special handling of eventParser
Both start and stop wrap eventParser with special handling: beforeStartEventParser and afterStartEventParser for start, and beforeStopEventParser and afterStopEventParser for stop.
This is related to how eventParser is designed.
EventParser design
Each EventParser is associated with two internal components:
CanalLogPositionManager: records the position of the last successfully parsed binlog, which mainly determines where canal resumes on its next startup
CanalHAController: manages the hosts an EventParser links to and decides which mysql database is currently connected
The before... and after... methods therefore mainly take care of starting and stopping CanalLogPositionManager and CanalHAController.
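The hook structure can be sketched as a template method. Which hook starts or stops which helper is not spelled out above, so the placement below (helpers start before the parser and stop after it) is an assumption of this sketch, and ParserHooksSketch is a hypothetical class that only records event names.

```java
import java.util.ArrayList;
import java.util.List;

class ParserHooksSketch {
    final List<String> events = new ArrayList<>();

    // Template: hook, parser start, hook.
    void startParser() {
        beforeStartEventParser();
        events.add("eventParser.start");
        afterStartEventParser();
    }

    // Template: hook, parser stop, hook.
    void stopParser() {
        beforeStopEventParser();
        events.add("eventParser.stop");
        afterStopEventParser();
    }

    // Assumption: the helpers must be up before the parser begins reading.
    void beforeStartEventParser() {
        events.add("logPositionManager.start");
        events.add("haController.start");
    }

    // Left empty in this sketch.
    void afterStartEventParser() {
    }

    // Left empty in this sketch.
    void beforeStopEventParser() {
    }

    // Assumption: the helpers go down only after the parser has stopped.
    void afterStopEventParser() {
        events.add("haController.stop");
        events.add("logPositionManager.stop");
    }
}
```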
2.3.5 AbstractCanalInstance Class Summary
As you can see, AbstractCanalInstance does nothing but start and stop its internal components.
Once eventParser has been started in AbstractCanalInstance, it automatically launches multithreaded tasks that dump data and deliver it to eventStore through eventSink.
The logic that operates on eventStore actually lives in CanalServerWithEmbedded, so it is worth reviewing the logic of getWithoutAck() there.
The steps are:
Obtain the corresponding instance from the destination of the clientIdentity.
Get the position range (positionRanges) of the last batch of streamed data (associated with a batchId and kept in a map).
Get binlog from canalEventStore and convert it to events. Reading usually starts from the position of the last batchId; if there is no previous batchId, it starts from the consumption position recorded by the cursor; if the cursor is empty, it can only start from the first message in eventStore. (The relationship between these positions and ack is worth thinking through again; drawing a picture helps.)
Convert the events to entries, generate a new batchId, and combine them into a message that is returned to the client.
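The fallback order for the read position in the third step can be sketched as below. Positions are modelled as plain long values purely for illustration; StartPositionSketch and its parameter names are hypothetical and do not mirror canal's position types.

```java
import java.util.OptionalLong;

class StartPositionSketch {
    // Picks where the next read starts:
    // 1. the end of the last unacknowledged batch, if there is one;
    // 2. otherwise the consumption position recorded by the cursor;
    // 3. otherwise the first message available in the store.
    static long resolveStart(OptionalLong lastBatchEnd, OptionalLong cursor, long firstInStore) {
        if (lastBatchEnd.isPresent()) {
            return lastBatchEnd.getAsLong();
        }
        if (cursor.isPresent()) {
            return cursor.getAsLong();
        }
        return firstInStore;
    }
}
```

The cursor only moves on ack, while the last batch position moves on every get, which is why the batch position is consulted first.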
So the instance really is just a simple start and stop; the interaction between the components is implemented in CanalServerWithEmbedded, which gets the individual components from the instance.
3. spring module
As mentioned earlier, the generate method of PlainCanalInstanceGenerator is actually similar to that of SpringCanalInstanceGenerator: it pulls the configuration from the remote admin, replaces the system variables, and then builds the concrete instance from Spring's beanFactory.
So let's focus on the configuration in the spring sub-module.
It contains just four classes.
3.1 CanalInstanceWithSpring Class
This class starts a canal instance based on the Spring container, which makes it convenient to run independently of manager.
It extends AbstractCanalInstance and is essentially a series of setter methods for the components, so the source code is not pasted here.
The concrete configuration is done in Spring XML.
When the loading mode is set to spring, all CanalInstance objects created are of type CanalInstanceWithSpring. Canal looks for a local Spring configuration file to create the instance. Canal provides the following Spring configuration files by default:
spring/memory-instance.xml
spring/file-instance.xml
spring/default-instance.xml
spring/group-instance.xml
CanalInstanceWithSpring is configured in the same way in all four configuration files:
Of course, the ref of each component varies from one configuration file to another.
The main differences are in the metaManager and eventParser configurations, whose state may be stored in memory, in files, or in zk.
The definitions of eventStore and eventSink are the same everywhere. In the current open source version, eventStore has only one memory-based implementation. eventSink acts as the connector between eventParser and eventStore and handles data filtering, processing and distribution; no storage is involved, so there is no need to distinguish between memory, file, or zk.
3.2 SpringCanalInstanceGenerator class
This is the logic that actually creates the instance.
While we are here, take a look at the implementation in PlainCanalInstanceGenerator: it pulls the configuration from the remote end, replaces the variables with PropertyPlaceholderConfigurer, and then gets the instance from the beanFactory.
com.alibaba.otter.canal.instance.spring.support.PropertyPlaceholderConfigurer extends org.springframework.beans.factory.config.PropertyPlaceholderConfigurer; it sets dynamic properties, which replace the local properties.
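The override behaviour can be illustrated without Spring. The sketch below resolves ${key} placeholders, preferring a remote (dynamic) property over a local one. It is a hypothetical stand-alone helper, not canal's or Spring's PropertyPlaceholderConfigurer, and leaving unknown placeholders untouched is a choice made for this example.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PlaceholderSketch {
    // Matches ${some.key} and captures the key.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces each placeholder with the remote value if present,
    // otherwise the local value, otherwise leaves it unchanged.
    static String resolve(String template, Properties local, Properties remote) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String key = matcher.group(1);
            String value = remote.getProperty(key, local.getProperty(key));
            if (value == null) {
                value = matcher.group(0);
            }
            // quoteReplacement keeps '$' and '\' in values from being interpreted.
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

For example, if both property sets define the same key, a template referring to that key resolves to the remote value, which is the override effect described above.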
In fact, there are few things in this module, and there is no particularly complex logic.
Let's review a few questions.
What are the instance configuration modes and how to create an instance based on the configuration?
There are mainly two ways: based on spring and based on remote configuration. The former is implemented in SpringCanalInstanceGenerator, and the latter is implemented in PlainCanalInstanceGenerator.
How does the remote configuration override the local configuration?
PlainCanalInstanceGenerator uses a PropertyPlaceholderConfigurer, derived from Spring's class of the same name, to override the configuration.
What are the internal components of an instance?
An instance contains parser, sink, store, metaManager and other components, but it is only responsible for their start and stop logic; the actual interaction logic is implemented in CanalServerWithEmbedded.