2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains how to use PostgreSQL logical replication together with change data capture (CDC) to build a real-time data analysis platform.
Most databases provide CDC (change data capture). Database people may ask why this feature is needed: Oracle has Data Guard replication, SQL Server has replication, MySQL has binlog replication, and PostgreSQL has logical and physical replication, so isn't CDC redundant?
The answer is no. CDC is a way to track database operations and to obtain the changes made in the database. More often, once the changed data has been captured, it is used to drive follow-up triggering or decision logic.
More importantly, CDC requires few changes to the system and does not greatly affect database performance.
There are other ways to synchronize data as well. For example, the binlog of some databases, or triggers, can capture and record data changes.
The method chosen here is PostgreSQL logical replication plus the 2ndQuadrant audit-trigger.
PostgreSQL supports both physical and logical replication. Here, logical replication carries the DML operations on a table to another server, where they can be viewed and extracted.
First, configure the PostgreSQL instance whose changes are to be captured.
1. Logical replication must be turned on in PostgreSQL (wal_level = logical).
2. After this change, the server must be restarted.
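As a minimal sketch of these two steps, assuming a superuser psql session on the publishing server: ALTER SYSTEM is one way to set the parameter, and editing postgresql.conf by hand works as well. The values for max_replication_slots and max_wal_senders are illustrative assumptions, not settings taken from this article.

```sql
-- Run as a superuser on the publishing (master) server.
ALTER SYSTEM SET wal_level = 'logical';

-- Illustrative values (assumptions): each subscription uses a replication
-- slot and a WAL sender process on the publisher.
ALTER SYSTEM SET max_replication_slots = 10;
ALTER SYSTEM SET max_wal_senders = 10;

-- wal_level only takes effect after a server restart, for example with
-- pg_ctl restart or the service manager of your platform.

-- After the restart, confirm the setting:
SHOW wal_level;
```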
Create a publication for the table to be replicated:
CREATE PUBLICATION cdc FOR TABLE test1;
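The publication can be checked from the catalog, as in the sketch below. One detail that matters later, when a row is deleted on the master: PostgreSQL only replicates UPDATE and DELETE for tables that have a replica identity, which by default is the primary key. The REPLICA IDENTITY FULL line is an assumption for the case where test1 has no primary key; it is not something the article itself configures.

```sql
-- On the publisher: list the tables that belong to the publication.
SELECT pubname, schemaname, tablename
FROM pg_publication_tables
WHERE pubname = 'cdc';

-- Only needed if test1 has no primary key (assumption): use the whole
-- row as the replica identity so UPDATE and DELETE can be replicated.
ALTER TABLE test1 REPLICA IDENTITY FULL;
```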
You also need an account that has permission to read the test1 table and that has the REPLICATION privilege. For a quick test, a SUPERUSER account can be used instead.
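A sketch of such an account follows. The role name and password are taken from the connection string used for the subscription in this article; treat them as placeholders and choose your own. It is also assumed that pg_hba.conf on the publisher allows this user to connect from the subscriber host.

```sql
-- On the publisher: a login role that may start replication connections.
-- Name and password are placeholders matching the article's example.
CREATE ROLE admin WITH LOGIN REPLICATION PASSWORD '1234.com';

-- Read access to the published table, needed for the initial data copy.
GRANT SELECT ON TABLE test1 TO admin;
```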
Then, on the "slave" database that receives the data, create the database and the table structure corresponding to those on the master database.
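Logical replication does not copy table definitions, so the table has to exist on the receiving side first. The article does not show the columns of test1, so the definition below is purely hypothetical; in practice the DDL would be copied from the master, for instance from a schema-only dump.

```sql
-- On the subscriber. The columns are hypothetical; use the real
-- definition of test1 from the master database.
CREATE TABLE test1 (
    id   integer PRIMARY KEY,
    name text
);
```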
Create the subscription in the database that receives the data:
CREATE SUBSCRIPTION cdc CONNECTION 'dbname=test host=192.168.198.100 user=admin password=1234.com port=5432' PUBLICATION cdc;
The data now flows from the publication to the subscription.
Note that the table structures on the publication side and the subscription side must be consistent, otherwise replication will run into problems.
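One way to confirm that data is flowing is to look at the replication statistics on both sides and compare the table contents. This is a sketch using the standard system views; the row counts will of course depend on your data.

```sql
-- On the subscriber: state of the subscription worker.
SELECT subname, received_lsn, latest_end_lsn, latest_end_time
FROM pg_stat_subscription;

-- On the publisher: the replication slot created for the subscription.
SELECT slot_name, plugin, active, confirmed_flush_lsn
FROM pg_replication_slots;

-- On both sides: the counts should converge once replication catches up.
SELECT count(*) FROM test1;
```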
Next, install the 2ndQuadrant audit component: download it and install it on the server that receives the data.
Problems may occur during installation. For example, if the hstore extension is not installed, installing audit.sql fails.
Once hstore is available, execute \i /home/postgres/audit.sql in psql.
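The installation can be sketched as follows, assuming the hstore contrib module is available on the server and audit.sql is the 2ndQuadrant audit-trigger script at the path used above. Two points here are assumptions rather than steps shown in the article. First, audit.audit_table is the helper that the script provides to attach the audit triggers to a table. Second, the logical replication apply process runs with session_replication_role set to replica, so ordinary triggers on the subscriber do not fire for replicated rows unless they are enabled as ALWAYS or REPLICA; the trigger name audit_trigger_row is the one the commonly distributed script creates, so check it against your copy.

```sql
-- On the subscriber, in psql, connected to the receiving database.
CREATE EXTENSION IF NOT EXISTS hstore;

-- psql meta-command: load the audit schema, table and functions.
\i /home/postgres/audit.sql

-- Attach the audit triggers to the replicated table.
SELECT audit.audit_table('test1');

-- Let the row-level audit trigger fire for changes applied by
-- logical replication as well as for local changes.
ALTER TABLE test1 ENABLE ALWAYS TRIGGER audit_trigger_row;
```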
After installation, delete a row directly on the master database.
Then check on the slave database whether the corresponding record appears in audit.logged_actions.
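The test can be sketched like this. The id column and the value 1 are hypothetical, since the article does not show the contents of test1; the selected columns are those defined by the audit-trigger script.

```sql
-- On the master: delete one row (hypothetical key value).
DELETE FROM test1 WHERE id = 1;

-- On the slave: the most recent audited changes for the table.
-- action is 'I', 'U', 'D' or 'T'; row_data holds the row as hstore.
SELECT event_id, table_name, action, row_data, action_tstamp_clk
FROM audit.logged_actions
WHERE table_name = 'test1'
ORDER BY event_id DESC
LIMIT 10;
```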
With these audit records, a custom-developed program can pick up the data changes of a table in PostgreSQL. This provides a way to synchronize the data to other databases.
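How such a program reads the changes is not described in the article. One possible approach, sketched here under that assumption, is to poll audit.logged_actions and remember the last event_id that was processed. The cdc_bookmark table and the consumer name analytics_loader are hypothetical.

```sql
-- Hypothetical bookmark table: one row per consuming program.
CREATE TABLE IF NOT EXISTS cdc_bookmark (
    consumer      text PRIMARY KEY,
    last_event_id bigint NOT NULL DEFAULT 0
);

INSERT INTO cdc_bookmark (consumer)
VALUES ('analytics_loader')
ON CONFLICT (consumer) DO NOTHING;

-- Fetch the changes that have not been processed yet, oldest first.
SELECT a.event_id, a.action, a.row_data, a.changed_fields
FROM audit.logged_actions AS a
JOIN cdc_bookmark AS b ON b.consumer = 'analytics_loader'
WHERE a.table_name = 'test1'
  AND a.event_id > b.last_event_id
ORDER BY a.event_id;

-- After the batch has been written to the target system, advance the
-- bookmark. 42 stands in for the highest event_id that was processed.
UPDATE cdc_bookmark
SET last_event_id = 42
WHERE consumer = 'analytics_loader';
```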
The main technique used here is PostgreSQL logical replication, which is more flexible than physical replication, for example when aggregating data.
With a little more work, this method can become a complete data aggregation and data distribution platform based on PostgreSQL.
First, PostgreSQL logical replication copies the tables that need to be analyzed from multiple database systems (usually the databases of different business systems) to a PostgreSQL server that aggregates them. A custom-developed program then generates the CDC data and stores it in other databases or big data platforms as needed.
The benefit is clear: few ETL tools support real-time data extraction, and almost none of them are free, whereas this approach can cover the basic real-time data analysis needs of a business. Compared with other databases, building this with PostgreSQL is easier and cheaper, and the overall architecture is not very complicated.
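On the aggregating server, this amounts to one subscription per source system. The sketch below is only an illustration: the subscription names, hosts, database names, role and publication names are all hypothetical, and each source is assumed to have its own publication and a matching table already created on the aggregating server.

```sql
-- On the aggregating PostgreSQL server. All names, hosts and
-- credentials below are hypothetical placeholders.
CREATE SUBSCRIPTION sub_orders
    CONNECTION 'dbname=orders host=10.0.0.11 user=repl password=change_me port=5432'
    PUBLICATION cdc_orders;

CREATE SUBSCRIPTION sub_billing
    CONNECTION 'dbname=billing host=10.0.0.12 user=repl password=change_me port=5432'
    PUBLICATION cdc_billing;

-- Overview of the subscriptions defined in this database.
SELECT subname, subenabled, subpublications
FROM pg_subscription;
```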