This article explains how to configure the Oracle source side (and the corresponding Hadoop target side) for real-time replication with Oracle GoldenGate, walking through a complete worked example.
Exporting structured data from Oracle into a Hadoop system for offline computation is a common practice. Recently we encountered a scenario that requires importing from Oracle into Hadoop in real time; this article is a case study of that setup.
Because Oracle is a commercial database, it is difficult to obtain its transaction logs directly, so we use the official synchronization tool OGG (Oracle GoldenGate) to solve the problem.
Installation and basic configuration
Environment description
Software configuration
Role            Data store and version    OGG version                                               IP
Source server   Oracle Release 11.2.0.1   Oracle GoldenGate 11.2.1.0 for Oracle on Linux x86-64     10.0.0.25
Target server   Hadoop 2.7.2              Oracle GoldenGate for Big Data 12.2.0.1 on Linux x86-64   10.0.0.2
On the source server OGG is installed under the oracle user; on the target server it is installed under the root user.
Note
Oracle can also be exported to heterogeneous storage systems such as MySQL, DB2, and PostgreSQL, and official Oracle GoldenGate builds are available for different platforms such as AIX, Windows, and Linux. Download and install the build that matches your source and target platforms.
Oracle source-side basic configuration
Place the downloaded OGG package in a convenient location and unpack it. In this example the source-side directory is /u01/gg.
Configure environment variables
Here we add the OGG-related environment variables for the user that runs OGG. In this example the following variables are added for the oracle user (in /home/oracle/.bash_profile):
export OGG_HOME=/u01/gg/
export LD_LIBRARY_PATH=$ORACLE_HOME/lib:$OGG_HOME:/lib:/usr/lib
export CLASSPATH=$ORACLE_HOME/jdk/jre:$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib
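After editing, reload the profile so the variables take effect in the current shell (a generic step, shown here as a sketch):

# as the oracle user on the source server
source /home/oracle/.bash_profile
echo $OGG_HOME    # should print /u01/gg/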
Enable Oracle archive mode
Use the following command to check whether the database is currently in archive mode:
SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /u01/arch_log
Oldest online log sequence     6
Next log sequence to archive   8
Current log sequence           8
If the database is not in this state, adjust it manually:
SQL> conn / as sysdba                -- connect to the database as DBA
SQL> shutdown immediate              -- shut down the database immediately
SQL> startup mount                   -- start the instance and mount the database without opening it
SQL> alter database archivelog;      -- switch the database to archive mode
SQL> alter database open;            -- open the database
SQL> alter system archive log start; -- enable automatic archiving
Enable forced and supplemental logging
OGG relies on supplemental logging for real-time capture, so the relevant logging must be enabled to guarantee that transaction contents can be obtained. Check the current status with:
SQL> select force_logging, supplemental_log_data_min from v$database;

FOR SUPPLEME
--- --------
YES YES
If the above query result is not YES, you can modify the status with the following command:
SQL> alter database force logging;
SQL> alter database add supplemental log data;
Create a replication user
To keep permission management simple, we create a dedicated replication user and grant it the dba role.
SQL> create tablespace oggtbs datafile '/u01/...' size ... autoextend on;
SQL> create user ggs identified by ggs default tablespace oggtbs;
User created.
SQL> grant dba to ggs;
Grant succeeded.
Finally, the permissions of this ggs account are as follows:
SQL> select * from dba_sys_privs where GRANTEE='GGS';

GRANTEE   PRIVILEGE               ADM
GGS       DROP ANY DIRECTORY      NO
GGS       ALTER ANY TABLE         NO
GGS       ALTER SESSION           NO
GGS       SELECT ANY DICTIONARY   NO
GGS       CREATE ANY DIRECTORY    NO
GGS       RESTRICTED SESSION      NO
GGS       FLASHBACK ANY TABLE     NO
GGS       UPDATE ANY TABLE        NO
GGS       DELETE ANY TABLE        NO
GGS       CREATE TABLE            NO
GGS       INSERT ANY TABLE        NO
GGS       UNLIMITED TABLESPACE    NO
GGS       CREATE SESSION          NO
GGS       SELECT ANY TABLE        NO
OGG initialization
Go to the OGG home directory, run ./ggsci, and enter the OGG command line:
[oracle@VM_0_25_centos gg]$ ./ggsci

Oracle GoldenGate Command Interpreter for Oracle
Version 11.2.1.0.3 14400833 OGGCORE_11.2.1.0.3_PLATFORMS_120823.1258_FBO
Linux, x64, 64bit (optimized), Oracle 11g on Aug 23 2012 20:20:21
Copyright (C) 1995, 2012, Oracle and/or its affiliates. All rights reserved.

Run create subdirs to create the working directories:

GGSCI (VM_0_25_centos) 4> create subdirs

Creating subdirectories under current directory /u01/gg

Parameter files            /u01/gg/dirprm: already exists
Report files               /u01/gg/dirrpt: already exists
Checkpoint files           /u01/gg/dirchk: already exists
Process status files       /u01/gg/dirpcs: already exists
SQL script files           /u01/gg/dirsql: already exists
Database definitions files /u01/gg/dirdef: already exists
Extract data files         /u01/gg/dirdat: already exists
Temporary files            /u01/gg/dirtmp: already exists
Stdout files               /u01/gg/dirout: already exists
Create a test table to replicate
Create a user tcloud with password tcloud, and under this user create a table named t_ogg:
SQL> create user tcloud identified by tcloud default tablespace users;
User created.
SQL> grant dba to tcloud;
Grant succeeded.
SQL> conn tcloud/tcloud;
Connected.
SQL> create table t_ogg(id int, text_name varchar(20), primary key(id));
Table created.
Target side basic configuration
Place the downloaded OGG for Big Data package in a convenient location and unpack it. In this example the target-side directory is /data/gg.
Configure environment variables
The target side needs the HDFS client libraries, so configure the Java environment variables and the OGG-related variables, and pull in the HDFS library files. A reference configuration:
export JAVA_HOME=/usr/java/jdk1.7.0_75/
export LD_LIBRARY_PATH=/usr/java/jdk1.7.0_75/jre/lib/amd64:/usr/java/jdk1.7.0_75/jre/lib/amd64/server:/usr/java/jdk1.7.0_75/jre/lib/amd64/libjsig.so:/usr/java/jdk1.7.0_75/jre/lib/amd64/server/libjvm.so:$OGG_HOME:/lib
export OGG_HOME=/data/gg
OGG initialization
OGG initialization on the target side mirrors the source side: go to the OGG home directory, run ./ggsci to enter the OGG command line, and run create subdirs:
GGSCI (10.0.0.2) 2> create subdirs

Creating subdirectories under current directory /data/gg

Parameter files            /data/gg/dirprm: already exists
Report files               /data/gg/dirrpt: already exists
Checkpoint files           /data/gg/dirchk: already exists
Process status files       /data/gg/dirpcs: already exists
SQL script files           /data/gg/dirsql: already exists
Database definitions files /data/gg/dirdef: already exists
Extract data files         /data/gg/dirdat: already exists
Temporary files            /data/gg/dirtmp: already exists
Credential store files     /data/gg/dircrd: already exists
Masterkey wallet files     /data/gg/dirwlt: already exists
Dump files                 /data/gg/dirdmp: already exists
Oracle source configuration
The flow of real-time replication from Oracle to a Hadoop cluster (HDFS, Hive, Kafka, etc.) breaks down into the following steps:

1. Configure the OGG manager (mgr) on both the source and the target side.
2. On the source side, configure the extract process that captures the Oracle logs.
3. On the source side, configure the pump process that ships the captured trail files to the target.
4. On the target side, configure the replicate process that applies the logs to the Hadoop cluster, or hands them to a user-defined parser that writes the final result into the cluster.
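In outline, using the process names configured later in this article, the data flows roughly as follows:

Oracle redo / archive logs
        |  extract (ext2hd)
        v
source trail  /u01/gg/dirdat/tc
        |  pump (push3hd), over TCP to the target mgr port 7809
        v
target trail  /data/gg/dirdat/tc
        |  replicate (r2hdfs / r2kafka)
        v
HDFS / Hive / Kafka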
Configure global variables
In the OGG home directory on the source server, run ./ggsci to enter the OGG command line and execute:
GGSCI (VM_0_25_centos) 1> dblogin userid ggs password ggs
Successfully logged into database.

GGSCI (VM_0_25_centos) 3> view params ./globals
ggschema ggs
If the ./globals parameter file does not exist yet, use edit params ./globals to create and edit it (the editor defaults to vim).
Configure the manager (mgr)
Execute the following command from the OGG command line:
GGSCI (VM_0_25_centos) 4> edit param mgr

PORT 7809
DYNAMICPORTLIST 7810-7909
AUTORESTART EXTRACT *, RETRIES 5, WAITMINUTES 3
PURGEOLDEXTRACTS ./dirdat/*, usecheckpoints, minkeepdays 3
Notes: PORT is the port mgr listens on. DYNAMICPORTLIST is the dynamic port list; when the specified mgr port is unavailable, one of these ports is chosen, and at most 256 may be specified. AUTORESTART restarts all EXTRACT processes on failure, up to 5 times at 3-minute intervals. PURGEOLDEXTRACTS periodically cleans up old trail files.
Run start mgr at the command line to start the manager process, then check its status with info mgr.
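For example (the acknowledgement shown is GGSCI's usual response):

GGSCI (VM_0_25_centos) 4> start mgr
Manager started.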
GGSCI (VM_0_25_centos) 5> info mgr
Manager is running (IP port VM_0_25_centos.7809).

Add the replication table
Add the table that needs to be replicated under the OGG command line, as follows:
GGSCI (VM_0_25_centos) 7> add trandata tcloud.t_ogg
Logging of supplemental redo data enabled for table TCLOUD.T_OGG.

GGSCI (VM_0_25_centos) 8> info trandata tcloud.t_ogg
Logging of supplemental redo log data is enabled for table TCLOUD.T_OGG.
Columns supplementally logged for table TCLOUD.T_OGG: ID.

Configure the extract process
To configure the extract process, execute the following command at the OGG command line:
GGSCI (VM_0_25_centos) 10> edit params ext2hd

extract ext2hd
dynamicresolution
SETENV (ORACLE_SID = "orcl")
SETENV (NLS_LANG = "american_america.AL32UTF8")
userid ggs,password ggs
exttrail /u01/gg/dirdat/tc
table tcloud.t_ogg;
Notes: the first line names the extract process. dynamicresolution enables dynamic resolution. SETENV sets environment variables, here the Oracle SID and the character set. userid ggs,password ggs is the account OGG uses to connect to the Oracle database, the replication user created earlier. exttrail defines the location and prefix of the trail files; the prefix may be at most two characters, and OGG generates the rest of each file name. table names the table to replicate; it supports the * wildcard, and the entry must end with a semicolon.
Next, from the OGG command line, execute the following command to add the extract process:
GGSCI (VM_0_25_centos) 11> add extract ext2hd,tranlog,begin now
EXTRACT added.
Finally, add the definition of the trail file to bind to the extract process:
GGSCI (VM_0_25_centos) 12> add exttrail /u01/gg/dirdat/tc,extract ext2hd
EXTTRAIL added.
You can view the status through the info command from the OGG command line:
GGSCI (VM_0_25_centos) 14> info ext2hd
EXTRACT    EXT2HD    Initialized   2016-11-09 15:37   Status STOPPED
Checkpoint Lag       00:00:00 (updated 00:02:32 ago)
Log Read Checkpoint  Oracle Redo Logs
                     2016-11-09 15:37:14  Seqno 0, RBA 0
                     SCN 0 (0)

Configure the pump process
The pump process is essentially an extract process whose only job is to ship the trail files to the target side. Its configuration is similar to the extract process; it is just logically called a pump.
Execute under the OGG command line:
GGSCI (VM_0_25_centos) 16> edit params push3hd

extract push3hd
passthru
dynamicresolution
userid ggs,password ggs
rmthost 10.0.0.2 mgrport 7809
rmttrail /data/gg/dirdat/tc
table tcloud.t_ogg;
Notes: the first line names the (pump) extract process. passthru disables interaction between OGG and Oracle; since this process only pumps the trail onward, no database lookups are needed. dynamicresolution enables dynamic resolution. userid ggs,password ggs is the replication account created earlier. rmthost and mgrport give the address and listening port of the target side's mgr service. rmttrail is the location and prefix of the trail files on the target side.
Bind the local trail file and the target side trail file to the extract process, respectively:
GGSCI (VM_0_25_centos) 17> add extract push3hd,exttrailsource /u01/gg/dirdat/tc
EXTRACT added.

GGSCI (VM_0_25_centos) 18> add rmttrail /data/gg/dirdat/tc,extract push3hd
RMTTRAIL added.
You can also use info to view the process status from the OGG command line:
GGSCI (VM_0_25_centos) 19> info push3hd
EXTRACT    PUSH3HD   Initialized   2016-11-09 15:52   Status STOPPED
Checkpoint Lag       00:00:00 (updated 00:01:04 ago)
Log Read Checkpoint  File /u01/gg/dirdat/tc000000
                     First Record  RBA 0

Configure the definitions file
Replication between Oracle and MySQL or a Hadoop cluster (HDFS, Hive, Kafka, etc.) is a transfer between heterogeneous data types, so the mapping between source and target tables must be defined. At the OGG command line, execute:
GGSCI (VM_0_25_centos) 20> edit params tcloud

defsfile /u01/gg/dirdef/tcloud.t_ogg
userid ggs,password ggs
table tcloud.t_ogg;
Execute under the OGG home directory:
./defgen paramfile dirprm/tcloud.prm
When it finishes, a definitions file /u01/gg/dirdef/tcloud.t_ogg is generated. Copy it to the dirdef directory under the OGG home directory on the target side, as in the sketch below.
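A minimal way to do the copy, assuming SSH access from the source server to the target (user and paths as in this example):

# run on the source server; 10.0.0.2 is the target server from the table above
scp /u01/gg/dirdef/tcloud.t_ogg root@10.0.0.2:/data/gg/dirdef/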
Target side configuration
Create the target table (directory)
When the target is an HDFS directory, a Hive table, or a MySQL database, the directory or table must be created on the target side by hand. The procedure is similar in each case; here we replicate into an HDFS directory in real time, so create the receiving directory manually:
hadoop fs -mkdir /gg/replication/hive/
Configure the manager (mgr)
The target-side OGG manager (mgr) configuration is similar to the source side. At the OGG command line:
GGSCI (10.0.0.2) 2> edit params mgr

PORT 7809
DYNAMICPORTLIST 7810-7909
AUTORESTART EXTRACT *, RETRIES 5, WAITMINUTES 3
PURGEOLDEXTRACTS ./dirdat/*, usecheckpoints, minkeepdays 3

Configure the checkpoint table
The checkpoint table records a traceable replication offset. Add it to the global configuration:
GGSCI (10.0.0.2) 5> edit params ./GLOBALS
CHECKPOINTTABLE tcloud.checkpoint
Just save it.
Configure the replicate process
Execute under the command line of OGG:
GGSCI (10.0.0.2) 8> edit params r2hdfs

REPLICAT r2hdfs
sourcedefs /data/gg/dirdef/tcloud.t_ogg
TARGETDB LIBFILE libggjava.so SET property=dirprm/hdfs.props
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 10000
MAP tcloud.t_ogg, TARGET tcloud.t_ogg;
Notes: REPLICAT r2hdfs names the replicate process. sourcedefs is the table-definitions file generated on the source server and copied over. TARGETDB LIBFILE loads the adapter library for HDFS together with its configuration file dirprm/hdfs.props under the OGG home directory. REPORTCOUNT sets how often the replication task writes its report. GROUPTRANSOPS groups transactions into larger units to reduce I/O. MAP defines the mapping between the source and target tables.
The main settings in the dirprm/hdfs.props configuration, with comments:
gg.handlerlist=hdfs                                // handler type used by OGG for Big Data
gg.handler.hdfs.type=hdfs                          // HDFS target
gg.handler.hdfs.rootFilePath=/gg/replication/hive/ // root HDFS directory for replicated data
gg.handler.hdfs.mode=op                            // transfer mode: op = once per SQL operation, tx = once per transaction
gg.handler.hdfs.format=delimitedtext               // file format written to HDFS
gg.classpath=/usr/hdp/2.2.0.0-2041/hadoop/share/hadoop/common/*:/usr/hdp/2.2.0.0-2041/hadoop/share/hadoop/common/lib/*:/usr/hdp/2.2.0.0-2041/hadoop/share/hadoop/hdfs/*:/usr/hdp/2.2.0.0-2041/hadoop/etc/hadoop/:/data/gg/:/data/gg/lib/*:/usr/hdp/2.2.0.0-2041/...   // HDFS client libraries used by OGG for Big Data
For the full list of parameters supported by OGG for Big Data and their definitions, refer to the official documentation.
Finally, execute it under the command line of OGG:
GGSCI (10.0.0.2) 9> add replicat r2hdfs exttrail /data/gg/dirdat/tc,checkpointtable tcloud.checkpoint
REPLICAT added.
This binds the target-side trail file to the replicate process.
Test
Start the processes
Start all processes in the form of start [process name] under the OGG command line on the source side and the destination side.
The startup order is: source mgr, target mgr, source extract, source pump, target replicate; see the sketch below.
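As a concrete sketch using the process names from this example (run each command in the GGSCI prompt on the indicated server):

-- 1. source side
GGSCI (VM_0_25_centos) > start mgr
-- 2. target side
GGSCI (10.0.0.2) > start mgr
-- 3. source side
GGSCI (VM_0_25_centos) > start ext2hd
-- 4. source side
GGSCI (VM_0_25_centos) > start push3hd
-- 5. target side
GGSCI (10.0.0.2) > start r2hdfs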
Check process status
Once everything is started, check the status of each process with info [process name] at the OGG command line on both the source and the target side:
Source side:
GGSCI (VM_0_25_centos) 7> info mgr
Manager is running (IP port VM_0_25_centos.7809).

GGSCI (VM_0_25_centos) 9> info ext2hd
EXTRACT    EXT2HD    Last Started 2016-11-09 16:05   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:09 ago)
Log Read Checkpoint  Oracle Redo Logs
                     2016-11-09 16:45:51  Seqno 8, RBA 132864000
                     SCN 0.1452333 (1452333)

GGSCI (VM_0_25_centos) 10> info push3hd
EXTRACT    PUSH3HD   Last Started 2016-11-09 16:05   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:01 ago)
Log Read Checkpoint  File /u01/gg/dirdat/tc000000
                     First Record  RBA 1043
Destination side:
GGSCI (10.0.0.2) 13> info mgr
Manager is running (IP port 10.0.0.2.7809, Process ID 8242).

GGSCI (10.0.0.2) 14> info r2hdfs
REPLICAT   R2HDFS    Last Started 2016-11-09 16:45   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:02 ago)
Process ID           4733
Log Read Checkpoint  File /data/gg/dirdat/tc000000
                     First Record  RBA 0
All statuses show RUNNING. (You can also use info all to view every process at once.)
Test the effect of synchronous updates
Testing is straightforward: run insert, update, and delete statements directly against the source table. Because the Oracle-to-Hadoop synchronization is heterogeneous, the truncate operation is not yet supported.
Run an insert on the source side:
SQL> conn tcloud/tcloud
Connected.
SQL> select * from t_ogg;
no rows selected
SQL> desc t_ogg;
 Name          Null?     Type
 ------------- --------- --------------
 ID            NOT NULL  NUMBER(38)
 TEXT_NAME               VARCHAR2(20)
SQL> insert into t_ogg values(1, 'test');
1 row created.
SQL> commit;
Commit complete.
View the status of source-side trail files
[oracle@VM_0_25_centos dirdat]$ ls -l /u01/gg/dirdat/tc*
Check the status of the destination trail file
[root@10 dirdat]# ls -l /data/gg/dirdat/tc*
-rw-r----- 1 root root 1217 Nov  9 17:05 /data/gg/dirdat/tc000000
Check to see if there are any writes in HDFS
hadoop fs -ls /gg/replication/hive/tcloud.t_ogg
-rw-rw-r-- 3 root hdfs 110 2016-11-09 17:05 /gg/replication/hive/tcloud.t_ogg/tcloud.t_ogg_2016-11-09_17-05-30.514.txt
A record written to HDFS looks like this (the fields are separated by a non-printing delimiter; spaces are used here for readability):
I  TCLOUD.T_OGG  2016-11-09 09:05:25.067082  2016-11-09T17:05:30.512000  00000000000000001080  ID  1  TEXT_NAME  test
Clearly, Oracle's data has been imported into HDFS in quasi-real time. What is imported is essentially a change log; the exact layout depends on the transfer format. For the delimitedtext format used in this example it is: operation type, schema.table name, operation timestamp (GMT+0), current timestamp (GMT+8), offset, field 1 name, field 1 value, field 2 name, field 2 value, and so on. To reproduce Oracle's table contents exactly, you must parse this log and write the result into Hive yourself; no such adapter is provided officially, though Tencent has implemented this function in-house.
Of course, you can also create an external table on Hive over this HDFS path via LOCATION, achieving quasi-real-time import into Hive. A sketch follows.
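A minimal sketch of such an external table, assuming the record layout described above and OGG's Hive-friendly \001 default field delimiter (verify against your hdfs.props; the table and column names here are illustrative):

-- Hypothetical Hive external table over the OGG delimited-text output.
-- Layout follows the change-log format above: op type, table, two
-- timestamps, offset, then one name/value pair per replicated column.
CREATE EXTERNAL TABLE t_ogg_changelog (
  op_type     STRING,   -- I / U / D
  tbl         STRING,   -- e.g. TCLOUD.T_OGG
  op_ts       STRING,   -- operation timestamp (GMT+0)
  current_ts  STRING,   -- processing timestamp (GMT+8)
  pos         STRING,   -- trail offset
  id_name     STRING,   -- the literal "ID"
  id_value    STRING,
  text_name   STRING,   -- the literal "TEXT_NAME"
  text_value  STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001'
LOCATION '/gg/replication/hive/tcloud.t_ogg';

Querying this table shows each replicated operation as one row; reconstructing the current state of t_ogg from it is the application-layer parsing work described above.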
Summary
OGG for Big Data provides the interface for real-time synchronization from Oracle to Hadoop systems, but the resulting logs still have to be parsed at the application layer (the OGG versions targeting relational databases such as MySQL already do this parsing, so no manual work is needed there).
The configuration of OGG's main processes (mgr, extract, pump, replicate) is straightforward, and real-time synchronization between OGG and a heterogeneous storage system can be set up quickly. If a new table is added later, modify the corresponding extract, pump, and replicate processes, as sketched below. For an entire schema, you can instead use wildcards when configuring those processes.
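For instance, to add a hypothetical table tcloud.t_new to the existing processes (a sketch; the process names are the ones used in this example):

-- source side: enable supplemental logging for the new table
GGSCI> dblogin userid ggs password ggs
GGSCI> add trandata tcloud.t_new

-- source side: append to the ext2hd and push3hd parameter files
table tcloud.t_new;

-- target side: append a mapping to the r2hdfs parameter file
MAP tcloud.t_new, TARGET tcloud.t_new;

-- then restart the affected processes, e.g.
GGSCI> stop ext2hd
GGSCI> start ext2hd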
Appendix
For real-time synchronization from OGG to a Hadoop system, when the source-side extract and pump configuration stays unchanged, you only need to add a replicate process and a synchronization target on the target side. Below is a brief description of adding Kafka as a target in this example.
The extract and pump processes already exist, so only a replicate process that writes to Kafka needs to be added on the target side.
Execute under the command line of OGG:
GGSCI (10.0.0.2) 4> edit params r2kafka

REPLICAT r2kafka
sourcedefs /data/gg/dirdef/tcloud.t_ogg
TARGETDB LIBFILE libggjava.so SET property=dirprm/r2kafka.props
REPORTCOUNT EVERY 1 MINUTES, RATE
GROUPTRANSOPS 10000
MAP tcloud.t_ogg, TARGET tcloud.t_ogg;
The replicate process is configured like the HDFS one, except that it references a different properties file, dirprm/r2kafka.props, whose main settings are:
gg.handlerlist=kafkahandler                    // handler type
gg.handler.kafkahandler.type=kafka
gg.handler.kafkahandler.KafkaProducerConfigFile=custom_kafka_producer.properties   // Kafka producer configuration file
gg.handler.kafkahandler.TopicName=ggtopic      // Kafka topic name; it does not need to be created manually
gg.handler.kafkahandler.format=json            // transfer format, e.g. json, xml
gg.handler.kafkahandler.mode=op                // transfer mode: op = once per SQL operation, tx = once per transaction
gg.classpath=dirprm/:/usr/hdp/2.2.0.0-2041/... // related library files
The custom_kafka_producer.properties file referenced by r2kafka.props holds the Kafka producer configuration:
bootstrap.servers=10.0.0.62:6667    // Kafka broker address
acks=1
compression.type=gzip               // compression type
reconnect.backoff.ms=1000           // reconnection delay
value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
key.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
batch.size=102400
linger.ms=10000
For these and other configurable items, refer to the official documentation.
With the configuration in place, bind the trail file to the replicate process at the OGG command line and start the Kafka-bound replicate process:
GGSCI (10.0.0.2) 5> add replicat r2kafka exttrail /data/gg/dirdat/tc,checkpointtable tcloud.checkpoint
REPLICAT added.

GGSCI (10.0.0.2) 6> start r2kafka
Sending START request to MANAGER...
REPLICAT R2KAFKA starting

GGSCI (10.0.0.2) 10> info r2kafka
REPLICAT   R2KAFKA   Last Started 2016-11-09 17:59   Status RUNNING
Checkpoint Lag       00:00:00 (updated 00:00:09 ago)
Process ID           5236
Log Read Checkpoint  File /data/gg/dirdat/tc000000
                     2016-11-09 17:05:25.067082  RBA 1217
To check the real-time synchronization into Kafka, update the table on the Oracle source side while using the script shipped with the Kafka client to watch the messages on the ggtopic topic configured above:
SQL> insert into t_ogg values(2, 'test2');
1 row created.
SQL> commit;
Commit complete.
The synchronized messages on the target Kafka side:
[root@10 kafka]# bin/kafka-console-consumer.sh --zookeeper 10.0.0.223:2181 --from-beginning --topic ggtopic
{"table":"TCLOUD.T_OGG","op_type":"I","op_ts":"2016-11-09 09:05:25.067082","current_ts":"2016-11-09T17:59:20.943000","pos":"0000000000001080","after":{"ID":"1","TEXT_NAME":"test"}}
{"table":"TCLOUD.T_OGG","op_type":"I","op_ts":"2016-11-09 10:02:06.827204","current_ts":"2016-11-09T18:02:12.323000","pos":"0000000000001217","after":{"ID":"2","TEXT_NAME":"test2"}}
Clearly, Oracle's data has been synchronized to Kafka in quasi-real time; since the consumer reads the topic from the beginning, the records synchronized earlier also appear. Architecturally, a stream processor such as Storm or Spark Streaming can consume the Kafka messages directly for business-logic processing.
For real-time synchronization from Oracle to other Hadoop-ecosystem components, the latest official release provides handlers for HDFS, HBase, Flume, and Kafka. For the corresponding configuration, refer to the example configurations on the official website.