What are the INFORMATICA development specifications? This article summarizes them: responsibility assignment for ETL development and operations, database constraints, naming rules, and the standard procedures for creating connections, folders, and Mappings.
Responsibility for ETL development and operations
1) For business data transferred within the information department, the principle is "the consumer develops": whoever needs access to the data is responsible for developing the ETL version, and the operations staff of the target system handle operation and maintenance.
2) For business systems managed independently by business users, with no dedicated development team, the development team on the data source side is responsible for developing the Informatica version that pushes the data, and the source system's staff operate and maintain the corresponding workflows.
3 Operational constraints on the ETL development database
1) For incrementally synchronized tables, both the source and target tables must have primary keys.
2) The incremental field must be indexed.
3) Incremental fields (such as timestamps) must enter the database in strict order, or it must be strictly guaranteed that no data will later be written into a range that incremental synchronization has already covered.
Synchronization scenarios and requirements:

| Synchronization type | Primary key or unique key on source and target tables | Index on the incremental field (source and target) | Update synchronization | Delete synchronization |
|---|---|---|---|---|
| Increment on an identifying field | A constraint or index exists | An index must exist | An updatetime field must exist, with a constraint or index on it, for update synchronization | Deleted rows are written to a temporary table; after the temporary table is synchronized, the ETL deletes the corresponding rows in the target database, and the source side cleans the temporary table periodically |
| Full read of the source table | No constraint or index | N/A | N/A | N/A |
| Full read of the source table | A constraint or index exists | N/A | N/A | N/A |
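The half-open incremental window these constraints imply can be sketched in a few lines; this is an illustration only (the field name `updatetime` comes from the table above, the row values are hypothetical, and Python stands in for the ETL tool):

```python
from datetime import datetime

def incremental_rows(rows, start_dt, end_dt):
    """Select rows whose updatetime falls in the half-open window
    [start_dt, end_dt) -- the end is exclusive, so the next run can
    start exactly at end_dt without overlap or loss."""
    return [r for r in rows if start_dt <= r["updatetime"] < end_dt]

rows = [
    {"id": 1, "updatetime": datetime(2024, 1, 1, 10, 0)},
    {"id": 2, "updatetime": datetime(2024, 1, 1, 11, 0)},
    {"id": 3, "updatetime": datetime(2024, 1, 1, 12, 0)},
]
picked = incremental_rows(rows, datetime(2024, 1, 1, 10, 30),
                          datetime(2024, 1, 1, 12, 0))
print([r["id"] for r in picked])  # → [2]; row 3 is excluded because the window end is exclusive
```

Row 3 is picked up by the next window starting at 12:00, which is exactly why the end bound must be exclusive.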
4 Definitions

| No. | Term or acronym | Definition |
|---|---|---|
| 1 | ETL | Extraction-Transformation-Loading: data extraction, transformation, and loading |
| 2 | Source | Data source |
| 3 | Target | Data target |
| 4 | Transformation | Transformation component |
| 5 | Mapping | Data mapping |
| 6 | Mapplet | Data mapping set: a reusable combination of Transformations |
| 7 | Session | Execution task |
| 8 | Worklet | Data working set |
| 9 | Workflow | Data workflow |
| 10 | Schedule | Scheduling frequency |
| 11 | Parameter | Parameter |
| 12 | ETLUser | Database user used for ETL data synchronization |
| 13 | ProductDatabaseSID | SID of the production system database |
5 Scope
Readers of this document include:
- Project managers
- System administrators
- DBAs
- Developers
- Testers
- Operations and maintenance staff
Technologies used in this project:
- ETL data integration and transformation: Informatica
- Operating systems: Linux, Windows
- Databases: Oracle, MySQL, DB2, MS SQL Server, etc.
6 General system properties

| No. | Service variable | Description | Value |
|---|---|---|---|
| 1 | $PMRootDir | infa_shared root directory | \server\infa_shared |
| 2 | $PMSessionLogDir | Session run log directory | $PMRootDir/SessLogs |
| 3 | $PMBadFileDir | Rejected-record (bad) file directory | $PMRootDir/BadFiles |
| 4 | $PMCacheDir | Temporary cache files | $PMRootDir/Cache |
| 5 | $PMTargetFileDir | Target file generation directory | $PMRootDir/TgtFiles |
| 6 | $PMSourceFileDir | Flat-file source directory | $PMRootDir/SrcFiles |
| 7 | $PMWorkflowLogDir | Workflow execution log directory | $PMRootDir/WorkflowLogs |
| 8 | $PMLookupFileDir | Lookup cache directory | $PMRootDir/LkpFiles |
| 9 | $PMTempDir | Temporary file directory | $PMRootDir/Temp |
| 10 | $PMStorageDir | Workflow run-state storage (used under HA) | $PMRootDir/Storage |
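All of these directories resolve against $PMRootDir; the mapping can be sketched as below (an illustration only; the concrete root path is an assumption, and the real resolution is done by the Informatica service itself):

```python
# Hypothetical root; the table above uses \server\infa_shared on the server.
PM_ROOT_DIR = "/server/infa_shared"

# Service variables that are defined relative to $PMRootDir.
SERVICE_VARS = {
    "$PMSessionLogDir": "SessLogs",
    "$PMBadFileDir": "BadFiles",
    "$PMCacheDir": "Cache",
    "$PMTargetFileDir": "TgtFiles",
    "$PMSourceFileDir": "SrcFiles",
    "$PMWorkflowLogDir": "WorkflowLogs",
    "$PMLookupFileDir": "LkpFiles",
    "$PMTempDir": "Temp",
    "$PMStorageDir": "Storage",
}

def resolve(var):
    """Expand a $PMRootDir-relative service variable to a full path."""
    return f"{PM_ROOT_DIR}/{SERVICE_VARS[var]}"

print(resolve("$PMSessionLogDir"))  # → /server/infa_shared/SessLogs
```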
7 Naming rules
7.1 General rules
The following elements are all written in uppercase: database tables, field names, function names, function expressions, and SQL statements.
7.2 Connection (data source) naming
7.2.1 Connection naming
Data connections are divided into source database connections and target database connections, corresponding to the E (extraction) and L (load) of ETL.
Database connections can be Native or ODBC:
1) Native uses the corresponding database's own client to extract and load data, e.g. Oracle, DB2.
2) ODBC uses DataDirect ODBC to connect to the database, e.g. MySQL and MS SQL Server.
Connections are named DataBaseType_ProductDatabaseSID_ETLUser.
Description: DataBaseType is the data source type, ProductDatabaseSID is the SID of the production database, and ETLUser is the user used for ETL data synchronization.
For example, Ora_ASURE_BILETL is a Native connection: Ora indicates the database type Oracle, ASURE is the SID of the Asura production database, and BILETL is the ETL user.
For example, ODBC_Mysql_ASURE_BILETL: ODBC means the connection uses ODBC, Mysql is the database type, ASURE is the Asura system, and BILETL is the ETL user.
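The naming rule can be expressed as a small helper; this is a sketch assuming the abbreviations of Table 5-1 below, not a tool the document prescribes:

```python
# Abbreviation map mirroring Table 5-1 (trailing underscores are added
# by the join below).
DB_ABBREV = {
    "Oracle": "Ora", "DB2": "DB2", "Mysql": "Mysql",
    "Microsoft SQL Server": "MSSQL", "Sybase": "Sybase",
    "Greenplum": "GP", "Teradata": "TD",
}

def connection_name(db_type, sid, etl_user, odbc=False):
    """Native: <Abbrev>_<SID>_<ETLUser>; ODBC: ODBC_<Abbrev>_<SID>_<ETLUser>."""
    parts = ["ODBC"] if odbc else []
    parts += [DB_ABBREV[db_type], sid, etl_user]
    return "_".join(parts)

print(connection_name("Oracle", "ASURE", "BILETL"))            # → Ora_ASURE_BILETL
print(connection_name("Mysql", "ASURE", "BILETL", odbc=True))  # → ODBC_Mysql_ASURE_BILETL
```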
7.2.2 Database type abbreviations
Table 5-1 Database type abbreviations

| No. | Data source type | Abbreviation |
|---|---|---|
| 1 | Oracle | Ora_ |
| 2 | DB2 | DB2_ |
| 3 | Mysql | Mysql_ |
| 4 | Microsoft SQL Server | MSSQL_ |
| 5 | Sybase | Sybase_ |
| 6 | Greenplum | GP_ |
| 7 | Teradata | TD_ |
| 8 | ODBC | ODBC_DataType_ |
7.2.3 Production database SIDs
Table 5-2 Database information

| No. | Database name | Database SID | Remarks |
|---|---|---|---|
| 1 | Asura system | ASURE | |
| 2 | New vehicle management system | VMS | |
| 3 | Short message system | SMSDB | |
7.3 Component naming
Table 5-3 Naming prefixes for common components

| No. | Component | Naming prefix | Meaning |
|---|---|---|---|
| 1 | Source Qualifier | Sq_ | Reads data from a data source |
| 2 | Expression | Exp_ | Row-level transformation |
| 3 | Filter | Fil_ | Data filtering |
| 4 | Sorter | Sort_ | Data sorting |
| 5 | Aggregator | Agg_ | Aggregation |
| 6 | Joiner | Jnr_ | Joins heterogeneous data |
| 7 | Lookup | Lkp_ | Lookup join |
| 8 | Update Strategy | Ust_ | Flags rows as insert, update, delete, or reject for the target |
| 9 | Router | Rot_ | Conditional routing |
| 10 | Sequence Generator | Sqg_ | Sequence number generator |
| 11 | Normalizer | Nrm_ | Record normalization |
| 12 | Rank | Rnk_ | Top-N records |
| 13 | Union | Uni_ | Data merging |
| 14 | Transaction Control | Tc_ | Conditional transaction control when loading data |
| 15 | Stored Procedure | Sp_ | Stored procedure component |
| 16 | Custom | Cus_ | User-defined component |
| 17 | HTTP | Http_ | Web access component |
| 18 | Java | Java_ | Custom Java code component |
7.4 Folder/Mapplet/Mapping/Session/Workflow/Schedule naming
Table 5-4 Folder/Mapplet/Mapping/Session/Workflow naming conventions

| Object | Situation | Naming rule | Example |
|---|---|---|---|
| FOLDER | Public folder | 000_Shared | 000_Shared |
| FOLDER | Ordinary folder | ProductDatabaseSID_OWNER | SFOSS_EXP5 (SFOSS is the production database SID; EXP5 is the owner of the tables to be operated on) |
| MAPPLET | | MPL_BusinessName | MPL_LRNull |
| MAPPING | Single source, single target | M_TargetTableName | M_TT_WAYBILL |
| MAPPING | Multiple sources, single target | M_TargetTableName | M_TT_WAYBILL |
| MAPPING | Single source, multiple targets | M_1ToN_FunctionDescription | M_1ToN_Broadcost |
| MAPPING | Multiple sources, multiple targets | M_NToN_FunctionDescription | M_NToN_Gather |
| SESSION | Reusable post-session | Post_S_MappingName | Post_S_M_STGOMS_ORDERS |
| SESSION | Reusable pre-session | Pre_S_MappingName | Pre_S_M_STGOMS_ORDERS |
| SESSION | Single mapping, single session | S_MappingName | S_M_STGOMS_ORDERS |
| SESSION | Single mapping, multiple sessions | S_MappingName_Region/Subsystem | S_M_STGOMS_ORDERS_BJ, S_M_STGOMS_ORDERS_GX (BJ stands for Beijing, GX for Guangxi) |
| WORKFLOW | Single mapping, single session | WF_MappingName | WF_STGOMS_ORDERS |
| WORKFLOW | Single mapping, multiple sessions | WF_MappingName | WF_STGOMS_ORDERS |
| WORKFLOW | Multiple mappings, multiple sessions | WF_FunctionDescription | WF_UpdateUsersAndGroups |
| SCHEDULE | Runs every 5 minutes, expires on 2014-05-06 | SCHDL_RunInterval_(RunTime)_(ExpiryTime) | SCHDL_5MIN_Stop20140506 |
| SCHEDULE | Runs every 5 minutes, never expires | | SCHDL_5MIN_FOREVER |
| SCHEDULE | Runs at 21:30 every day, never expires | | SCHDL_1Day_AT2130_FOREVER |
| SCHEDULE | Runs at 21:30 on the 4th of every month, never expires | | SCHDL_1MON_4THAT2130_FOREVER |
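The schedule-name rule in the last rows of the table can be sketched as a small helper (an illustration only; it does not cover the day-of-month token of the monthly example):

```python
def schedule_name(interval, run_at=None, stop=None):
    """Build SCHDL_<RunInterval>[_AT<hhmm>][_Stop<yyyymmdd> | _FOREVER],
    following the SCHEDULE rows of Table 5-4."""
    parts = ["SCHDL", interval]
    if run_at:
        parts.append("AT" + run_at)
    parts.append("Stop" + stop if stop else "FOREVER")
    return "_".join(parts)

print(schedule_name("5MIN", stop="20140506"))  # → SCHDL_5MIN_Stop20140506
print(schedule_name("5MIN"))                   # → SCHDL_5MIN_FOREVER
print(schedule_name("1Day", run_at="2130"))    # → SCHDL_1Day_AT2130_FOREVER
```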
8 Creating a Connection
Connections are created by the Informatica administrator, but developers have permission to modify Connection properties in the development and test environments.
Take creating the Oracle Connection "Ora_ASURE_SFMAP" as an example.
Log in to the Informatica server and check whether the SID "ASURE" is present in the tnsnames.ora file; if not, add it.
Log in to Workflow Manager → Connections → Relational → select Type "Oracle" → New... (button).
Figure 8-1 Creating a Connection
To modify Connection permissions: Workflow Manager → Connections → Relational → Objects: select the Connection to modify → Permissions... (button) → change the owner and grant Execute permission to the Others group.
Figure 8-2 Modifying the Connection owner
9 Creating a folder
Before creating a folder, create the user ETLMGR in the target database. The scripts are in the "ETLMGR" folder; execute them in order.
9.1 Creating a folder
Log in to Repository Manager and create a folder following the naming rules in Chapter 5.
Operation: Folder → Create → enter the folder name in the pop-up dialog box.
Do not close the dialog box; proceed to the next step and select the owner of the new folder.
Figure 9-1 Selecting the folder owner
9.2 Copying shared objects
This part of the operation is done by the developer.
Log in to Repository Manager and drag the Mappings "M_GetParam" and "M_getSessionRunStatus" from the folder "000_Shared" to the new folder; select "OK all" in the pop-up "Create Quick Link" dialog box.
Figure 9-2 Creating shared Mapping shortcuts
Log in to Workflow Manager, open the newly created folder, drag the Sessions "pre_S" and "post_S" from the folder "000_Shared" to the new folder, and select "OK" in the pop-up copy dialog box.
Figure 9-3 Copying shared Sessions
Then resolve the Mapping conflicts, reselecting the corresponding shortcut for any Mapping that cannot be found.
Figure 9-4 Resolving Mapping conflicts
When conflict processing is complete, select Next and confirm to finish this step.
Log in to Workflow Manager, open the newly created folder, and edit the source and target Connections of "post_S" and "pre_S".
Figure 9-5 Editing "post_S"
Figure 9-6 Editing "pre_S"
10 Mapping design
Note 1: before building a Mapping, insert the information for the target table into ETLMGR.ETL_INCR_PARAM in the target database; the required content can be found by querying ETLMGR.ETL_DICT.
Note 2: the value of ETLMGR.ETL_INCR_PARAM.TIME_BEFORE_NOW (the difference in seconds between the incremental end time and the current time) must not be less than 300, to avoid losing rows that have not yet been committed.
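The TIME_BEFORE_NOW rule amounts to clamping the incremental window's end time to lag the clock; a Python illustration (not part of the toolchain — the real value lives in ETLMGR.ETL_INCR_PARAM):

```python
from datetime import datetime, timedelta

def incr_end_time(now, time_before_now=300):
    """Compute the incremental end time as now - TIME_BEFORE_NOW seconds.
    The lag (>= 300 s per the note above) keeps rows that are still
    waiting to commit out of the current window, so the next run picks
    them up instead of losing them."""
    if time_before_now < 300:
        raise ValueError("TIME_BEFORE_NOW must be at least 300 seconds")
    return now - timedelta(seconds=time_before_now)

print(incr_end_time(datetime(2024, 1, 1, 12, 0, 0)))  # → 2024-01-01 11:55:00
```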
Log in to Designer; all operations described in this chapter are performed in the Designer client.
10.1 Importing the source and target table structures
Import the source table structure:
Create an ODBC connection to the source database.
EnableNcharSupport is unchecked by default; when it is left unchecked, char, varchar, and varchar2 columns are imported as nchar, nvarchar, and nvarchar2.
The ODBC connection serves only as a medium for importing the source and target table structures; no actual data passes through it. Actual data processing is done through the server Connection.
Import the source table structure over the ODBC connection.
Import the target table structure:
Importing the target table structure is similar to importing the source table structure, except that you first switch to the target table editing window.
10.2 Mapping design
During development, check the Folder shown at the top left of the client and make sure you are working in the correct Folder.
Create the Mapping:
Expand the folder "000_Sample", locate the Mapping "M_for_copy", drag it to the destination folder, and select "Yes" in the pop-up dialog box.
Rename "M_for_copy" and add comments according to the naming conventions in Chapter 5.
Operation: menu Mapping → Edit
The Comment contents are as follows:
Create date: date
Create by: user name (real name)
Desc: description, e.g. user information exchange
[
Modify date:
Modify by: user
Desc: description, e.g. expired user information is no longer exchanged
]
All Mappings carry four parameters (Parameter), created while copying the Mapping:
$$INCR_START_DT STRING(20) DEFAULT 1900-01-01 00:00:00 → incremental start time (beginning of the incremental interval)
$$INCR_END_DT STRING(20) DEFAULT 2900-01-01 00:00:00 → incremental end time (end of the incremental interval)
$$INCR_START_ID DECIMAL DEFAULT 0 → starting primary key value of the increment
$$INCR_END_ID DECIMAL DEFAULT 9999999999999999999999999999 → ending primary key value of the increment
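Assuming the standard Informatica parameter-file layout ([Folder.WF:Workflow.ST:Session] section headers), the four parameters above could be supplied to a session like this; the folder, workflow, and session names are taken from the naming examples in Table 5-4 and are otherwise hypothetical:

```python
# The four mapping parameters with their documented defaults.
params = {
    "$$INCR_START_DT": "1900-01-01 00:00:00",
    "$$INCR_END_DT":   "2900-01-01 00:00:00",
    "$$INCR_START_ID": "0",
    "$$INCR_END_ID":   "9999999999999999999999999999",
}

def render_param_file(folder, workflow, session, values):
    """Render one session section of an Informatica parameter file."""
    lines = [f"[{folder}.WF:{workflow}.ST:{session}]"]
    lines += [f"{name}={value}" for name, value in values.items()]
    return "\n".join(lines)

print(render_param_file("SFOSS_EXP5", "WF_STGOMS_ORDERS",
                        "S_M_STGOMS_ORDERS", params))
```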
For the detailed Mapping design, refer to the document "Informatica Common Scenario Design".
- When creating variables in a component, select the data type and length carefully; keep data types consistent when assigning values to variables or variables to fields, and use explicit type conversion when they differ.
- When processing characters, mind the differences between NULL, empty strings, and spaces; they are interpreted and handled differently.
- Uncheck OutputPort for ports that do not need to be output. When a component contains fields with the same name, append the suffix '_IN' to the input field, and keep output field names consistent with the input field names of the next component wherever possible so that ports can be connected automatically by name.
- Data loading methods: full load and incremental load.
Full load: Truncate & Insert.
Time-based increment (see "Informatica Common Scenario Design" for the implementation).
Primary-key-based increment (see "Informatica Common Scenario Design" for the implementation).
Use incremental loading for all Mappings wherever possible (determine the incremental interval by time first, or by primary key when there is no timestamp).
Use full extraction only for source tables that genuinely cannot provide an incremental timestamp or primary key; it is suitable only for tables with small data volumes. If a table is particularly large, renegotiate the requirement with the requester.
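A minimal sketch of the full-load "Truncate & Insert" pattern, with SQLite standing in for the real source and target databases and a hypothetical table name (DELETE approximates TRUNCATE, which SQLite lacks):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tgt_orders (id INTEGER PRIMARY KEY, amount REAL)")
con.execute("INSERT INTO tgt_orders VALUES (1, 10.0)")  # stale data from the last load

source_rows = [(1, 12.5), (2, 7.0)]                     # full read of the source
con.execute("DELETE FROM tgt_orders")                   # the "truncate" step
con.executemany("INSERT INTO tgt_orders VALUES (?, ?)", source_rows)
con.commit()

print(con.execute("SELECT COUNT(*) FROM tgt_orders").fetchone()[0])  # → 2
```

The stale row for id 1 is replaced wholesale rather than updated, which is exactly why this pattern only scales to small tables.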
10.3 Design notes for common components
Using the Source Qualifier component:
- When the source system uses nvarchar2, be sure to match it with nstring in the mapping at import time to guarantee correct character transfer; otherwise problems such as garbled or truncated characters may occur.
- If a Mapping reads two or more Source tables from the same source system, join them in the Source Qualifier by adding the join conditions there, rather than using Joiner transformations.
- Add the incremental conditions in the Source Qualifier:

Timestamp field >= TO_DATE('$$INCR_START_DT', 'YYYY-MM-DD HH24:MI:SS')
AND Timestamp field < TO_DATE('$$INCR_END_DT', 'YYYY-MM-DD HH24:MI:SS')
(note: strictly less than, not less than or equal)

or

Primary key field > $$INCR_START_ID
AND Primary key field <= $$INCR_END_ID