What are the INFORMATICA development specifications?

2025-01-18 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 05/31 Report --

What are the INFORMATICA development specifications? Many inexperienced developers are unsure where to start, so this article summarizes the specifications in full. I hope it answers the question for you.

2 Responsibility for ETL development and operation and maintenance

1) When business data is transferred within the information department, the "user develops" principle applies: whoever consumes the data is responsible for developing the ETL version, and the operation and maintenance personnel of the target system handle operation and maintenance.

2) For systems managed independently by business users, which have no dedicated development team, the development team on the data source side is responsible for developing the Informatica version that pushes the data, and the source system's staff operate and maintain the corresponding workflows.

3 Operational constraints of the ETL R&D database

1) For incrementally synchronized tables, both the source and target tables must have primary keys.

2) The incremental field must be indexed.

3) Incremental fields (such as time columns) must enter the database in strict order; otherwise, it must be strictly guaranteed that no data will land inside an already-synchronized incremental range after that range has been synchronized.

Synchronization scenarios and requirements:

| Synchronization type | Primary key or unique key in source and target tables | Index on the incremental field (source and target tables) | Update synchronization | Delete synchronization |
|---|---|---|---|---|
| Increment on an identified field | A constraint or index must exist | An index must exist | An updatetime field must exist, with a constraint (index) on it, for update synchronization | Deleted data is written to a temporary table; after the temporary table is synchronized, the ETL deletes the corresponding rows in the target database, and the source side must also clean the temporary table periodically |
| Full read of the source table | No constraints or indexes | N/A | N/A | N/A |
| Full read of the source table | A constraint or index exists | N/A | N/A | N/A |
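The half-open increment window behind these rules can be sketched in Python. This is an illustrative simulation only (the row layout and the `updatetime` field name are assumptions, not Informatica API): each run reads rows whose incremental field falls in [start, end), which is why rows must enter the database in strict order.

```python
from datetime import datetime

def incremental_window_filter(rows, start, end):
    # Half-open window [start, end): the upper bound is strictly "less than",
    # so consecutive runs never read the same row twice and never skip a
    # boundary row -- provided no late row lands inside an already-read window.
    return [r for r in rows if start <= r["updatetime"] < end]

rows = [
    {"id": 1, "updatetime": datetime(2024, 5, 1, 10, 0, 0)},
    {"id": 2, "updatetime": datetime(2024, 5, 1, 10, 5, 0)},
    {"id": 3, "updatetime": datetime(2024, 5, 1, 10, 10, 0)},
]
batch = incremental_window_filter(
    rows,
    start=datetime(2024, 5, 1, 10, 0, 0),
    end=datetime(2024, 5, 1, 10, 10, 0),  # exclusive upper bound
)
# rows 1 and 2 are read now; row 3 belongs to the next window
```

A row committed late with a timestamp inside an already-synchronized window would never be picked up by any run, which is exactly the failure mode constraint 3) above guards against.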

4 Definitions

| Serial number | Term or acronym | Definition |
|---|---|---|
| 1 | ETL | Extraction-Transformation-Loading |
| 2 | Source | Data source |
| 3 | Target | Data target |
| 4 | Transformation | Transformation component (module) |
| 5 | Mapping | Data mapping |
| 6 | Mapplet | Data mapping set; a reusable combination of Transformations |
| 7 | Session | Execution task |
| 8 | Worklet | Data working set |
| 9 | Workflow | Data workflow |
| 10 | Schedule | Scheduling frequency |
| 11 | Parameter | Parameter |
| 12 | ETLUser | Database user used for ETL data synchronization |
| 13 | ProductDatabaseSID | SID of the production system database |

5 Scope

The intended readers of this document include:

- Project managers
- System administrators
- DBAs
- Developers
- Testers
- Operation and maintenance personnel

The technologies used in this project:

- ETL data integration and transformation: Informatica
- Operating systems: Linux, Windows
- Databases: Oracle, MySQL, DB2, MS SQL Server, etc.

6 General system properties

| Serial number | Service variable | Description | Value |
|---|---|---|---|
| 1 | $PMRootDir | infa_shared root directory | \server\infa_shared |
| 2 | $PMSessionLogDir | Session run log directory | $PMRootDir/SessLogs |
| 3 | $PMBadFileDir | Rejected-rows (bad file) directory | $PMRootDir/BadFiles |
| 4 | $PMCacheDir | Temporary cache files | $PMRootDir/Cache |
| 5 | $PMTargetFileDir | Target file generation directory | $PMRootDir/TgtFiles |
| 6 | $PMSourceFileDir | Flat-file source file directory | $PMRootDir/SrcFiles |
| 9 | $PMWorkflowLogDir | Workflow execution log directory | $PMRootDir/WorkflowLogs |
| 10 | $PMLookupFileDir | Lookup cache file directory | $PMRootDir/LkpFiles |
| 11 | $PMTempDir | Temporary file directory | $PMRootDir/Temp |
| 12 | $PMStorageDir | Records workflow run state under HA | $PMRootDir/Storage |

7 Naming rules

7.1 General rules

Database table names, field names, function names, function expressions, and SQL statements are all written in uppercase letters.

7.2 Data source connections

7.2.1 Connection naming

Data connections are divided into source database connections and target database connections, corresponding to the E (extraction) and L (load) ends of ETL.

Database connections can be made in two ways, Native and ODBC:

1) Native uses the corresponding database's own client to extract and load data, e.g. Oracle and DB2.

2) ODBC uses DataDirect ODBC to connect to the database, e.g. MySQL and MSSQL.

A data connection is named DataBaseType_ProductDatabaseSID_ETLUSER.

Description: DataBaseType is the data source type, ProductDatabaseSID is the SID of the production database, and ETLUser is the database user used for ETL data synchronization.

For example, Ora_ASURE_BILETL is a Native connection: Ora indicates the data type is Oracle, ASURE is the SID of the Asura production database, and BILETL is the ETL operating user.

For example, ODBC_Mysql_ASURE_BILETL is an ODBC connection: ODBC means the connection uses ODBC, Mysql is the database type, ASURE is the Asura system, and BILETL is the ETL operating user.
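The naming rule above is mechanical enough to express as a small helper. This is a hypothetical sketch (the function and prefix map are not part of any Informatica tooling); the abbreviations follow Table 5-1 below:

```python
# Map full data source types to their Table 5-1 abbreviations (no trailing
# underscore here; "_".join adds the separators).
NATIVE_PREFIXES = {
    "Oracle": "Ora",
    "DB2": "DB2",
    "Mysql": "Mysql",
    "Microsoft SQL Server": "MSSQL",
}

def connection_name(db_type, sid, etl_user, odbc=False):
    """Compose DataBaseType_ProductDatabaseSID_ETLUSER.

    ODBC connections prepend "ODBC" before the database-type abbreviation,
    as in the ODBC_Mysql_ASURE_BILETL example above.
    """
    prefix = NATIVE_PREFIXES[db_type]
    parts = (["ODBC", prefix] if odbc else [prefix]) + [sid, etl_user]
    return "_".join(parts)
```

With the document's own examples: `connection_name("Oracle", "ASURE", "BILETL")` yields `Ora_ASURE_BILETL`, and `connection_name("Mysql", "ASURE", "BILETL", odbc=True)` yields `ODBC_Mysql_ASURE_BILETL`.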

7.2.2 Database type abbreviations

Table 5-1 Abbreviations of database types

| Serial number | Data source type | Abbreviation |
|---|---|---|
| 1 | Oracle | Ora_ |
| 2 | DB2 | DB2_ |
| 3 | Mysql | Mysql_ |
| 4 | Microsoft SQL Server | MSSQL_ |
| 5 | Sybase | Sybase_ |
| 6 | Greenplum | GP_ |
| 7 | Teradata | TD_ |
| 8 | ODBC | ODBC_DataType_ |

7.2.3 SIDs of the production databases

Table 5-2 Database information

| Serial number | Chinese name of database | Database SID | Remarks |
|---|---|---|---|
| 1 | Asura system | ASURE | |
| 2 | New vehicle management system | VMS | |
| 3 | Short message system | SMSDB | |

7.3 Component naming

Table 5-3 Naming prefixes for common components (icons omitted)

| Serial number | Component name | Naming convention | Meaning |
|---|---|---|---|
| 1 | Source Qualifier | Sq_ | Reads data from a data source |
| 2 | Expression | Exp_desc | Row-level transformation |
| 3 | Filter | Fil_ | Data filtering |
| 4 | Sorter | Sort_ | Data sorting |
| 5 | Aggregator | Agg_ | Aggregation |
| 6 | Joiner | Jnr_ | Join of heterogeneous data |
| 7 | Lookup | Lkp_ | Lookup join |
| 8 | Update Strategy | Ust_ | Flags rows as insert, update, delete, or reject for the target |
| 9 | Router | Rot_ | Conditional distribution |
| 10 | Sequence Generator | Sqg_ | Sequence number generation |
| 11 | Normalizer | Nrm_ | Record normalization |
| 12 | Rank | Rnk_ | Top-N records |
| 13 | Union | Uni_ | Data merging |
| 14 | Transaction Control | Tc_ | Conditional transaction control when loading data |
| 15 | Stored Procedure | Sp_ | Stored procedure component |
| 16 | Custom | Cus_ | User-defined component |
| 17 | HTTP | Http_ | Web (HTTP) component |
| 18 | Java | Java_ | Custom Java component |
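A prefix convention like Table 5-3 is easy to check automatically, for example in a repository-export review script. The validator below is a hypothetical sketch, not part of Informatica; it encodes the table's prefixes and tests a transformation name against them:

```python
# Transformation-type -> required name prefix, per Table 5-3.
PREFIXES = {
    "Source Qualifier": "Sq_", "Expression": "Exp_", "Filter": "Fil_",
    "Sorter": "Sort_", "Aggregator": "Agg_", "Joiner": "Jnr_",
    "Lookup": "Lkp_", "Update Strategy": "Ust_", "Router": "Rot_",
    "Sequence Generator": "Sqg_", "Normalizer": "Nrm_", "Rank": "Rnk_",
    "Union": "Uni_", "Transaction Control": "Tc_", "Stored Procedure": "Sp_",
    "Custom": "Cus_", "HTTP": "Http_", "Java": "Java_",
}

def is_well_named(component_type, name):
    # A transformation is well named when it starts with its type's prefix.
    return name.startswith(PREFIXES[component_type])
```

For instance, a Lookup called `Lkp_DIM_USER` passes, while a Filter with the same name fails the check.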

7.4 Folder/Mapplet/Mapping/Session/Workflow/Schedule naming

Table 5-4 Folder/Mapplet/Mapping/Session/Workflow naming conventions

| Object | Situation | Naming convention | Example |
|---|---|---|---|
| FOLDER | Public folder | 000_Shared | |
| FOLDER | Ordinary folder | ProductDatabaseSID_OWNER | SFOSS_EXP5 (SFOSS is the SID of the production Asura database, EXP5 is the owner of the tables to operate on) |
| MAPPLET | | MPL_BusinessName | MPL_LRNull |
| MAPPING | Single source, single target | M_TargetTableName | M_TT_WAYBILL |
| MAPPING | Multiple sources, single target | M_TargetTableName | M_TT_WAYBILL |
| MAPPING | Single source, multiple targets | M_1ToN_FunctionDescription | M_1ToN__Broadcost |
| MAPPING | Multiple sources, multiple targets | M_NToN_FunctionDescription | M_NToN_Gather |
| SESSION | Reusable post-session | Post_S_MappingName | Post_S_M_STGOMS_ORDERS |
| SESSION | Reusable pre-session | Pre_S_MappingName | Pre_S_M_STGOMS_ORDERS |
| SESSION | Single mapping, single session | S_MappingName | S_M_STGOMS_ORDERS |
| SESSION | Single mapping, multiple sessions | S_MappingName_Region/Subsystem | S_M_STGOMS_ORDERS_BJ, S_M_STGOMS_ORDERS_GX (BJ stands for Beijing, GX for Guangxi) |
| WORKFLOW | Single mapping, single session | WF_MappingName | WF_STGOMS_ORDERS |
| WORKFLOW | Single mapping, multiple sessions | WF_MappingName | WF_STGOMS_ORDERS |
| WORKFLOW | Multiple mappings, multiple sessions | WF_FunctionDescription | WF_UpdateUsersAndGroups |
| SCHEDULE | Runs every 5 minutes, expires on May 6, 2014 | SCHDL_RunInterval_(RunTime)_(DueTime) | SCHDL_5MIN_Stop20140506 |
| SCHEDULE | Runs every 5 minutes, never expires | | SCHDL_5MIN_FOREVER |
| SCHEDULE | Runs at 21:30 every day, never expires | | SCHDL_1Day_AT2130_FOREVER |
| SCHEDULE | Runs at 21:30 on the 4th of every month, never expires | | SCHDL_1MON_4THAT2130__FOREVER |
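The schedule naming pattern SCHDL_RunInterval_(RunTime)_(DueTime) can be composed by a small helper. This is a hypothetical sketch covering the simple cases from the table (it does not reproduce day-of-month qualifiers such as 4TH); interval, run time, and stop date are caller-supplied strings:

```python
def schedule_name(interval, run_time=None, stop=None):
    """Compose SCHDL_<interval>[_AT<run_time>]_<expiry>.

    expiry is "Stop<YYYYMMDD>" when a stop date is given, else "FOREVER",
    matching the examples in Table 5-4.
    """
    parts = ["SCHDL", interval]
    if run_time:
        parts.append("AT" + run_time)
    parts.append("Stop" + stop if stop else "FOREVER")
    return "_".join(parts)
```

Checking against the table's examples: `schedule_name("5MIN")` gives `SCHDL_5MIN_FOREVER`, `schedule_name("5MIN", stop="20140506")` gives `SCHDL_5MIN_Stop20140506`, and `schedule_name("1Day", run_time="2130")` gives `SCHDL_1Day_AT2130_FOREVER`.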

8 Creating a Connection

Connections are created by the Informatica administrator, but developers have permission to modify Connection properties in the development and test environments.

Take the creation of the Oracle Connection "Ora_ASURE_SFMAP" as an example.

Log in to the Informatica server and check whether the corresponding SID "ASURE" has been added to the tnsnames.ora file; if not, add it to tnsnames.ora.

In Workflow Manager: Connections → Relational → Select Type = "Oracle" → New... (button)

Figure 8-1 Creating a Connection

To modify Connection permissions, in Workflow Manager: Connections → Relational → Objects: select the Connection to modify → Permissions... (button) → modify the owner, and grant execute permission to the Others group.

Figure 8-2 Modifying the Connection owner

9 Creating a folder

Before creating a folder, you need to create the user ETLMGR in the target database. The scripts are in the folder "ETLMGR"; please execute them in order.

9.1 Creating a folder

Log in to Repository Manager and create a folder following the naming rules in Chapter 5.

Action: Folder → Create → enter the folder name in the pop-up dialog box.

Do not close the dialog box; proceed to the next step.

Select the owner of the new folder.

Figure 9-1 Selecting the folder owner

9.2 Copying shared objects

This part of the operation is completed by the developer.

Log in to Repository Manager, drag the Mappings "M_GetParam" and "M_getSessionRunStatus" from the folder "000_Shared" to the new folder, and select "OK all" in the pop-up "Create Quick Link" dialog box.

Figure 9-2 Creating shortcut links to shared Mappings

Log in to Workflow Manager, open the newly created folder, drag the Sessions "pre_S" and "post_S" from the folder "000_Shared" to the new folder, and select "OK" in the pop-up copy dialog box.

Figure 9-3 Copying shared Sessions

Then handle the Mapping conflicts, reselecting the corresponding shortcuts for the Mappings that cannot be found.

Figure 9-4 Mapping conflict handling

When prompted that conflict processing is complete, select Next and confirm to finish this step.

Log in to Workflow Manager, open the newly created folder, and edit the source and target Connections of "post_S" and "pre_S".

Figure 9-5 Editing "post_S"

Figure 9-6 Editing "pre_S"

10 Mapping design

Note 1: before designing a Mapping, you need to insert the information related to the target table into ETLMGR.ETL_INCR_PARAM in the target table's database. The specific content can be obtained by querying ETLMGR.ETL_DICT.

Note 2: the value of ETLMGR.ETL_INCR_PARAM.TIME_BEFORE_NOW (the time difference in seconds between the incremental end time and the current time) must be no less than 300, to avoid losing data that has not yet been committed.
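The effect of Note 2 is that each run's incremental end time is backed off from "now" by at least TIME_BEFORE_NOW seconds, leaving a safety margin for in-flight transactions. A minimal sketch (the function name and the hard floor of 300 seconds mirror this document, not any Informatica API):

```python
from datetime import datetime, timedelta

def incr_end_time(now, time_before_now_seconds):
    """Back the incremental end time off from the current time.

    Rows whose commit lags behind their timestamp stay outside the current
    window and are picked up by the next run instead of being skipped.
    """
    if time_before_now_seconds < 300:
        raise ValueError("TIME_BEFORE_NOW should be at least 300 seconds")
    return now - timedelta(seconds=time_before_now_seconds)

# With the minimum margin, a run at 12:00:00 reads only up to 11:55:00.
end = incr_end_time(datetime(2024, 5, 1, 12, 0, 0), 300)
```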

Log in to Designer; all the operations described in this chapter are performed in the Designer client.

10.1 Importing the source and target table structures

Import the source table structure:

Create an ODBC connection to the source table.

EnableNcharSupport: it is unchecked by default; this option controls whether char, varchar, and varchar2 columns are imported as nchar, nvarchar, and nvarchar2.

The ODBC connection is used only as a medium for importing the structures of the source and target tables; it does not process actual data. Actual data processing is done through the server-side Connection.

Import the source table structure through the ODBC connection.

Import the target table structure:

Importing the target table structure is similar to importing the source table structure, but first switch to the target table editing window.

10.2 Mapping design

When developing, pay attention to the folder shown at the top left of the client, and be sure to work in the correct folder.

Create the Mapping:

Expand the folder "000_Sample", drag the Mapping "M_for_copy" to the destination folder, and select "Yes" in the pop-up dialog box.

Rename "M_for_copy" according to the naming conventions in Chapter 5 and add comments.

Action: menu Mapping → Edit

The contents of the Comment field are as follows:

Create date: date

Create by: user name (real Chinese name)

Desc: description, e.g. "user information exchange"

[

Modify date:

Modify by: user

Desc: description, e.g. "expired user information is no longer exchanged"

]

All Mappings add four parameters (Parameter):

Action: created during the process of copying the Mapping.

$$INCR_START_DT STRING(20) DEFAULT: 1900-01-01 00:00:00 — increment start time (start of the increment interval)

$$INCR_END_DT STRING(20) DEFAULT: 2900-01-01 00:00:00 — increment end time (end of the increment interval)

$$INCR_START_ID DECIMAL(19,0) DEFAULT: 0 — increment starting primary key value

$$INCR_END_ID DECIMAL(19,0) DEFAULT: 9999999999999999999999999999 — increment ending primary key value
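At run time, parameters like these are typically supplied through a parameter file. The generator below is a hypothetical sketch: the `[Folder.WF:Workflow.ST:Session]` header follows the usual PowerCenter parameter-file layout, and the folder/workflow/session names are reused from the examples in Table 5-4, not taken from a real deployment.

```python
def render_param_file(folder, workflow, session, params):
    """Render one Informatica-style parameter-file section.

    params maps bare parameter names (without the $$ sigil) to values;
    the $$ prefix is added here.
    """
    lines = [f"[{folder}.WF:{workflow}.ST:{session}]"]
    lines += [f"$${name}={value}" for name, value in params.items()]
    return "\n".join(lines)

text = render_param_file(
    "SFOSS_EXP5", "WF_STGOMS_ORDERS", "S_M_STGOMS_ORDERS",
    {"INCR_START_DT": "1900-01-01 00:00:00",
     "INCR_END_DT": "2900-01-01 00:00:00"},
)
```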

For specific Mapping design, refer to the document "Informatica Common Scenario Design".

- When creating variables in a component, pay attention to the data type and length; keep data types consistent when assigning values to variables or variables to fields, and use explicit type conversion when they are inconsistent.

- When doing character processing, pay attention to the differences among NULL, empty strings, and spaces, which are interpreted and handled differently.

- Uncheck OutputPort for ports that do not need to be output. When a component has fields of the same name, add the suffix "_IN" to the input field; keep the output field name consistent with the input field name of the next component as far as possible, so that ports can be connected automatically by name.

- Data loading methods: full and incremental.

Full: Truncate & Insert

Time-based increment (see "Informatica Common Scenario Design" for the specific implementation)

Primary-key-based increment (see "Informatica Common Scenario Design" for the specific implementation)

Use increments wherever possible for all Mapping requirements (determine the increment interval by time first; use the primary key when there is no timestamp).

Use full extraction only for source tables that really cannot provide an incremental timestamp or primary key. Full extraction is only suitable for tables with a small amount of data; if the table is particularly large, the requirement must be redefined with the requester.

10.3 Design instructions for common components

- Source Qualifier component usage:

When the source system uses nvarchar2, make sure to use nstring to match it in the mapping when importing, to ensure characters are transferred correctly; otherwise problems such as garbled text and character truncation can occur.

If a Mapping has two or more Source tables from the same source system, prefer joining them in a single Source Qualifier by adding the join conditions there, instead of using Joiner transformations.

Add the incremental conditions to the Source Qualifier:

Timestamp field >= TO_DATE($$INCR_START_DT,'YYYY-MM-DD HH24:MI:SS')
AND Timestamp field < TO_DATE($$INCR_END_DT,'YYYY-MM-DD HH24:MI:SS')
(note: strictly less than, not less than or equal)

or

Primary key field >= $$INCR_START_ID
AND Primary key field < $$INCR_END_ID
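These two filter shapes can be generated mechanically once the incremental field is known. A hypothetical sketch (string templating only; the field names passed in are illustrative, and the date format assumes the '1900-01-01 00:00:00' parameter defaults above):

```python
def incremental_filter(field, mode):
    """Build the half-open incremental filter for a Source Qualifier.

    mode "time": TO_DATE comparison on the $$INCR_*_DT parameters.
    mode "id":   numeric comparison on the $$INCR_*_ID parameters.
    The upper bound is strictly "<" in both cases.
    """
    if mode == "time":
        fmt = "'YYYY-MM-DD HH24:MI:SS'"
        return (f"{field} >= TO_DATE($$INCR_START_DT,{fmt}) "
                f"AND {field} < TO_DATE($$INCR_END_DT,{fmt})")
    return f"{field} >= $$INCR_START_ID AND {field} < $$INCR_END_ID"
```

For example, `incremental_filter("WAYBILL_ID", "id")` yields `WAYBILL_ID >= $$INCR_START_ID AND WAYBILL_ID < $$INCR_END_ID`.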
