The skill of updating database with aggregator 12/21 Update SLTechnology News&Howtos

The skill of updating database with aggregator

2025-12-21 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Shulou(Shulou.com)06/03 Report--

When making batch updates to the database, using the update function performs better than the execute function.

For example, source and target are two different data sources, callrecordA is a physical table in source with 10, 000 records, and callrecordB is a physical table in target with no records. Now you need to synchronize the data in callrecordA to callrecordB.

When you use the execute function for batch updates, you can find that the performance is not satisfactory. The code is as follows:

AB1=connect ("source") / connect source2=A1.query@x ("select * from callrecordA") / fetch callrecordA3

4=connect ("target") / connection target5=A4.execute ("truncate table callrecordB") / to facilitate repeated testing, clear callrecordB6=now () / record the test time 7=A2. (A4.execute ("insert into callrecordB values", ~ .SERIALNUMBER, ~ .OUTID, ~ .INID, ~ .CALLTIME, ~ .DURATION, ~ .CHARGE)) / batch update 8=A4.close ()

9=interval@ms (A6Jing now ()) / SQL execute statement, 8251ms

The above algorithms actually execute insert statements in a loop, and do not make full use of the batch update mechanism of JDBC, so the performance is poor. And the above algorithm is only to insert data, if there are both inserts and modifications, it will be a lot of trouble to deal with.

After using the update function, you can see that the performance has improved significantly. The code is as follows:

11=connect ("target") / connection target12=A11.execute ("truncate table callrecordB") / to facilitate repeated tests, first clear callrecordB13=now () / record test time 14=A11.update@i (A2 call record number) / batch update 15=A11.close ()

16=interval@ms (A13Jing now ()) / SPL update function, 2075ms

The function update uses JDBC's executeBatch mechanism for batch updates, so the performance is better. The option @ I means that you only need to generate insert statements, and @ u means that only update statements are generated. If there is no option, it means that there are both insert and update, that is, if the target table already has data, you need to compare the primary key of the source table and the target table. If a primary key exists in the source table but not in the target table, an insert statement is generated. If a primary key exists in both the source table and the target table, a update statement is generated.

Databases usually provide synchronization / import tools, which usually have better performance and support command-line calls. SPL executes the command line, so you can call these tools. For example, to import data using oracle's sqlldr, you can use the following code:

AB1=connect ("source") / connection source2=A1.query@x ("select * from callrecordA") / take callrecordA3=file ("d:\\ temp\\ callrecordB.txt") .export (A2; "|") / generate text / csv file with delimiter | 4=system ("cmd / C sqlldr system/runqian@orcl control=d:\\ temp\\ callrecordB.ctl data=d:\\ temp\\ callrecordB.txt log=log.log bad=bad.log errors=5000 rows=1000 bindsize=10485760") / execute command line

A1-A3: export the table in source to a text file with the delimiter |. If the text file already exists, you can omit this step.

A4: execute the command line, call sqlldr, and the command format should meet the official requirements of sqlldr. Note that callrecordB.ctl is a control file required by sqlldr in the following format:

Load data

CHARACTERSET UTF8

Append

Into table callrecordB

Fields terminated by'|'

Trailing nullcols

(

SERIALNUMBER INTEGER EXTERNAL

OUTID INTEGER EXTERNAL

INID INTEGER EXTERNAL

CALLTIME date "yyyy-mm-dd hh34:mi:ss"

DURATION INTEGER EXTERNAL

CHARGE INTEGER EXTERNAL

)

SPL supports parallel computing, including parallel execution of synchronization / import tools, so you can split a single text into multiple text and import multiple files at the same time for higher performance. Different synchronization / import tools have different parallel requirements, the usual requirement is that the table is not locked and there is no unique index. For example, to execute sqlldr in parallel, you can use the following code:

ABC1=file ("d:\\ temp\\ callrecordB.txt")

/ Open single file 2x2. (file ("d:\ temp\\ callrecordB", ~, ".txt") .export (A1.cursor (; ~: 2, "|")) / split into multiple files 3fork to (2) = system ("cmd / C sqlldr system/runqian@orcl control=d:\\ temp\ callrecordB.ctl data=d:\ temp\ callrecordB", A3, ".txt direct=y parallel=true log=log.log bad=bad.log errors=5000 bindsize=10485760") / multiple files are imported in parallel

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.