In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-28 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >
Share
Shulou(Shulou.com)06/03 Report--
When making batch updates to the database, using the update function performs better than the execute function.
For example, source and target are two different data sources, callrecordA is a physical table in source with 10, 000 records, and callrecordB is a physical table in target with no records. Now you need to synchronize the data in callrecordA to callrecordB.
When you use the execute function for batch updates, you can find that the performance is not satisfactory. The code is as follows:
AB1=connect ("source") / connect source2=A1.query@x ("select * from callrecordA") / fetch callrecordA3
4=connect ("target") / connection target5=A4.execute ("truncate table callrecordB") / to facilitate repeated testing, clear callrecordB6=now () / record the test time 7=A2. (A4.execute ("insert into callrecordB values", ~ .SERIALNUMBER, ~ .OUTID, ~ .INID, ~ .CALLTIME, ~ .DURATION, ~ .CHARGE)) / batch update 8=A4.close ()
9=interval@ms (A6Jing now ()) / SQL execute statement, 8251ms
The above algorithms actually execute insert statements in a loop, and do not make full use of the batch update mechanism of JDBC, so the performance is poor. And the above algorithm is only to insert data, if there are both inserts and modifications, it will be a lot of trouble to deal with.
After using the update function, you can see that the performance has improved significantly. The code is as follows:
11=connect ("target") / connection target12=A11.execute ("truncate table callrecordB") / to facilitate repeated tests, first clear callrecordB13=now () / record test time 14=A11.update@i (A2 call record number) / batch update 15=A11.close ()
16=interval@ms (A13Jing now ()) / SPL update function, 2075ms
The function update uses JDBC's executeBatch mechanism for batch updates, so the performance is better. The option @ I means that you only need to generate insert statements, and @ u means that only update statements are generated. If there is no option, it means that there are both insert and update, that is, if the target table already has data, you need to compare the primary key of the source table and the target table. If a primary key exists in the source table but not in the target table, an insert statement is generated. If a primary key exists in both the source table and the target table, a update statement is generated.
Databases usually provide synchronization / import tools, which usually have better performance and support command-line calls. SPL executes the command line, so you can call these tools. For example, to import data using oracle's sqlldr, you can use the following code:
AB1=connect ("source") / connection source2=A1.query@x ("select * from callrecordA") / take callrecordA3=file ("d:\\ temp\\ callrecordB.txt") .export (A2; "|") / generate text / csv file with delimiter | 4=system ("cmd / C sqlldr system/runqian@orcl control=d:\\ temp\\ callrecordB.ctl data=d:\\ temp\\ callrecordB.txt log=log.log bad=bad.log errors=5000 rows=1000 bindsize=10485760") / execute command line
A1-A3: export the table in source to a text file with the delimiter |. If the text file already exists, you can omit this step.
A4: execute the command line, call sqlldr, and the command format should meet the official requirements of sqlldr. Note that callrecordB.ctl is a control file required by sqlldr in the following format:
Load data
CHARACTERSET UTF8
Append
Into table callrecordB
Fields terminated by'|'
Trailing nullcols
(
SERIALNUMBER INTEGER EXTERNAL
OUTID INTEGER EXTERNAL
INID INTEGER EXTERNAL
CALLTIME date "yyyy-mm-dd hh34:mi:ss"
DURATION INTEGER EXTERNAL
CHARGE INTEGER EXTERNAL
)
SPL supports parallel computing, including parallel execution of synchronization / import tools, so you can split a single text into multiple text and import multiple files at the same time for higher performance. Different synchronization / import tools have different parallel requirements, the usual requirement is that the table is not locked and there is no unique index. For example, to execute sqlldr in parallel, you can use the following code:
ABC1=file ("d:\\ temp\\ callrecordB.txt")
/ Open single file 2x2. (file ("d:\ temp\\ callrecordB", ~, ".txt") .export (A1.cursor (; ~: 2, "|")) / split into multiple files 3fork to (2) = system ("cmd / C sqlldr system/runqian@orcl control=d:\\ temp\ callrecordB.ctl data=d:\ temp\ callrecordB", A3, ".txt direct=y parallel=true log=log.log bad=bad.log errors=5000 bindsize=10485760") / multiple files are imported in parallel
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.