2025-03-17 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
Text parallelism
SPL can split a text file into N segments of roughly equal size in bytes and read just one of them. For example, if cardInfo.txt stores 10 million population records, the following code divides the file into ten segments and reads the second:
     A                                  B
1    =file("d:\\temp\\cardInfo.txt")
2    =A1.import@t(;2:10)                / read directly into memory
3    =A1.cursor@t(;2:10).fetch@x()      / cursor read
The file is segmented by approximate byte volume rather than by exact row count, because this makes segmenting much faster. As a result, segments do not hold exactly equal numbers of rows. For example, if you look at the first few fields of A2 or A3 in the IDE, you can see that the segment does not contain exactly 1 million rows (the exact count depends on the data):
Index     cardNo            name             gender   province     mobile
1         308200310180525   Alison Clinton   female   Idaho        1024627490
2         709198311300191   Abby Wood        female   Kansas       1966846631
3         005199807060610   George Bush      male     California   1019879226
...
1000005   405199907050256   Mark Rowswell    male     Idaho        1168620176
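To make segmenting by volume concrete, here is a minimal Python sketch. It is not SPL's actual implementation: it assumes a segment boundary is the raw byte offset (file size × k / N) moved forward to the next line start, and the helper names segment_bounds and read_segment are hypothetical. Because the boundaries come from byte positions, segments hold similar but not identical row counts.

```python
import os
import tempfile


def segment_bounds(path, k, n):
    """Return (start, end) byte offsets of segment k of n (1-based).

    Both offsets are aligned to line starts, so no line is split and
    consecutive segments share a boundary.
    """
    size = os.path.getsize(path)

    def align(pos):
        # Move a raw byte offset forward to the first line start at or after it.
        if pos <= 0:
            return 0
        if pos >= size:
            return size
        with open(path, "rb") as f:
            f.seek(pos - 1)
            f.readline()  # read through the end of the line containing byte pos-1
            return f.tell()

    return align(size * (k - 1) // n), align(size * k // n)


def read_segment(path, k, n):
    """Read only the lines that belong to segment k of n."""
    start, end = segment_bounds(path, k, n)
    with open(path, "rb") as f:
        f.seek(start)
        data = f.read(end - start)
    return data.decode("utf-8").splitlines()


if __name__ == "__main__":
    # Demo data: 1000 lines of varying length in a temporary file.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        for i in range(1, 1001):
            tmp.write(f"{i}\tname{'x' * (i % 7)}\n")
        demo_path = tmp.name
    counts = [len(read_segment(demo_path, k, 10)) for k in range(1, 11)]
    print(counts, sum(counts))  # per-segment counts vary; the sum is the total
    os.remove(demo_path)
```

Locating a boundary costs one seek and one short read, whereas splitting by exact row count would require scanning the file to count rows first.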
Segmented reading can be combined with multithreading to improve read performance. For example, to read cardInfo.txt with two threads, have each thread count the rows in its own segment and then add the partial counts together to get the total:
5    fork to(2)   =A1.cursor@t(;A5:2).total(count(1))   / 2-thread segmentation
6    =A5.sum()    / merge results
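The two-thread count above follows a fork-and-merge pattern that can be sketched outside SPL as well. The Python version below is an illustration under assumptions, not a translation of SPL internals: the helper names line_start, count_segment and parallel_count are hypothetical, and segments are byte ranges aligned to line starts. Each worker opens its own file handle, counts the lines in its segment, and the partial counts are summed at the end.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def line_start(f, pos, size):
    """Return the offset of the first line that starts at or after pos."""
    if pos <= 0:
        return 0
    if pos >= size:
        return size
    f.seek(pos - 1)
    f.readline()
    return f.tell()


def count_segment(path, k, n):
    """Count the lines in segment k of n (1-based)."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:  # one handle per worker, nothing shared
        start = line_start(f, size * (k - 1) // n, size)
        end = line_start(f, size * k // n, size)
        f.seek(start)
        count = 0
        while f.tell() < end:
            f.readline()
            count += 1
    return count


def parallel_count(path, threads=2):
    """Fork one task per segment, then merge by summing the partial counts."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        partial = list(
            pool.map(lambda k: count_segment(path, k, threads),
                     range(1, threads + 1))
        )
    return sum(partial)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        for i in range(5000):
            tmp.write(f"row {i}\n")
        demo_path = tmp.name
    print(parallel_count(demo_path, threads=2))
    os.remove(demo_path)
```

In CPython, threads mainly overlap I/O waits rather than CPU work, so the speedup of this sketch will not match the SPL timings reported in this article.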
The fork statement suits more complex algorithms. When the algorithm is simple, cursor@m can read the file in segments directly. For example, the previous code can be rewritten as follows:
7    =A1.cursor@tm(;2).total(count(1))   / 2-thread segmentation
The code above specifies the number of threads explicitly. If the number is omitted, the "parallel limit" setting in the configuration file is used as the default number of threads. Assuming parallel limit=2, the code can be rewritten as follows:
8    =A1.cursor@tm().total(count(1))   / default thread segmentation
To verify the performance difference, the following test counts the total number of rows in cardInfo.txt first with a single thread and then with two threads. The segmented read is significantly faster:
11   =now()
12   =A1.cursor@t().total(count(1))
13   =interval@ms(A11,now())             / unsegmented, 20882 ms
14
15   =now()
16   =A1.cursor@tm(;2).total(count(1))
17   =interval@ms(A15,now())             / 2-thread segmentation, 12217 ms
JDBC parallel
When fetching data through JDBC, fetch performance is sometimes poor even though the database is not heavily loaded. In such cases, parallel fetching can improve performance.
For example, an Oracle database has a call record table, callrecord, with 1 million records. The table is indexed on callTime, and the data is distributed roughly evenly over this field. A non-parallel fetch performs poorly. The code is as follows:
     A                                         B
1    =now()                                    / record time for testing performance
2    =connect("orcl")
3    =A2.query@x("select * from callrecord")
4    =interval@ms(A1,now())                    / non-parallel fetch, 17654 ms
Switching to a 2-thread parallel fetch improves performance significantly. The code is as follows:
6    =now()
7    =connect("orcl").query@x("select min(callTime),max(callTime) from callrecordA")
8    =2.(range(A7.#1,elapse@s(A7.#2,1),~:2))   / time interval parameter list
9    fork A8   =connect("orcl")
10             =B9.query@x("select * from callrecordA where callTime>=? and callTime
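The range-partitioned fetch above can be sketched in Python for readers who want to try the pattern without an Oracle instance. Everything here is an assumption made for illustration: SQLite stands in for Oracle, the helper names (build_demo_db, split_range, fetch_interval, parallel_fetch) are hypothetical, and the upper bound is taken to be `callTime < ?` because the original statement is cut off. The sketch reads the minimum and maximum callTime, splits that span into contiguous half-open intervals, and lets each worker fetch one interval over its own connection.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timedelta

# The '<' upper bound is an assumption; the source statement is truncated.
SQL = ("SELECT id, callTime FROM callrecord "
       "WHERE callTime >= ? AND callTime < ?")


def fmt(ts):
    # Timestamps are stored and compared as 'YYYY-MM-DD HH:MM:SS' text.
    return ts.isoformat(sep=" ", timespec="seconds")


def build_demo_db(path, rows=1000):
    """Hypothetical stand-in for the callrecord table (SQLite, not Oracle)."""
    base = datetime(2024, 1, 1)
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE callrecord (id INTEGER, callTime TEXT)")
    conn.executemany(
        "INSERT INTO callrecord VALUES (?, ?)",
        [(i, fmt(base + timedelta(seconds=37 * i))) for i in range(rows)],
    )
    conn.commit()
    conn.close()


def split_range(lo, hi, n):
    """Split [lo, hi) into n contiguous half-open intervals."""
    step = (hi - lo) / n
    bounds = [lo + step * i for i in range(n)] + [hi]
    return list(zip(bounds[:-1], bounds[1:]))


def fetch_interval(db_path, start, end):
    conn = sqlite3.connect(db_path)  # one connection per worker
    try:
        return conn.execute(SQL, (fmt(start), fmt(end))).fetchall()
    finally:
        conn.close()


def parallel_fetch(db_path, n=2):
    conn = sqlite3.connect(db_path)
    lo, hi = conn.execute(
        "SELECT min(callTime), max(callTime) FROM callrecord").fetchone()
    conn.close()
    lo = datetime.fromisoformat(lo)
    # Add one second so the row with the maximum callTime falls inside
    # the last half-open interval.
    hi = datetime.fromisoformat(hi) + timedelta(seconds=1)
    with ThreadPoolExecutor(max_workers=n) as pool:
        parts = pool.map(lambda iv: fetch_interval(db_path, *iv),
                         split_range(lo, hi, n))
        return [row for part in parts for row in part]


if __name__ == "__main__":
    demo = os.path.join(tempfile.mkdtemp(), "demo.db")
    build_demo_db(demo)
    print(len(parallel_fetch(demo, n=2)))
```

Half-open intervals ensure that a row sitting exactly on a boundary is fetched by one worker only. Splitting on the time range pays off when the field is indexed and the data is spread evenly over it, as described for callTime.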