2025-02-25 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article explains how to use a custom ForkJoinPool to improve the execution speed of parallel streams (ParallelStream). Many people are not familiar with the technique, so the main points are summarized below.
Brief introduction
Java 8 added the Stream API, which lets you process data in a declarative style that is simple and elegant to use. A ParallelStream is a stream that executes in parallel: it runs its tasks on a ForkJoinPool to improve execution speed.
Let's look at two simple examples:
Example 1 (list)

    // Requires java.util.Arrays. The elements 1 to 6 are taken from the printed output further down.
    Arrays.asList(1, 2, 3, 4, 5, 6).parallelStream().forEach((value) -> {
        String name = Thread.currentThread().getName();
        System.out.println("example 1 Thread:" + name + " value:" + value);
    });

Example 2 (array)

    // Requires java.util.stream.Stream.
    Stream.of(1, 2, 3, 4, 5, 6).parallel().forEach((value) -> {
        String name = Thread.currentThread().getName();
        System.out.println("example 2 Thread:" + name + " value:" + value);
    });

Problems arise
The author has recently been working on crawler-related business. Its core tool, mica-http, has been open-sourced at https://gitee.com/596392912/mica/tree/master/mica-http and, after two rounds of iteration, has grown into a powerful crawler tool. Give it a try.
We have collected a large number of proxy IPs for the crawlers to use, and a scheduled task checks every 5 minutes whether each proxy has become invalid. Proxy IP detection is time-consuming: we set a timeout of 2 s for each detection request, so checking 1000 IPs in a single thread can take more than half an hour (1000 × 2 s ≈ 33 minutes if every request times out). To simplify development, the author used a parallel Stream for the verification.
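The arithmetic above can be sketched as follows. This is only a rough worst-case estimate: it assumes every check hits the 2 s timeout and that checks are spread evenly over the threads. The class and method names are hypothetical, and the thread count of 8 is an illustrative figure.

```java
import java.time.Duration;

class ProxyCheckEstimate {
    /** Worst-case wall-clock time if every check times out and checks are spread evenly over the threads. */
    static Duration worstCase(int proxies, Duration timeoutPerCheck, int threads) {
        // Ceiling division: the number of checks the busiest thread has to run.
        int checksPerThread = (proxies + threads - 1) / threads;
        return timeoutPerCheck.multipliedBy(checksPerThread);
    }

    public static void main(String[] args) {
        Duration timeout = Duration.ofSeconds(2);
        // 1000 checks * 2 s = 2000 s, which is about 33 minutes
        System.out.println("1 thread:  " + worstCase(1000, timeout, 1).toMinutes() + " min");
        // 125 checks per thread * 2 s = 250 s, which fits inside a 5-minute window
        System.out.println("8 threads: " + worstCase(1000, timeout, 8).getSeconds() + " s");
    }
}
```

Under these assumptions, a handful of threads dedicated to the check is already enough to finish inside the 5-minute schedule, provided those threads are not busy with other work.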
However, the effect was not obvious: the proxy IPs still could not all be checked within 5 minutes, so tasks accumulated. Why did the parallel stream not significantly improve the execution speed?
Let's look at the output printed by the two examples above:

    example 1 Thread:main value:4
    example 1 Thread:ForkJoinPool.commonPool-worker-2 value:1
    example 1 Thread:main value:6
    example 1 Thread:ForkJoinPool.commonPool-worker-2 value:5
    example 1 Thread:main value:3
    example 1 Thread:ForkJoinPool.commonPool-worker-1 value:2
    example 2 Thread:main value:4
    example 2 Thread:ForkJoinPool.commonPool-worker-3 value:3
    example 2 Thread:ForkJoinPool.commonPool-worker-2 value:5
    example 2 Thread:ForkJoinPool.commonPool-worker-4 value:1
    example 2 Thread:ForkJoinPool.commonPool-worker-5 value:2
    example 2 Thread:ForkJoinPool.commonPool-worker-1 value:6
We can see that a parallel Stream uses the ForkJoinPool.commonPool thread pool by default, and that this common pool is shared by the whole JVM. Parallel streams are also used heavily in other tasks, such as the one that collects proxies, so while the proxy IPs are being verified those tasks compete for the same pool threads. This explains why the tasks piled up.
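To see how small the shared pool is on a given machine, a minimal sketch like the following can help. It prints the common pool's parallelism (by default typically one less than the number of available processors, with the calling thread also taking part) and the names of the threads that a parallel stream actually ran on. The class and method names are hypothetical, and the output depends on the machine.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

class CommonPoolInfo {
    /** Collects the names of the threads that execute a parallel stream of the given size. */
    static Set<String> threadsUsed(int elements) {
        Set<String> names = ConcurrentHashMap.newKeySet(); // thread-safe set
        IntStream.range(0, elements).parallel()
                .forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        System.out.println("CPU cores: " + Runtime.getRuntime().availableProcessors());
        System.out.println("Common pool parallelism: " + ForkJoinPool.getCommonPoolParallelism());
        System.out.println("Threads used: " + threadsUsed(1000));
    }
}
```

Every parallel stream in the JVM that is started the usual way draws on this same small set of workers, so one slow, blocking task can hold up all the others.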
Solve the problem
Use a custom ForkJoinPool to improve the execution speed. The sample code is as follows:
    // Requires java.util.ArrayList, java.util.List, java.util.concurrent.ForkJoinPool
    // and java.util.stream.Collectors.
    // Example: custom thread pool
    ForkJoinPool forkJoinPool = new ForkJoinPool(8);
    // A batch of proxy IPs fetched from the database
    List<ProxyList> records = new ArrayList<>();
    // Find the failed proxy IPs (assumes getIpPort returns a String)
    List<String> needDeleteList = forkJoinPool.submit(() -> records.parallelStream()
            .map(ProxyList::getIpPort)
            .filter(IProxyListTask::isFailed)
            .collect(Collectors.toList())
    ).join();
    // Delete the failed proxies.
The code stays elegant, and the execution speed improved significantly after switching to the custom ForkJoinPool. Tasks that previously could not finish within five minutes now complete in about two minutes.
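For readers who want to try the technique without the author's project classes, here is a self-contained sketch. It is not the author's code: `CustomPoolDemo`, `findFailed`, the 20 ms sleep, and the "even port means failed" rule are all hypothetical stand-ins for a slow proxy check. Running a parallel stream inside a pool's own task is a widely used idiom, but it relies on how the stream implementation schedules its work rather than on a documented guarantee.

```java
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class CustomPoolDemo {
    /** Hypothetical stand-in for a slow proxy check: blocks briefly, then treats even ports as failed. */
    static boolean isFailed(String ipPort) {
        try {
            Thread.sleep(20); // simulates waiting on the network
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        int port = Integer.parseInt(ipPort.substring(ipPort.indexOf(':') + 1));
        return port % 2 == 0;
    }

    /** Runs the check as a parallel stream inside a dedicated pool instead of the common pool. */
    static List<String> findFailed(List<String> proxies, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            // The terminal operation is invoked from a thread of `pool`, so the stream's subtasks run there.
            return pool.submit(() -> proxies.parallelStream()
                    .filter(CustomPoolDemo::isFailed)
                    .collect(Collectors.toList())).join();
        } finally {
            pool.shutdown(); // release the pool's worker threads once the work is done
        }
    }

    public static void main(String[] args) {
        List<String> proxies = IntStream.range(8000, 8040)
                .mapToObj(port -> "10.0.0.1:" + port)
                .collect(Collectors.toList());
        // prints "20 of 40 proxies failed"
        System.out.println(findFailed(proxies, 8).size() + " of " + proxies.size() + " proxies failed");
    }
}
```

For a scheduled job that runs every few minutes, keeping one long-lived pool and reusing it may be preferable to creating and shutting down a pool on every run.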
Conclusion
Java 8 parallel streams simplify the use of multithreading when processing large amounts of data. If the work is time-consuming, or if parallel streams are used heavily elsewhere in the application, consider using a custom thread pool, sized to the business situation, to improve the processing speed.
© 2024 shulou.com SLNews company. All rights reserved.