What are the skills of Python parallel acceleration

2025-02-24 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)05/31 Report--

This article introduces some practical techniques for parallel acceleration in Python. These situations come up often in real-world work, so read carefully and follow along; I hope you come away with something useful!

1 Preface

When we use Python for all kinds of data computation and processing tasks, the simplest and most direct way to achieve a noticeable speedup is to expand a task that runs in a single process by default into multi-process or multi-threaded execution.

For those of us doing data analysis, it is especially valuable to get that speedup in the simplest possible way, without spending too much time rewriting our programs.

Today I'll show you how to use joblib, a very easy-to-use library, to quickly achieve parallel computing acceleration.

2 Using joblib for parallel computing

joblib is a widely used third-party Python library (for example, it is used throughout the scikit-learn framework to parallelize many machine learning algorithms). We can install it with pip install joblib. Once installation is complete, let's look at the common ways joblib implements parallel computing:

2.1 Parallel acceleration using Parallel and delayed

Implementing parallel computing with joblib requires only its Parallel and delayed methods, which are very simple and convenient to use.

Let's demonstrate it directly with a small example:

joblib's approach to parallel computing is to take a group of serial computing subtasks generated by a loop and schedule them across multiple processes or threads. For a custom computing task, all we need to do is wrap it in a function, for example:

import time

def task_demo1():
    time.sleep(1)
    return time.time()

Then, as in the following form, we set the relevant parameters on Parallel() and chain it to a list comprehension that creates the subtasks, in which delayed() wraps the custom task function before the arguments the task function requires are passed in. The n_jobs parameter sets the number of workers executing parallel tasks at the same time, so in this example you can see the progress bar advance in groups of four.
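The original article showed this step as a screenshot; here is a minimal sketch of the pattern just described, reusing task_demo1 from above and assuming n_jobs=4 and 8 subtasks (both values are illustrative):

```python
import time
from joblib import Parallel, delayed

def task_demo1():
    time.sleep(1)
    return time.time()

if __name__ == "__main__":
    start = time.time()
    # delayed() wraps the task function; the comprehension generates
    # the batch of subtasks that Parallel() schedules across 4 workers
    results = Parallel(n_jobs=4)(delayed(task_demo1)() for _ in range(8))
    print(f"8 one-second tasks finished in {time.time() - start:.1f}s")
```

With 4 workers and 8 one-second tasks, the total wall time lands around 2 seconds instead of 8.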

You can see from the final time overhead that we achieve a parallel speedup:

The parameters of Parallel() can be adjusted according to the computing task and the number of CPU cores on the machine. The core parameters are:

backend: sets the parallel mode. Multi-process mode has two options, 'loky' (more stable) and 'multiprocessing', while multi-threaded mode has one option, 'threading'. The default is 'loky'.

n_jobs: sets the number of workers executing parallel tasks at the same time. In multi-process mode, n_jobs can be set at most to the machine's number of logical CPU cores, which opens all of them; setting it to -1 is a quick way to do the same. If you do not want parallel tasks to occupy all CPU resources, you can set a smaller negative number to keep some cores idle: for example, -2 uses all cores minus 1, and -3 uses all cores minus 2.

For example, in the following example on my machine with 8 logical cores, two cores are kept idle during the parallel computation:
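The original example here was also a screenshot; a sketch of the same idea (the 8-core machine is the article's setup, so n_jobs=-3 means 8 - 2 = 6 workers; on other machines the worker count will differ):

```python
import time
from joblib import Parallel, delayed

def task_demo2():
    time.sleep(1)
    return time.time()

if __name__ == "__main__":
    # n_jobs=-3 -> use (all logical cores - 2) workers,
    # i.e. 6 workers on an 8-core machine, leaving 2 cores idle
    results = Parallel(n_jobs=-3)(delayed(task_demo2)() for _ in range(12))
    print(len(results))
```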

Regarding the choice of parallel mode: because of Python's global interpreter lock (GIL), if your task is compute-intensive, the default multi-process backend is recommended. If your task is IO-intensive, such as file reads and writes or network requests, multithreading is the better choice, and n_jobs can be set very high. As a simple example, with multithreaded parallelism we completed 1000 requests in 5 seconds, much faster than the 17 seconds a single thread took for just 100 requests.
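The original benchmark used real network requests; the sketch below substitutes time.sleep as a stand-in for IO wait so it is self-contained, and the task count and n_jobs values are illustrative. With backend='threading' the GIL is released while threads wait, so the waits overlap:

```python
import time
from joblib import Parallel, delayed

def fake_request(i):
    # stand-in for an IO-bound call such as requests.get(...)
    time.sleep(0.5)
    return i

if __name__ == "__main__":
    start = time.time()
    # threading backend: cheap to start, and threads blocked on IO
    # release the GIL, so 50 half-second waits run concurrently
    results = Parallel(backend="threading", n_jobs=50)(
        delayed(fake_request)(i) for i in range(100)
    )
    print(f"{len(results)} tasks in {time.time() - start:.1f}s")
```

Serially these 100 half-second waits would take about 50 seconds; with 50 threads the total drops to roughly 1 second.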

Depending on your actual tasks, you can put joblib to good use to speed up your daily work.

That concludes these tips on Python parallel acceleration. Thank you for reading, and I hope you found something useful here!
