2025-03-30 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article is about how to double the processing speed of Python. The method introduced here is simple, fast, and practical, so let's walk through how coroutines can double Python's processing speed.
Process
In interviews we all recite the definition: a process is the smallest unit of system resource allocation. A running program becomes a process, and its address space is generally divided into a text area, a data area, and a stack area.
The text area stores the machine code executed by the processor. It is generally read-only, to prevent a running program from accidentally modifying itself.
The data area stores global and static variables and is subdivided into the initialized data area (explicitly initialized global, static, constant, and external variables) and the uninitialized data area, also called BSS (global and static variables that default to zero). The initial values of initialized variables are stored alongside the code in the executable and are copied into the initialized data area when the program starts.
The stack area stores the instructions and local variables of active function calls. In the address space the heap and the stack sit next to each other and grow in opposite directions: memory is linear, so the heap is placed at low addresses and grows upward, while the stack, whose size is unpredictable and which must be usable at any time, is placed at high addresses and grows downward. When the heap pointer and the stack pointer meet, memory is exhausted and an overflow occurs.
Creating and destroying a process involves system resources and is therefore a relatively expensive operation. To run, processes must compete for the CPU; on a single-core CPU only one process can execute at a time, so multiprocessing on a single core is implemented by rapidly switching the CPU between processes, which merely looks like simultaneous execution.
Because processes are isolated, each with its own memory resources, they are relatively safe compared with threads, which share memory; data can only be shared between processes through IPC (Inter-Process Communication).
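The IPC described above can be sketched with the standard-library multiprocessing module; a Queue is one of the channels through which isolated processes can exchange data (the function names here are illustrative):

```python
from multiprocessing import Process, Queue

def worker(q):
    # Runs in a child process with its own memory space; the only way to
    # hand data back to the parent is an IPC channel such as this queue.
    q.put("hello from child")

def demo():
    q = Queue()
    p = Process(target=worker, args=(q,))
    p.start()
    msg = q.get()   # receives the message via IPC
    p.join()
    return msg

if __name__ == "__main__":
    print(demo())   # hello from child
```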
Thread
A thread is the smallest unit of CPU scheduling. If a process is a container, threads are the programs running inside it: threads belong to a process, and all threads of the same process share its memory address space.
Threads can communicate directly through global variables, which makes inter-thread communication relatively unsafe; hence the various locking schemes, which are not described further here.
When a thread crashes it brings down the whole process, so its sibling threads die too. Multiple processes do not have this problem: if one process fails, the others keep running.
On a multi-core operating system, a process contains only one thread by default, so running multiple processes behaves roughly like one process per core.
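The shared memory and locking just described can be sketched with the threading module: two threads update one global counter, and the Lock prevents lost updates (counter and function names are illustrative):

```python
import threading

counter = 0
lock = threading.Lock()

def add(n):
    global counter
    for _ in range(n):
        with lock:          # without this, concurrent increments can be lost
            counter += 1

def run(n=100_000):
    global counter
    counter = 0
    threads = [threading.Thread(target=add, args=(n,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

if __name__ == "__main__":
    print(run())   # 200000
```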
Synchronous and asynchronous
Synchronous and asynchronous describe the message-passing mechanism. A synchronous call does not return until the result is available: the moment it returns, the caller has the return value. In other words, the caller actively waits for the result of the call.
An asynchronous call, by contrast, returns immediately after the request is sent, without the result; the actual result is delivered later, e.g. through a callback. A synchronous request actively reads or writes data and waits for the outcome; with an asynchronous request the caller does not get the result right away. Instead, after the call is made, the callee informs the caller through status, notification, or a callback function.
Blocking and non-blocking
Blocking and non-blocking concern the state of the program while it waits for the result of a call (a message or return value).
A blocking call suspends the current thread until the result is returned; the thread resumes only once it has the result. A non-blocking call returns immediately even when the result is not yet available, without blocking the current thread. The distinction, therefore, is whether the data the process/thread wants to access is ready, and whether the process/thread has to wait for it.
Non-blocking I/O is generally implemented with multiplexing, which can be realized with select, poll, or epoll.
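A minimal sketch of such multiplexing, using the standard-library selectors module (which picks epoll, kqueue, or select for the platform automatically): a non-blocking server socket and its accepted connection are both registered with the selector, and we only touch a socket when the selector reports it ready (the port and payload are illustrative):

```python
import selectors
import socket

def echo_once():
    sel = selectors.DefaultSelector()
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))      # pick any free port
    srv.listen()
    srv.setblocking(False)
    sel.register(srv, selectors.EVENT_READ)

    # a plain blocking client, just to generate traffic for the demo
    cli = socket.create_connection(srv.getsockname())
    cli.sendall(b"ping")

    data = b""
    while not data:
        for key, _ in sel.select(timeout=1):   # wait for readiness events
            if key.fileobj is srv:
                conn, _ = srv.accept()         # a connection is ready
                conn.setblocking(False)
                sel.register(conn, selectors.EVENT_READ)
            else:
                data = key.fileobj.recv(1024)  # the socket is readable
                sel.unregister(key.fileobj)
                key.fileobj.close()
    cli.close()
    srv.close()
    sel.close()
    return data

if __name__ == "__main__":
    print(echo_once())   # b'ping'
```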
Coroutine
With the previous concepts in place, let's look at coroutines.
A coroutine (also called a micro-thread or fiber; English name Coroutine) runs inside a thread. For example, while executing function A we may suspend it at any point to execute function B, then suspend B and switch back to A. This is what a coroutine does: the program itself switches freely between them. The switch is not equivalent to a function call, since there is no call statement; execution looks like multithreading, but all the coroutines are executed by a single thread.
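The A/B switching described above can be sketched with plain generators, the mechanism coroutines were originally built on in Python: each yield is a point where control returns to the caller, which then decides who runs next (task names and the fixed schedule are illustrative):

```python
def task(name, steps):
    for i in range(steps):
        print(f"{name} step {i}")
        yield                  # suspend here; the caller switches tasks

def run():
    a, b = task("A", 2), task("B", 2)
    order = []
    # the "scheduler": alternate between the two suspended coroutines
    for gen, name in [(a, "A"), (b, "B"), (a, "A"), (b, "B")]:
        try:
            next(gen)          # resume this coroutine until its next yield
            order.append(name)
        except StopIteration:
            pass
    return order

if __name__ == "__main__":
    print(run())   # ['A', 'B', 'A', 'B']
```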
Coroutines execute very efficiently because the program itself controls the switching: there are no thread switches, and therefore no thread-switching overhead. And since there is only one thread, there are no data races and no need for locks (acquiring and releasing locks consumes a lot of resources).
Coroutines are mainly used for IO-bound programs, where they solve efficiency problems; they are not suitable for CPU-bound work. Both kinds of workload are common in practice, and to make full use of the CPU you can combine multiprocessing with coroutines. We'll come back to that combination later.
Principles
According to the Wikipedia definition, a coroutine is a component for non-preemptive subroutine scheduling that allows subroutines to suspend and resume at specific places. So in theory, given enough memory, a thread can hold any number of coroutines, but only one coroutine runs at a time, and all of them share the computing resources allocated to the thread. The purpose of coroutines is to take full advantage of asynchronous calls, and the purpose of asynchronous operations is to keep IO from blocking the thread.
Knowledge preparation
Before digging into the principle, some background knowledge.
1) Almost all modern mainstream operating systems are time-sharing systems: the machine serves multiple users through time-slice rotation. The basic unit of system resource allocation is the process, and the basic unit of CPU scheduling is the thread.
2) Runtime memory is divided into the static/data area, the stack area, and the heap area. In the address space, the heap grows from low addresses to high, and the stack from high addresses to low.
3) The computer reads and executes one instruction at a time. While the current instruction executes, the address of the next one sits in the instruction pointer (IP) register, the ESP register points to the current top of the stack, and EBP points to the base of the current active stack frame.
4) When a function call occurs, the machine first pushes the arguments from right to left, then pushes the return address, and finally pushes the current value of EBP; it then adjusts ESP to allocate stack space for the callee's local variables.
5) A coroutine's context consists of the stack contents and register values that belong to that coroutine.
Event loop
In Python 3.3 coroutines were written with the keyword yield from; Python 3.5 introduced the async and await syntactic sugar for coroutines, and it is the async/await machinery we focus on here. At its core is the event loop; anyone who has written JavaScript will recognize the event loop (EventLoop), a programming architecture that waits for and dispatches events or messages (Wikipedia). In Python, the asyncio.coroutine decorator was used to mark a function as a coroutine for use with asyncio and its event loop, while async/await has become more and more widely used in later development.
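A minimal async/await sketch of the event loop at work: two coroutines are scheduled concurrently, and each await asyncio.sleep is a suspension point where the loop runs other work (names and delays are illustrative):

```python
import asyncio

async def job(name, delay):
    await asyncio.sleep(delay)   # suspension point: the loop runs other jobs
    return name

async def main():
    # gather schedules both coroutines concurrently on the one event loop;
    # total time is ~max(delays), not their sum
    return await asyncio.gather(job("slow", 0.02), job("fast", 0.01))

if __name__ == "__main__":
    print(asyncio.run(main()))   # ['slow', 'fast'] (gather keeps call order)
```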
Async/await
async/await is the key to using Python coroutines. Structurally, asyncio is essentially an asynchronous framework, and async/await is the API the framework provides for users' convenience; so to write coroutine code with async/await, you must build on asyncio or another asynchronous library.
Future
When writing asynchronous code in practice, in order to avoid the callback hell caused by too many layered callbacks while still obtaining the return value of an asynchronous call, the language designers introduced an object called Future, which encapsulates the interaction with the loop. The rough flow: after the program starts, a callback function is registered with epoll through the add_done_callback method; when the result attribute receives its value, the previously registered callback runs and the value is passed up to the coroutine. The Future object here is asyncio.Future.
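The add_done_callback flow just described can be sketched directly against asyncio.Future: a callback is registered on the future, the loop delivers the result "later" via call_later, the callback fires, and the coroutine awaiting the future resumes (the value 42 and the names are illustrative):

```python
import asyncio

def demo():
    seen = []

    async def main():
        loop = asyncio.get_running_loop()
        fut = loop.create_future()
        # register the callback that fires once the result is set
        fut.add_done_callback(lambda f: seen.append(f.result()))
        # simulate an IO result arriving later on the event loop
        loop.call_later(0.01, fut.set_result, 42)
        await fut            # suspend this coroutine until the future is done
        return fut.result()

    value = asyncio.run(main())
    return value, seen

if __name__ == "__main__":
    print(demo())   # (42, [42])
```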
However, to receive the return value the program must be resumed into its working state, and because a Future object's own lifetime is short, the work may already be finished after each round of callback registration, event generation, and callback triggering, so a bare Future is not suitable for sending results into the generator. A new object, Task, is therefore introduced: it holds the Future and manages the state of the generator coroutine.
Python has another Future object, concurrent.futures.Future, which is incompatible with asyncio.Future and easily confused with it. The difference is that concurrent.futures.Future is a thread-level Future, used to pass results between different threads when doing multithreaded programming with concurrent.futures.Executor.
Task
As mentioned above, Task is the task object that maintains the execution logic and state handling of the generator coroutine. Task has a _step method responsible for the state transitions in the interaction between the generator coroutine and the EventLoop. The whole process can be read as: Task sends a value into the coroutine to restore its working state; when the coroutine runs to its next breakpoint, a new Future object is obtained, and the future/loop callback-registration process starts again.
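A toy sketch of the _step idea, stripped of the event loop: a driver repeatedly send()s each result back into a generator-based coroutine until it finishes. This is illustrative only; the real asyncio.Task additionally wires each yielded Future into the loop via callbacks (the "answers" table stands in for IO results):

```python
def coro():
    x = yield "need-x"     # breakpoint: ask the scheduler for a value
    y = yield "need-y"     # second breakpoint
    return x + y

def step_all(gen):
    # stand-in for IO results that callbacks would normally deliver
    answers = {"need-x": 1, "need-y": 2}
    value = None           # first send must be None to start the generator
    try:
        while True:
            request = gen.send(value)   # resume coroutine to next breakpoint
            value = answers[request]    # the "result" fed back on resume
    except StopIteration as stop:
        return stop.value               # coroutine finished with its result

if __name__ == "__main__":
    print(step_all(coro()))   # 3
```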
Loop
A common misconception in day-to-day development is that every thread can simply have its own loop. At runtime, only the main thread can create a new loop with asyncio.get_event_loop(); calling get_event_loop() in other threads throws an error. The correct approach is to explicitly bind a loop to the current thread with asyncio.set_event_loop().
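A sketch of binding a loop in a worker thread: the thread creates its own loop with asyncio.new_event_loop(), binds it with set_event_loop(), runs a coroutine, and closes it (function names and the sleep payload are illustrative):

```python
import asyncio
import threading

def worker(out):
    # each non-main thread must create and bind its own loop
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    try:
        out.append(loop.run_until_complete(asyncio.sleep(0.01, result="ok")))
    finally:
        loop.close()

def demo():
    out = []
    t = threading.Thread(target=worker, args=(out,))
    t.start()
    t.join()
    return out

if __name__ == "__main__":
    print(demo())   # ['ok']
```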
A big defect of loop is that its running state is not controlled by Python code, so extending coroutines to run across multiple threads in business logic tends to be unstable.
In practice
Having covered the concepts and principles, let's look at how to use them, with an example from a real-world scenario.
Scenario
Files are received from an external source, each containing a set of records. Every record must be sent to a third-party platform over HTTP and the result collected.
Analysis
The records within one file have no ordering dependency between them, yet the previous implementation sent network requests serially with the Requests library: each record had to wait for the previous response, so processing a whole file took a long time. This request pattern is a natural fit for coroutines.
To make the requests easier to write as coroutines, we use the aiohttp library instead of requests. We won't analyze aiohttp in depth here, just give a brief introduction.
aiohttp
aiohttp is an asynchronous HTTP client/server for asyncio and Python. Because it is asynchronous, it is often used on the server side to receive requests, and in client crawler applications to issue asynchronous requests. Here we use it mainly to send requests.
aiohttp supports both the client and the HTTP server, can perform concurrent IO operations in a single thread, supports Server WebSockets and Client WebSockets without callback hell, and has middleware support.
Code implementation
Straight to the code. Talk is cheap, show me the code~
import aiohttp
import asyncio
from inspect import isfunction
import time
import logger   # project-specific module, as in the original code
                # (note: logging_utils below must also be importable in the project)


@logging_utils.exception(logger)
def request(pool, data_list):
    loop = asyncio.get_event_loop()
    loop.run_until_complete(exec(pool, data_list))


async def exec(pool, data_list):
    tasks = []
    # the semaphore caps how many requests are in flight at once
    sem = asyncio.Semaphore(pool)
    for item in data_list:
        tasks.append(control_sem(sem,
                                 item.get("method", "GET"),
                                 item.get("url"),
                                 item.get("data"),
                                 item.get("headers"),
                                 item.get("callback")))
    await asyncio.wait(tasks)


async def control_sem(sem, method, url, data, headers, callback):
    async with sem:
        count = 0
        flag = False
        # retry up to 4 times before giving up on this record
        while not flag and count < 4:
            flag = await fetch(method, url, data, headers, callback)
            count = count + 1
            print("flag: {}, count: {}".format(flag, count))
        if count == 4 and not flag:
            raise Exception('EAS service not responding after 4 times of retry.')


async def fetch(method, url, data, headers, callback):
    async with aiohttp.request(method, url=url, data=data, headers=headers) as resp:
        try:
            json = await resp.read()
            print(json)
            if resp.status != 200:
                return False
            if isfunction(callback):
                callback(json)   # hand the raw response to the caller's hook
            return True
        except Exception as e:
            print(e)
Here we encapsulate the batch-sending logic in a request method that receives the whole batch of data at once. Callers only need to build the network-request objects and set the request pool size. A retry feature performs up to 4 retries, so that a transient network jitter does not fail a single record's request.
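A hypothetical invocation of the request helper above might look as follows; the endpoint URL, the on_done callback, and the payload shape are all made up for illustration, and only the keys that exec() reads (method, url, data, headers, callback) matter:

```python
# Build the per-record request objects that `request(pool, data_list)` expects.
def on_done(raw):
    # illustrative callback: receives the raw response bytes
    print("third-party replied:", raw[:80])

data_list = [
    {"method": "POST",
     "url": "https://example.com/api/predict",   # placeholder endpoint
     "data": '{"row": %d}' % i,
     "headers": {"Content-Type": "application/json"},
     "callback": on_done}
    for i in range(1000)
]

# request(pool=40, data_list=data_list)  # pool=40 matches the threshold below
```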
Final effect
After rebuilding the network-request module on coroutines, processing 1000 records dropped from 816s to 424s, roughly a 2x speedup, and enlarging the request pool makes the effect even more pronounced. Because the third-party platform limits the number of simultaneous connections, we set the pool threshold to 40. As you can see, the optimization is significant.
At this point, you should have a deeper understanding of how to double the processing speed of Python. You might as well try it in practice.