2025-01-18 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
I. Introduction
What is concurrency?
The essence of concurrency is switching + saving state.
While running a task, the CPU switches away to run other tasks in two cases (the switch is forced by the operating system):
1. The task is blocked.
2. The task has been computing for too long and must give up the CPU to a higher-priority program.
A coroutine, also known as a microthread, is a lightweight thread that lives in user mode. A coroutine keeps the state of its last call, so each time it is re-entered it resumes in that state; in other words, it continues from the point in the logical flow where it last left off. Coroutines are a good fit when a program contains many operations that do not need the CPU (I/O).
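To make "retains the state of the last call" concrete, here is a minimal sketch using a plain Python generator. The names running_total and acc are made up for this illustration and do not come from the article's own examples.

```python
def running_total():
    """Generator-based coroutine: `total` stays alive between calls."""
    total = 0
    while True:
        # Execution pauses here; the next send() resumes at this exact point.
        value = yield total
        total += value


acc = running_total()
next(acc)               # advance to the first yield (priming the generator)
first = acc.send(10)    # resumes after the yield with value = 10
second = acc.send(5)    # the local variable `total` survived the pause
print(first, second)
```

Each send() re-enters the function where it last stopped, with its local variables intact, which is the behaviour the paragraph above describes.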
A coroutine essentially runs inside a single thread. Previously, switching between thread tasks was controlled by the operating system, which switches automatically when a task hits I/O. The purpose of using coroutines is to reduce the overhead of operating-system switching (switching threads, creating registers and stacks, switching between them, and so on) and to control task switching inside our own program.
A process has three states (commonly described as ready, running, and blocked). Because the thread is the smallest unit of execution within a process, these are also the three states of a thread.
II. Coroutine switching
yield is a way to save the running state of a task within a single thread.
1. yield can save state. The state saved by yield is similar to the thread state kept by the operating system, but yield is controlled at the code level and is more lightweight.
2. send can pass the result of one function to another, which lets a single thread switch between tasks.
Using yield to switch tasks and save state within a thread:
Sequential execution (func1 prints first, then func2):

    import time

    def func1():
        for i in range(11):
            print('func1 %s print' % i)
            time.sleep(1)

    def func2():
        g = func1()  # func1 is an ordinary function here, so it runs to completion
        for k in range(10):
            print('func2 %s print' % k)
            time.sleep(1)

    func2()

Switching with yield:

    import time

    def func1():
        for i in range(11):
            yield
            print('func1 %s print' % i)
            time.sleep(1)

    def func2():
        g = func1()  # func1 is now a generator function; this only creates the generator
        next(g)      # run func1 up to its first yield
        for k in range(10):
            print('func2 %s print' % k)
            time.sleep(1)

    func2()

Only func2 prints; yield has saved the state of func1.

Passing data between two tasks with send:

    import time

    def consumer():
        '''Task 1: receive data, process data'''
        while True:
            x = yield  # switch point: pauses here until send() delivers a value
            print('processed data:', x)

    def producer():
        '''Task 2: produce data'''
        g = consumer()
        next(g)  # advance to the yield
        for i in range(3):
            # Pass the value to yield; consumer then runs on to its next yield.
            # The extra switching adds steps compared with plain serial
            # execution, so execution is less efficient.
            g.send(i)
            print('sent data:', i)

    start = time.time()
    # Based on the state saved by yield, the two tasks switch back and forth
    # directly, which gives the effect of concurrency.
    producer()  # only this function is called in the current thread; its send() switches to the other task
    stop = time.time()
    print(stop - start)

Result:

    processed data: 0
    sent data: 0
    processed data: 1
    sent data: 1
    processed data: 2
    sent data: 2

Simply switching between tasks like this reduces the program's performance.
Note: yield cannot detect I/O and cannot switch automatically.

    import time

    def func1():
        while True:
            print('func1')
            yield

    def func2():
        g = func1()
        for i in range(1000):
            # i + 1
            next(g)        # the only place where control moves to func1
            time.sleep(3)  # blocks the whole thread; nothing else runs meanwhile
            print('func2')

    start = time.time()
    func2()
    stop = time.time()
    print(stop - start)

time.sleep in func2 blocks the whole thread. Execution moves to func1 only when func2 explicitly calls next(g), not because the sleep was detected.
A coroutine in effect tells the CPython interpreter: I already hold the GIL, so give me one thread to run in and save yourself the time spent switching threads. Switching tasks myself is much faster than having you switch them, and it avoids a lot of overhead.
For a single-threaded program, I/O operations are inevitable. But if we can control multiple tasks within a single thread in our own program (that is, at the user-program level, not at the operating-system level), we can switch to another task and let it compute while one task is blocked on I/O. This keeps the thread in the ready state as much as possible, so that it can be run by the CPU at any time. In effect we hide our I/O operations at the user-program level as far as we can, so that the operating system sees a thread that seems to be computing all the time with little I/O, and therefore gives more CPU time to our thread.
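The idea of controlling several tasks inside one thread can be sketched with generators and a small round-robin loop. This is only an illustration: task, run_round_robin and log are names invented here, and the bare yield merely stands in for the point where a real task would wait on I/O.

```python
from collections import deque


def task(name, steps, log):
    """A task that hands control back after every step."""
    for step in range(steps):
        log.append(f"{name} step {step}")
        yield  # the point where a real task would wait for I/O


def run_round_robin(tasks):
    """Run generator tasks in turn inside one thread until all finish."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)        # run until the task's next yield
        except StopIteration:
            continue             # task finished; drop it
        ready.append(current)    # otherwise queue it for another turn


log = []
run_round_robin([task("A", 2, log), task("B", 3, log)])
print(log)
```

The operating system sees one busy thread throughout; all the switching happens in the while loop of run_round_robin.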
The material above is quoted from other articles.
III. Comparison of threads and coroutines
1. Python threads are kernel-level: they are scheduled by the operating system (for example, when a thread hits I/O or has run for too long, it is forced to give up the CPU and another thread runs).
2. Coroutines are started inside a single thread. When one hits I/O, the switch is made at the application level (not by the operating system), which improves efficiency. (Note: switching on non-I/O operations does nothing for efficiency.)
Compared with thread switching controlled by the operating system, coroutine switching is controlled by the user inside a single thread.
Advantages:
Switching between coroutines costs less. It happens at the program level and the operating system is completely unaware of it, so a single thread can achieve the effect of concurrency in a more lightweight way and make the most of the CPU.
Disadvantages:
1. Coroutines run in a single thread, so they cannot take advantage of multiple cores on their own; to use multiple cores, combine multiple processes + multiple threads + coroutines.
2. Because coroutines run in a single thread, once one of them blocks, the entire thread is blocked.
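The second disadvantage can be observed directly. The sketch below (blocking_task and run_all are made-up names) interleaves two generator tasks, but each one calls the ordinary time.sleep, so the waits add up instead of overlapping.

```python
import time


def blocking_task(delay, rounds):
    for _ in range(rounds):
        time.sleep(delay)  # blocks the whole thread, not just this task
        yield


def run_all(tasks):
    """Give each unfinished task one turn per pass until all are done."""
    tasks = list(tasks)
    while tasks:
        for t in list(tasks):
            try:
                next(t)
            except StopIteration:
                tasks.remove(t)


start = time.perf_counter()
run_all([blocking_task(0.05, 2), blocking_task(0.05, 2)])
elapsed = time.perf_counter() - start
print(f"elapsed: {elapsed:.2f}s")
```

Four sleeps of 0.05 s run back to back, so the elapsed time is about 0.2 s: switching tasks by hand gains nothing while the blocking call itself holds the thread.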
Features of coroutines:
1. They achieve concurrency within a single thread.
2. Modifying shared data does not require locks (threads do require them).
3. The user program controls the context switch.
4. In addition: when a coroutine hits an I/O operation, it switches automatically to another coroutine. (How is I/O detected? yield and greenlet cannot do it; this is what the gevent module, built on the select mechanism, is for.)
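Point 2 follows from the fact that a coroutine can only be interrupted where it chooses to give up control. A minimal sketch, with counter and worker as names invented for this example:

```python
counter = {"value": 0}


def worker(times):
    for _ in range(times):
        # Read-modify-write with no lock: no other task can run between
        # these two lines, because control is only given up at the yield.
        current = counter["value"]
        counter["value"] = current + 1
        yield


workers = [worker(1000), worker(1000)]
while workers:
    for w in list(workers):
        try:
            next(w)
        except StopIteration:
            workers.remove(w)

print(counter["value"])
```

With threads the same read-modify-write could be interrupted between the two lines, which is why threads need a lock here. If the yield were placed between the read and the write, the coroutine version would lose updates as well.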
IV. Greenlet
If we have multiple tasks within a single thread, switching between them with yield generators is cumbersome (each generator has to be created and primed once before send can be called). With the greenlet module, switching between multiple tasks directly is very easy.
    pip3 install greenlet

    from greenlet import greenlet

    def eat(name):
        print('%s eat 1' % name)
        g2.switch('Shanghai')
        print('%s eat 2' % name)
        g2.switch()

    def play(name):
        print('%s play 1' % name)
        g1.switch()
        print('%s play 2' % name)

    g1 = greenlet(eat)
    g2 = greenlet(play)
    g1.switch('beijing')  # arguments are needed only on the first switch, not afterwards
Pure switching (with no I/O and no repeated allocation of memory) slows the program down.
    # Sequential execution
    import time

    def f1():
        res = 1
        for i in range(100000000):
            res += i

    def f2():
        res = 1
        for i in range(100000000):
            res *= i

    start = time.time()
    f1()
    f2()
    stop = time.time()
    print('run time is %s' % (stop - start))  # 8.79575610160827

    # Switching
    from greenlet import greenlet
    import time

    def f1():
        res = 1
        for i in range(100000000):
            res += i
            g2.switch()

    def f2():
        res = 1
        for i in range(100000000):
            res *= i
            g1.switch()

    start = time.time()
    g1 = greenlet(f1)
    g2 = greenlet(f2)
    g1.switch()
    stop = time.time()
    print('run time is %s' % (stop - start))  # 45.937793016433716
greenlet only provides a more convenient way to switch than generators (yield). If the task it has switched to hits I/O, it blocks in place (I/O is not recognized), so greenlet still does not solve the problem of switching automatically on I/O to improve efficiency.
The code of multiple tasks in a single thread usually contains both computation and blocking operations. When task 1 blocks, we can use the blocking time to run task 2, and this is how efficiency is improved. That is what the gevent module is for.
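The idea of using task 1's blocking time to run task 2 can be sketched with the standard library alone. This is not how gevent is implemented; it is a toy scheduler under the assumption that a task announces a wait by yielding the number of seconds it would block. The names worker, run and log are invented for the example.

```python
import heapq
import time


def worker(name, delay, log):
    log.append(f"{name} start")
    yield delay                   # "I would block for `delay` seconds"
    log.append(f"{name} done")


def run(tasks):
    """Resume each task once its requested waiting time has elapsed."""
    waiting = []                  # heap of (wake_time, sequence, task)
    for seq, t in enumerate(tasks):
        heapq.heappush(waiting, (time.monotonic(), seq, t))
    seq = len(waiting)
    while waiting:
        wake_time, _, t = heapq.heappop(waiting)
        pause = wake_time - time.monotonic()
        if pause > 0:
            time.sleep(pause)     # nothing is runnable yet: idle until the earliest wake-up
        try:
            delay = next(t)       # run the task until it asks to wait again
        except StopIteration:
            continue              # task finished
        seq += 1
        heapq.heappush(waiting, (time.monotonic() + delay, seq, t))


log = []
start = time.monotonic()
run([worker("eat", 0.2, log), worker("play", 0.1, log)])
elapsed = time.monotonic() - start
print(log, round(elapsed, 2))
```

While "eat" is waiting, "play" starts and finishes, so the total time is close to the longest single wait (about 0.2 s) rather than the sum of both waits.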
V. Gevent
gevent is a third-party library that makes concurrent synchronous or asynchronous programming easy. The main pattern used in gevent is the Greenlet, a lightweight coroutine provided to Python as a C extension module. Greenlets all run inside the operating-system process of the main program, but they are scheduled cooperatively.
Installation:
    pip3 install gevent
Usage:
    g1 = gevent.spawn(func, 1, 2, 3, x=4, y=5)
    # Creates a coroutine object. The first argument is the function (for
    # example eat); it can be followed by any number of positional or keyword
    # arguments, all of which are passed on to that function. spawn submits
    # the task asynchronously.
    g2 = gevent.spawn(func2)
    g1.join()  # wait for g1 to finish
    g2.join()  # wait for g2 to finish
    # In testing you will find that g2 can run even without the second join,
    # because the coroutines switch between each other for you. But if the
    # task in g2 takes a long time and you do not join it, the rest of g2's
    # work will not be completed.
    # Or combine the two joins above into one step:
    gevent.joinall([g1, g2])
    g1.value  # gets the return value of func
    import gevent

    def eat(name):
        print('%s eat 1' % name)
        gevent.sleep(2)
        print('%s eat 2' % name)

    def play(name):
        print('%s play 1' % name)
        gevent.sleep(1)
        print('%s play 2' % name)

    g1 = gevent.spawn(eat, 'xxx')
    g2 = gevent.spawn(play, name='xxx')
    g1.join()
    g2.join()
    # or: gevent.joinall([g1, g2])
    print('over')
In the example above, gevent.sleep(2) simulates an I/O block that gevent can recognize.
time.sleep(2) and other blocking calls cannot be recognized by gevent directly; you need the following two lines to patch them.
    from gevent import monkey
    monkey.patch_all()  # must come before the modules being patched are imported, such as time and socket
    from gevent import monkey
    monkey.patch_all()  # must be written at the top, otherwise I/O may not be recognized
    import gevent
    import time

    def eat():
        # print()
        print('eat food 1')
        time.sleep(2)  # with the monkey patch, gevent recognizes the time module's sleep
        print('eat food 2')

    def play():
        print('play 1')
        # Tasks switch back and forth until one of the I/O waits ends. gevent
        # does all of this for us; it is no longer left to an operating
        # system we cannot control.
        time.sleep(1)
        print('play 2')

    g1 = gevent.spawn(eat)
    g2 = gevent.spawn(play)
    gevent.joinall([g1, g2])
    print('over')
VI. Synchronous and asynchronous
    from gevent import spawn, joinall, monkey
    monkey.patch_all()
    import time

    def task(pid):
        time.sleep(0.5)
        print('Task %s done' % pid)

    def sync():
        for i in range(10):
            task(i)

    def asyncous():
        g_list = [spawn(task, i) for i in range(10)]
        joinall(g_list)

    if __name__ == '__main__':
        print('sync')
        sync()  # compare the execution speed of the two versions
        print('async')
        asyncous()