2025-01-18 Update. From: SLTechnology News&Howtos (shulou), Development
Shulou (Shulou.com), 06/03 report
This article covers the problems you need to watch for in concurrent programming. Many people run into these questions in day-to-day work, so the material below gathers the key points into one place: safety, liveness, and performance.
Safety problems
You have probably heard statements such as "this method is not thread-safe" or "this class is not thread-safe".
So what is thread safety? In essence it means correctness, and correctness means that the program behaves the way we expect, with no surprises. The previous article, "Getting to the Bottom of Concurrency Bugs: Visibility, Atomicity, and Ordering", showed a number of strange bugs in which the program did not behave as expected.
How do you write a thread-safe program? The previous article introduced the three main sources of concurrency bugs: atomicity, visibility, and ordering. In theory, then, a thread-safe program is one that avoids atomicity, visibility, and ordering problems.
Does every piece of code need to be analyzed for these three problems? No. Only one situation needs that analysis: the data is shared and the data changes, which in practice means that multiple threads read and write the same data at the same time. If the data is not shared, or its state never changes, thread safety is guaranteed. Many technical solutions are built on this idea, such as thread-local storage (TLS) and the immutable pattern. Later articles describe in detail how these solutions are implemented in Java.
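As a rough illustration of those two ideas, here is a minimal sketch in Java. The class names `ImmutablePoint` and `PerThreadCounter` are hypothetical and chosen only for this example; this is one possible way to express "state never changes" and "data is not shared", not how any particular library does it.

```java
// Illustrative sketch; ImmutablePoint and PerThreadCounter are hypothetical names.
final class ImmutablePoint {
    private final int x;
    private final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    // "Modifying" returns a new object; the original never changes,
    // so it can be read from any number of threads without a lock.
    ImmutablePoint withX(int newX) {
        return new ImmutablePoint(newX, y);
    }
}

class PerThreadCounter {
    // Each thread gets its own long[] cell, so the counter is never shared.
    private static final ThreadLocal<long[]> LOCAL =
            ThreadLocal.withInitial(() -> new long[1]);

    static long incrementAndGet() {
        long[] cell = LOCAL.get();
        cell[0] += 1;
        return cell[0];
    }

    // Counts to `times` on a fresh thread and returns the value that thread saw.
    static long countOnNewThread(int times) {
        long[] result = new long[1];
        Thread t = new Thread(() -> {
            for (int i = 0; i < times; i++) {
                result[0] = incrementAndGet();
            }
        });
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }
}
```

Every new thread starts counting from zero because it sees only its own copy, and an `ImmutablePoint` can be handed to any thread because nothing can change it after construction.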
In real programs, however, sharing data that changes is often unavoidable, and there are many such scenarios.
When multiple threads access the same data at the same time and at least one of them writes to it, the result is a concurrency bug unless we take protective measures. The technical term for this is a data race. For example, the add10K() method from the previous article has a data race when it is called from multiple threads, as shown below.
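    public class Test {
        private long count = 0;

        void add10K() {
            int idx = 0;
            while (idx++ < 10000) {
                count += 1;
            }
        }
    }

So can we solve every concurrency problem just by adding a lock wherever the data is accessed? Clearly it is not that simple. For example, modify the example above slightly: add get() and set() methods, both marked synchronized, and have add10K() access the count variable only through get() and set(). The modified code is shown below. Every access to the shared variable count is now protected by a mutex, so there is no data race. Yet the modified add10K() method is still not thread-safe.

    public class Test {
        private long count = 0;

        synchronized long get() {
            return count;
        }

        synchronized void set(long v) {
            count = v;
        }

        void add10K() {
            int idx = 0;
            while (idx++ < 10000) {
                set(get() + 1);
            }
        }
    }

Suppose count = 0. If two threads both call get() before either of them calls set(), both calls return 0. Both threads compute get() + 1 and get 1, and both then write 1 back to memory. You expected 2, but the result is 1.
This kind of problem has an official name: a race condition. A race condition means that the result of the program depends on the order in which threads execute. In the example above, if the two threads run at exactly the same time the result is 1; if they run one after the other the result is 2. In a concurrent environment the execution order of threads is not deterministic, so a program with a race condition produces non-deterministic results, and non-deterministic results are a serious bug.
Here is another example of a race condition: the transfer operation mentioned in an earlier article. The transfer has a precondition: the amount transferred out must not exceed the account balance. In a concurrent environment without any control, several threads withdrawing from the same account at the same time can overdraw it. Suppose account A has a balance of 200, and thread 1 and thread 2 each want to transfer 150 out of account A. In the code below, both threads may reach the balance check (the if statement) at the same time, both will find that 150 is less than the balance of 200, and the account will be overdrawn.

    class Account {
        private int balance;

        /* Transfer */
        void transfer(Account target, int amt) {
            if (this.balance > amt) {
                this.balance -= amt;
                target.balance += amt;
            }
        }
    }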
You can therefore also think of a race condition this way. In a concurrent scenario, the execution of the program depends on some state variable, in a form similar to the following:

    if (the state variable satisfies the execution condition) {
        perform the action
    }

A thread finds that the state variable satisfies the condition and starts the action, but while it is performing the action another thread modifies the state variable so that the condition no longer holds. In many cases the condition is not explicit. In the add10K() example above, the compound operation set(get() + 1) depends implicitly on the result of get().
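The implicit dependency in set(get() + 1) can be made concrete by replaying the two interleavings by hand. The sketch below is an illustration only: it runs on a single thread and simply performs the calls in the order two threads might make them, and the names `LostUpdateReplay`, `replayBadInterleaving`, and `replaySerialOrder` are hypothetical.

```java
// Illustrative sketch; LostUpdateReplay and its methods are hypothetical names.
class LostUpdateReplay {
    private long count = 0;

    synchronized long get() {
        return count;
    }

    synchronized void set(long v) {
        count = v;
    }

    // Both "threads" read before either writes: one increment is lost.
    static long replayBadInterleaving() {
        LostUpdateReplay t = new LostUpdateReplay();
        long seenByThread1 = t.get(); // thread 1 reads 0
        long seenByThread2 = t.get(); // thread 2 also reads 0
        t.set(seenByThread1 + 1);     // thread 1 writes 1
        t.set(seenByThread2 + 1);     // thread 2 writes 1 again
        return t.get();
    }

    // Each "thread" finishes its read and write before the other starts.
    static long replaySerialOrder() {
        LostUpdateReplay t = new LostUpdateReplay();
        t.set(t.get() + 1); // thread 1: 0 -> 1
        t.set(t.get() + 1); // thread 2: 1 -> 2
        return t.get();
    }
}
```

`replayBadInterleaving()` returns 1 and `replaySerialOrder()` returns 2, even though every individual call to get() and set() is synchronized. The lock protects each call, not the read-then-write sequence as a whole.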
How do we keep threads safe in the face of data races and race conditions? Both kinds of problems can be solved with mutual exclusion, and there are many ways to implement it. CPUs provide instructions for mutual exclusion, and operating systems and programming languages provide related APIs. Logically, all of these can be grouped under the single concept of a lock. Earlier chapters gave an overview of how to use locks, so you should already have a clear picture and the details are not repeated here. Reviewing the earlier articles alongside this one is a good way to consolidate the material.
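As a reminder of what that looks like, here is a minimal sketch that applies a lock to both examples. The names `SafeCounter`, `SafeAccount`, and `runTwoThreads` are hypothetical, and guarding all accounts with one class-wide lock is an assumption made to keep the sketch short, not a recommendation for production code.

```java
// Illustrative sketch; SafeCounter, SafeAccount and runTwoThreads are hypothetical names.
class SafeCounter {
    private long count = 0;

    synchronized long get() {
        return count;
    }

    void add10K() {
        int idx = 0;
        while (idx++ < 10000) {
            // The read and the write happen under the same lock,
            // so no other thread can slip in between them.
            synchronized (this) {
                count += 1;
            }
        }
    }

    // Runs add10K() on two threads and returns the final count.
    static long runTwoThreads() {
        SafeCounter counter = new SafeCounter();
        Thread t1 = new Thread(counter::add10K);
        Thread t2 = new Thread(counter::add10K);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}

class SafeAccount {
    private int balance;

    SafeAccount(int balance) {
        this.balance = balance;
    }

    int getBalance() {
        synchronized (SafeAccount.class) {
            return balance;
        }
    }

    // Assumption: a single class-wide lock guards every account, so the
    // balance check and both updates form one indivisible step.
    boolean transfer(SafeAccount target, int amt) {
        synchronized (SafeAccount.class) {
            if (this.balance > amt) {
                this.balance -= amt;
                target.balance += amt;
                return true;
            }
            return false;
        }
    }
}
```

With the check and the update inside the same critical section, the second transfer of 150 from a balance of 200 is rejected instead of overdrawing the account. The class-wide lock serializes all transfers, which is simple but coarse.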
Liveness problems
A liveness problem means that an operation can never make progress. The familiar deadlock is a typical liveness problem, and besides deadlock there are two other cases: livelock and starvation. As you know from the earlier articles, once a deadlock occurs the threads wait for each other forever, which technically shows up as threads being permanently blocked.
Sometimes, though, a thread is not blocked and still cannot make progress. This is called a livelock. A real-world analogy: passer-by A is leaving through the left side of a doorway while passer-by B is entering through the right side. To avoid colliding, each politely gives way: A moves to the right and B moves to the left, and they end up facing each other again. Between people this resolves after a few rounds because people can communicate. In a program, the two sides can keep yielding to each other forever, producing a livelock: no thread is blocked, yet no thread makes progress.
The solution to livelock is simple: when yielding, wait for a random amount of time first. In the example above, when A moves to the left and finds someone in the way, A does not switch to the right immediately but waits a random time before switching. B likewise waits a random time before changing sides. Because A and B wait for different random times, the probability that they collide again is very low. The "wait for a random time" approach is simple but very effective, and it is also used in well-known distributed consensus algorithms such as Raft.
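The random-wait idea can be sketched with `ReentrantLock.tryLock()`. This is a minimal illustration under stated assumptions: `BackoffAccount`, `runOpposingTransfers`, and the 1 to 9 millisecond wait range are all hypothetical choices made for the example.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch; BackoffAccount and runOpposingTransfers are hypothetical names.
class BackoffAccount {
    private final Lock lock = new ReentrantLock();
    private int balance;

    BackoffAccount(int balance) {
        this.balance = balance;
    }

    int getBalance() {
        lock.lock();
        try {
            return balance;
        } finally {
            lock.unlock();
        }
    }

    // Tries to take both locks without blocking. On failure it releases
    // whatever it holds and sleeps a random time before retrying, so two
    // threads do not keep colliding in lockstep.
    void transfer(BackoffAccount target, int amt) throws InterruptedException {
        while (true) {
            if (this.lock.tryLock()) {
                try {
                    if (target.lock.tryLock()) {
                        try {
                            if (this.balance > amt) {
                                this.balance -= amt;
                                target.balance += amt;
                            }
                            return;
                        } finally {
                            target.lock.unlock();
                        }
                    }
                } finally {
                    this.lock.unlock();
                }
            }
            // Assumed wait range of 1 to 9 ms; tune for the real workload.
            Thread.sleep(ThreadLocalRandom.current().nextInt(1, 10));
        }
    }

    // Two threads transfer in opposite directions, the classic collision case.
    static int[] runOpposingTransfers() {
        BackoffAccount a = new BackoffAccount(1000);
        BackoffAccount b = new BackoffAccount(1000);
        Thread t1 = new Thread(() -> repeat(a, b));
        Thread t2 = new Thread(() -> repeat(b, a));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] {a.getBalance(), b.getBalance()};
    }

    private static void repeat(BackoffAccount from, BackoffAccount to) {
        try {
            for (int i = 0; i < 100; i++) {
                from.transfer(to, 1);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Without the sleep, two threads that keep grabbing one lock each and then backing off could repeat the same pattern for a long time. The random delay breaks the symmetry.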
What about starvation? Starvation is the situation in which a thread cannot continue because it cannot get the resources it needs. If thread priorities are uneven, low-priority threads get very little chance to run when the CPU is busy, and starvation may occur. A thread that holds a lock for too long can also starve the threads waiting for it.
There are three ways to address starvation: first, make sure resources are sufficient; second, allocate resources fairly; third, avoid long-running work in threads that hold locks. Options 1 and 3 apply in only a limited number of scenarios, because in many cases the scarcity of a resource cannot be fixed and the time a thread holds a lock is hard to shorten. Option 2 applies more widely.
So how do you allocate resources fairly? In concurrent programming the main tool is the fair lock. A fair lock is a first-come, first-served scheme: threads wait in a queue, and the thread at the front of the queue gets the resource first.
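In Java, one way to get this behaviour is the fairness flag on `ReentrantLock`. The sketch below is a minimal example; `FairCounter` is a hypothetical name.

```java
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch; FairCounter is a hypothetical name.
class FairCounter {
    // Passing true requests the fair ordering policy: under contention,
    // the lock favors the thread that has been waiting the longest.
    private final ReentrantLock lock = new ReentrantLock(true);
    private long count = 0;

    void increment() {
        lock.lock();
        try {
            count++;
        } finally {
            lock.unlock();
        }
    }

    long get() {
        lock.lock();
        try {
            return count;
        } finally {
            lock.unlock();
        }
    }

    boolean isFair() {
        return lock.isFair();
    }
}
```

The default constructor, `new ReentrantLock()`, creates a non-fair lock. Fairness usually costs some throughput, so it is worth enabling only where starvation is a real concern.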
Performance problems
Locks must be used carefully, but being too cautious can cause performance problems. Overusing locks can serialize large parts of the program, which throws away the advantage of multithreading, and the whole reason for writing concurrent programs with multiple threads is to improve performance.
So we need to keep serial execution to a minimum. How much does serial execution affect performance? Suppose 5% of the program is serial. How much faster can a multi-core, multi-threaded run be compared with a single-core, single-threaded run?
Amdahl's law, which describes how much speedup parallel computing can give a program, answers this question. The formula is:
$$S = \frac{1}{(1 - p) + \frac{p}{n}}$$
In the formula, n can be read as the number of CPU cores and p as the parallel percentage, so (1 - p) is the serial percentage, which is the 5% we assumed. If the number of cores n grows without bound, the speedup S approaches a limit of 1 / 0.05 = 20. In other words, if 5% of the program is serial, then no matter what technology we use, performance can improve by at most 20 times.
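The arithmetic is easy to check in code. The following sketch assumes the standard form of Amdahl's law, S = 1 / ((1 - p) + p / n); the class and method names are hypothetical.

```java
// Illustrative sketch; Amdahl, speedup and limit are hypothetical names.
class Amdahl {
    // Speedup for parallel fraction p (between 0 and 1) on n cores.
    static double speedup(double p, double n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    // Upper bound on the speedup as the number of cores grows without bound.
    static double limit(double p) {
        return 1.0 / (1.0 - p);
    }

    public static void main(String[] args) {
        double p = 0.95; // 5% of the program is serial
        for (int n : new int[] {1, 2, 8, 64, 1024}) {
            System.out.printf("n=%d speedup=%.2f%n", n, speedup(p, n));
        }
        System.out.printf("limit=%.2f%n", limit(p));
    }
}
```

With p = 0.95, eight cores give a speedup of only about 5.9, and the curve flattens toward 20 as cores are added. The serial 5% dominates long before the core count becomes large.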
So always consider the performance impact when you use locks. How can we avoid the performance problems that locks cause? The question is complicated, and a large part of the reason the Java SDK concurrency package contains so much is that each piece improves performance in one particular area.
At the level of overall approach, though, the problem can be tackled in two ways.
First, since locks can cause performance problems, the best option is to use lock-free algorithms and data structures. There are many techniques in this area, such as thread-local storage (TLS), copy-on-write, and optimistic locking. The atomic classes in the Java concurrency package are lock-free data structures, and Disruptor is a lock-free in-memory queue with very good performance.
Second, reduce the time locks are held. A mutex essentially serializes a parallel program, so to increase parallelism you must shorten the time the lock is held. There are many concrete techniques for this, such as fine-grained locks. A typical example is ConcurrentHashMap in the Java concurrency package, which (in Java 7 and earlier) uses a technique called lock striping, or segmented locking, described in detail later. You can also use read-write locks, in which reads do not exclude one another and only writes are mutually exclusive.
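Both approaches can be sketched briefly. The example below shows a lock-free counter built on `AtomicLong` (including an explicit compare-and-set retry loop, which is the optimistic pattern) and a small cache guarded by a read-write lock. The names `AtomicCounter`, `ReadMostlyCache`, and `runTwoThreads` are hypothetical, and the cache is a simplified illustration rather than a replacement for `ConcurrentHashMap`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative sketch; AtomicCounter and ReadMostlyCache are hypothetical names.
class AtomicCounter {
    private final AtomicLong count = new AtomicLong();

    // Lock-free: each increment is a single atomic read-modify-write.
    void add10K() {
        for (int i = 0; i < 10000; i++) {
            count.incrementAndGet();
        }
    }

    // The same idea written as an explicit optimistic retry loop:
    // read, compute, and write back only if nobody changed the value.
    void addOneWithCas() {
        long seen;
        do {
            seen = count.get();
        } while (!count.compareAndSet(seen, seen + 1));
    }

    long get() {
        return count.get();
    }

    static long runTwoThreads() {
        AtomicCounter counter = new AtomicCounter();
        Thread t1 = new Thread(counter::add10K);
        Thread t2 = new Thread(counter::add10K);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}

class ReadMostlyCache<K, V> {
    private final Map<K, V> map = new HashMap<>();
    private final ReadWriteLock rw = new ReentrantReadWriteLock();

    // Many readers may hold the read lock at the same time.
    V get(K key) {
        rw.readLock().lock();
        try {
            return map.get(key);
        } finally {
            rw.readLock().unlock();
        }
    }

    // A writer excludes both readers and other writers.
    void put(K key, V value) {
        rw.writeLock().lock();
        try {
            map.put(key, value);
        } finally {
            rw.writeLock().unlock();
        }
    }
}
```

The atomic counter suits a single hot variable, while the read-write lock pays off when reads greatly outnumber writes. Which one fits depends on the access pattern, so measure before committing to either.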
There are many performance metrics, and three of them are especially important: throughput, latency, and concurrency.
Throughput: the number of requests that can be processed per unit of time. The higher the throughput, the better the performance.
Latency: the time from sending a request to receiving the response. The lower the latency, the better the performance.
Concurrency: the number of requests that can be processed at the same time. In general, latency rises as concurrency rises, so a latency figure is usually quoted at a given level of concurrency. For example: at a concurrency of 1000, the latency is 50 milliseconds.
Summary
Concurrent programming is a complex technical field. At the micro level it involves atomicity, visibility, and ordering; at the macro level it shows up as safety, liveness, and performance problems.
When designing concurrent programs we mainly work from the macro view, focusing on safety, liveness, and performance. For safety, watch for data races and race conditions. For liveness, watch for deadlock, livelock, and starvation. For performance, this article introduced two approaches, but you still need to analyze each problem on its own terms and choose data structures and algorithms that fit the specific scenario.
That concludes this look at the problems to watch for in concurrent programming. Theory sticks best when paired with practice, so try the examples yourself.