
Performance optimization guide: general principles and methods of performance optimization


[This article is reproduced from the blog of cnblogs (博客园) author xybaby. Original link: https://www.cnblogs.com/xybaby/p/9055734.html]

For programmers, performance optimization is a routine task, whether the target is a desktop or web application, front end or back end, a single-node application or a distributed system. This article approaches the topic from three angles: the general principles of performance optimization, the levels at which optimization can be applied, and the general methods of performance optimization.

Given my personal experience, however, the discussion leans toward the perspective of Linux server-side development.

General principles

Base decisions on data, not guesses

This is the first principle of performance optimization: when we suspect a performance problem, we should locate it through testing, logging, and profiling, not by feel or luck. If a system has a performance problem, the bottleneck may be CPU, memory, or IO (disk IO, network IO). The rough direction can be located with top and the stat family of tools (vmstat, iostat, netstat, ...); for a single process, pidstat can be used for analysis.

This article mainly discusses CPU-related performance issues. According to the 80/20 rule, most of the time is spent in a small fraction of the code, and the only reliable way to find that code is profiling. Every programming language I know of has profiling tools, and becoming proficient with them is the first step of performance optimization.
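As a minimal illustration, here is a profiling sketch in Python using the standard-library cProfile module; the function names are hypothetical placeholders, not code from the original article.

```python
import cProfile
import pstats


def hot_path(n):
    # A deliberately wasteful loop standing in for a real hotspot.
    return sum(i * i for i in range(n))


def handler():
    # Hypothetical request handler whose cost we want to break down.
    return [hot_path(10_000) for _ in range(100)]


if __name__ == "__main__":
    profiler = cProfile.Profile()
    profiler.enable()
    handler()
    profiler.disable()

    # Print the 10 most expensive functions by cumulative time.
    stats = pstats.Stats(profiler).sort_stats("cumulative")
    stats.print_stats(10)
```

The output ranks functions by time spent, which is usually enough to tell whether a hotspot is a frequently called function or one that is expensive per call.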

Avoid premature optimization

The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming.

I am not entirely clear about the context in which Donald Knuth made this famous remark, but I agree with the idea. In my working environment (typical Internet application development) and programming style, what matters is fast iteration and trial and error, and premature optimization is often wasted effort. Moreover, premature optimization tends to be based on gut feeling, and the point being optimized is often not the real performance bottleneck.

Avoid excessive optimization

Performance is part of the specification of a program: a program that is unusably slow is not fit for purpose.

The goal of performance optimization is an appropriate cost-benefit ratio.

At different stages we have specific performance requirements for the system, for example the throughput it should reach. If the system already meets expectations, there is no need to spend time and energy optimizing it; an internal system used by only a few dozen people does not have to be optimized for a goal of 100,000 concurrent users.

Also, as mentioned later, some optimization methods are "detrimental" and may have side effects on the readability and maintainability of the code.

In-depth understanding of the business

Code serves the business, whether for end users or for other programmers. Without understanding the business, it is hard to understand the system's workflows and hard to spot weaknesses in the system's design. The importance of understanding the business will come up again later.

Performance optimization is a protracted war.

Once the core business direction is clear, we should start paying attention to performance, and after the project goes live we should keep running performance tests and optimizing.

Internet products today are no longer one-shot sales: after launch they need continuous development, and the influx of users brings new performance problems. It is therefore necessary to detect performance problems automatically, maintain a stable test environment, and continuously find and fix performance issues rather than passively wait for user complaints.

Select appropriate metrics, test cases, and test environment

Because performance optimization is a long-term effort, the metrics, test cases, and test environment need to be fixed so that they objectively reflect the actual performance and show the effect of each optimization.

Different systems have different core metrics. First, identify the system's core performance requirements and fix the test cases; second, keep an eye on the other metrics as well, so that improving one does not come at the expense of another.

The test environment also matters a great deal. Once we suddenly found that our QPS was much higher even though the program had not been optimized at all; after a long investigation, it turned out the test server had been swapped for a more powerful physical machine.

The levels of performance optimization

As I understand it, optimization can be divided into the requirements stage, the design stage, and the implementation stage. The earlier the stage, the more pronounced the optimization effect, and the deeper the understanding of the business and requirements it demands.

Requirements stage

To subdue the enemy without fighting is the supreme excellence.

A programmer's requirements may come from the business (or functional) requirements of a PM or UI designer, or from a team leader. When we receive a requirement, the first thing to do is to think about and discuss whether the requirement is reasonable, not to start designing and coding right away.

A requirement exists to solve a problem; the problem is the essence, and the requirement is merely a means of solving it. Programmers therefore have to judge for themselves whether a requirement really solves the problem. As mentioned in a previous article, what a product manager (especially one who knows a little technology) presents as a requirement may actually be just one solution: he believes this approach will solve his problem, so he states the solution as the requirement rather than stating the real problem.

The prerequisite for discussing requirements is an in-depth understanding of the business; without that, no meaningful discussion is possible. Even after a requirement has been implemented, when a performance problem is found we can still go back to the requirement first.

How does requirements analysis help performance optimization? First, to achieve the same goal and solve the same problem, there may be a better-performing (less costly) approach. This kind of optimization is lossless: it improves performance without changing the essence of the requirement. Second, there is lossy optimization: slightly modify the requirement and relax its conditions without noticeably affecting the user experience, which can greatly ease the performance problem. The PM takes a small step back, and the programmer takes a big step forward.

Discussing requirements also helps make the design more extensible and better able to cope with future changes in requirements, though that is not expanded on here.

Design phase

Experts spend 80% of their time thinking and 20% of their time implementing; novices write code quickly, but there is an endless stream of bug fixes.

Design is a broad concept, covering architecture design, technology selection, interface design, and so on. The architecture constrains how the system can scale, and the technology choices determine how the code is implemented. Programming languages and frameworks are tools; different systems and businesses need the appropriate tool set. If the design is not good enough, later optimization becomes difficult, and the system may even have to be torn down and rebuilt from scratch.

Implementation stage

Implementation is the process of turning functionality into code, and optimization at this level mainly targets a call chain, a function, or a piece of code. The various profiling tools also take effect mainly at this stage. Besides static code optimization, there are compile-time and run-time optimizations; the latter two demand deep expertise and leave programmers with less control.

At the code level, performance bottlenecks are usually functions that are called frequently, functions that are expensive on a single call, or both.

The following describes the optimization methods for the design phase and the implementation phase.

General methods

Caching

There is no performance problem that cannot be solved by a cache; if there is, add another layer of cache.

A cache (/kæʃ/ KASH) is a hardware or software component that stores data so that future requests for that data can be served faster; the data stored in a cache might be the result of an earlier computation, or a duplicate of data stored elsewhere.

The essence of caching is to speed up access: the cached data is either a copy of data stored elsewhere (bringing the data closer to the user) or the result of a previous computation (avoiding recomputation).

Caching trades space for time. When cache space is limited, a good replacement (eviction) policy is needed to keep the hit rate high.

Caching of data

This is the most common form of caching: keeping data closer to the user. Examples include the CPU cache and the operating system's disk cache. For a web application, the front end has the browser cache, CDNs, and the static content cache provided by a reverse proxy, while the back end has local caches and distributed caches.

Data caching is often considered at the design level.

For data caching, cache consistency needs to be considered. For scenarios in distributed systems with strong consistency requirements, feasible solutions include leases and version numbers.
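As a rough illustration of the version-number idea (my own sketch, not from the original article): a cache entry records the version it was built from and is discarded once the authoritative version moves on. All names here are hypothetical.

```python
# Minimal sketch of version-number-based cache validation (hypothetical API).
cache = {}  # key -> (version, value)


def get(key, current_version, load_from_source):
    """Return a cached value only if it matches the source's current version."""
    entry = cache.get(key)
    if entry is not None and entry[0] == current_version:
        return entry[1]  # cache hit, still consistent with the source
    value = load_from_source(key)          # cache miss or stale entry: reload
    cache[key] = (current_version, value)  # store alongside the version it was built from
    return value
```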

Caching of calculation results

For expensive calculations, you can cache the results and use them directly next time.

We know that an effective way to optimize recursive code is to cache intermediate results in a lookup table (memoization) to avoid recomputation; this is also the idea behind method caching in Python.
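A minimal memoization sketch using Python's standard functools.lru_cache, with the classic Fibonacci recursion as the stand-in example (not taken from the original article):

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # cache the result for every distinct argument seen so far
def fib(n):
    # Without the cache this recursion is exponential; with it, each n is computed once.
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


print(fib(80))  # returns instantly thanks to memoized intermediate results
```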

Objects that are repeatedly created and destroyed, and are expensive to create and destroy, such as processes and threads, can also be cached; singletons and resource pools (connection pools, thread pools) are examples of this.

When caching computation results, cache invalidation must also be considered. For a pure function, a fixed input yields a fixed output, so the cache never becomes stale. But if the computation depends on intermediate state or the environment, the cached result can become invalid, as with the Python method cache mentioned earlier.

Concurrency

If one person can't finish the work, find two people to do it. Concurrency not only increases the throughput of the system, but also reduces the average waiting time of users.

Concurrency here refers to concurrency in a broad sense, and the granularity includes multi-machine (cluster), multi-process, and multi-thread.

For stateless services (state meaning the context that must be maintained and that user requests depend on), clustering scales well and increases system throughput, for example web servers placed behind nginx.

For stateful services there are also two forms: every node serves the same data, such as MySQL read-write splitting; or each node serves only part of the data, such as sharding in MongoDB.

In distributed storage systems, both partition (sharding) and replication (backup) contribute to concurrency.

The vast majority of web servers use multiple processes or multiple threads to handle user requests in order to make full use of multi-core CPUs; where there is blocking IO, multithreading is also a good fit.
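A small sketch of using threads for IO-bound work with Python's concurrent.futures (my own illustration, not from the original article; the URLs are placeholders):

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

# Placeholder list of URLs to fetch; network IO dominates, so threads overlap the waiting.
URLS = ["https://example.com/", "https://example.com/", "https://example.com/"]


def fetch(url):
    # Each call blocks on network IO; other threads keep working in the meantime.
    with urlopen(url, timeout=5) as resp:
        return url, len(resp.read())


with ThreadPoolExecutor(max_workers=8) as pool:
    for url, size in pool.map(fetch, URLS):
        print(url, size, "bytes")
```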

Laziness

Deferring a computation until the moment it is actually needed often avoids unnecessary work, and sometimes the computation never has to happen at all; the previous article "lazy ideas in programming" gave many examples of this.
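A tiny sketch of the idea in Python (my own illustration, not from the original article): a generator defers each computation until a value is actually requested.

```python
def expensive(i):
    # Stand-in for a costly computation.
    return i * i


# A generator expression computes nothing up front ...
lazy_results = (expensive(i) for i in range(1_000_000))

# ... work is done only for the values actually consumed.
first_three = [next(lazy_results) for _ in range(3)]
print(first_three)  # [0, 1, 4]; the remaining calls never happen unless requested
```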

The idea of CopyOnWrite is really powerful.

Batching and merging

When IO is involved (network IO, disk IO), merging and batching operations can often improve throughput and performance.

The most common case is batched reads: each time we read, we read a bit more than is immediately needed, to prepare for future requests. For example, the GFS client reads extra chunk information from the GFS master; likewise, in a distributed system where a central node is responsible for global ID generation, an application can request a batch of IDs at a time.

Especially when the system has a single point, caching and batching essentially reduce interactions with that single point, which is a cheap and effective way to relieve pressure on it.

In front-end development, compressing and bundling resources follows the same idea.

For network requests, the transmission time may be far longer than the time needed to process the request, so it pays to merge network requests, for example MongoDB's bulk operations and Redis's pipeline. Files can also be written in batches to reduce IO overhead, as GFS does.
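A small sketch of request merging with a Redis pipeline, assuming the redis-py client and a locally reachable Redis server (my own illustration, not the article's code):

```python
import redis

r = redis.Redis(host="localhost", port=6379)

# Without a pipeline, each command would be a separate network round trip.
pipe = r.pipeline()
for i in range(100):
    pipe.set(f"counter:{i}", i)  # commands are buffered locally
results = pipe.execute()         # sent to the server in one batch
print(len(results), "commands executed in a single round trip")
```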

More efficient implementation

The same algorithm can have different implementations, and therefore different performance; some implementations trade time for space, others trade space for time, so you have to weigh them against your actual situation.

Programmers love to reinvent the wheel, and doing so for practice is understandable, but in real projects a mature, battle-tested wheel usually performs better than a home-made one. Of course, whether you use someone else's wheel or your own tool, when a performance problem shows up you can either optimize it or replace it.

For example, we had a scenario with a large amount of serialization and deserialization of complex nested objects. At first we used the json module that ships with Python (CPython); when we found a performance problem there was little we could optimize ourselves, so after some searching we replaced it with ujson, and performance improved considerably.
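For the common dumps/loads calls the swap is essentially a drop-in replacement, roughly like this (a sketch assuming the third-party ujson package is installed; not the article's actual code):

```python
try:
    import ujson as json  # faster C implementation with the same dumps/loads interface for common cases
except ImportError:
    import json           # fall back to the standard library

payload = {"user": {"id": 42, "tags": ["a", "b"], "profile": {"score": 3.14}}}
blob = json.dumps(payload)
assert json.loads(blob)["user"]["id"] == 42
```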

The example above is lossless, but a more efficient implementation can also be lossy. For Python, for instance, once a performance problem is found, C extensions are often considered, but they cost maintainability and flexibility and bring the risk of crashes.

Reduce the solution space

Narrowing the solution space means computing over a smaller range of data instead of traversing all of it. The most common technique is the index: through an index, data can be located quickly, and most of the time database optimization is index optimization.

When there is a local cache, adding an index over it can also greatly speed up access. However, indexes suit workloads that read far more than they write: building and maintaining an index has its own cost.
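A toy sketch of indexing a local cache in Python (my own illustration with made-up data): instead of scanning every record, a dict keyed by the lookup field narrows the search to a single bucket.

```python
# Records held in a local cache (hypothetical data).
users = [
    {"id": 1, "city": "Beijing"},
    {"id": 2, "city": "Shanghai"},
    {"id": 3, "city": "Beijing"},
]

# Build an index once: city -> list of records. Maintaining it adds work on every write.
by_city = {}
for u in users:
    by_city.setdefault(u["city"], []).append(u)

# Lookup no longer scans all records; it jumps straight to the matching bucket.
print([u["id"] for u in by_city.get("Beijing", [])])  # [1, 3]
```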

In game servers, splitting players across separate lines (instances) and AOI (area of interest, typically a grid-based algorithm) are also ways of reducing the solution space.

Performance optimization and code quality

In many cases good code is also efficient code, and nearly every language has its own "Effective XX" book. For Python, for example, pythonic code is usually efficient, such as preferring iterators to lists (in Python 2.7, dict's iteritems() instead of items()).
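The same spirit carries over to Python 3; a small illustration of the iterator-versus-list trade-off (mine, not the article's):

```python
import sys

# Materializing a list holds every element in memory at once ...
squares_list = [i * i for i in range(1_000_000)]

# ... while a generator produces values one at a time on demand.
squares_gen = (i * i for i in range(1_000_000))

print(sys.getsizeof(squares_list))  # several megabytes just for the list object
print(sys.getsizeof(squares_gen))   # a small constant-size object
print(sum(squares_gen))             # consumes the generator lazily
```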

The criteria for measuring code quality are readability, maintainability, and extensibility, but performance optimization may work against them. For example, to hide implementation details from callers we may add an interface layer (a layer of indirection); readability, maintainability, and extensibility all improve, but it adds an extra layer of function calls, and if that path is called frequently the overhead matters. As mentioned earlier, C extensions likewise reduce maintainability.

Optimizations that hurt code quality should be saved for last and accompanied by clear comments and documentation.

In pursuit of extensibility we often introduce design patterns such as the state pattern, strategy pattern, template method, and decorator pattern, but these patterns are not necessarily performance-friendly. So, for the sake of performance, we may end up writing anti-pattern, highly customized, less elegant code that is actually fragile, where a small change in requirements can have a major impact on the code's logic. Which brings us back to the earlier points: do not optimize too early, and do not over-optimize.

Summary

Finally, a mind map summarizes the above (the image is in the original post).

The copyright of this article belongs to the author xybaby (blog address: http://www.cnblogs.com/xybaby/).
