2025-01-17 Update From: SLTechnology News & Howtos (shulou), Development
Shulou(Shulou.com)06/02 Report--
This article is a hands-on tutorial on optimizing a slow Python web interface. The approach is simple and practical; let's walk through it step by step.
Background
On a business platform we maintain, we found that the settings page loaded outrageously slowly.
We certainly can't keep users waiting 36 seconds, so the optimization journey began.
Probing for the bottleneck
Since this is a site responsiveness problem, Chrome is a powerful tool for quickly pointing us toward the optimization target.
Chrome's Network panel shows not only how long each API request takes but also how that time is distributed. Pick a project with little configuration and simply issue a request:
Even for a project with only three records, loading the project settings takes about 17 s. The Timing tab shows a total request time of 17.67 s, of which 17.57 s is spent in the Waiting (TTFB) state.
TTFB stands for Time To First Byte: the time until the browser receives the first byte of the server's response (back-end processing time + redirection time). It is an important indicator of server responsiveness.
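For intuition, TTFB can be roughly approximated from Python with nothing but the standard library. This is only a sketch: browser DevTools (as used in this article) or `curl -w '%{time_starttransfer}'` are the usual measurement tools, and the host and path below are placeholders.

```python
import http.client
import time

def measure_ttfb(host, path="/", port=80, timeout=10):
    """Approximate Time To First Byte: seconds from sending the
    request until the response status line and headers arrive."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        start = time.perf_counter()
        conn.request("GET", path)
        conn.getresponse()  # returns once the first response bytes arrive
        return time.perf_counter() - start
    finally:
        conn.close()
```

A large TTFB with a small download time, as seen here, points at server-side processing rather than payload size or network bandwidth.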
Profiling: flame graphs + code tuning
So the broad direction of optimization is the back-end interface handling, which is implemented in Python + Flask. Rather than guessing blindly, go straight to the profile:
The first wave of optimization: redesigning the interaction
Honestly, this flame graph is disheartening: no obvious hotspot at all, just a lot of gevent and threading frames. Are there too many coroutines or threads?
At this point it is essential to analyze alongside the code (for brevity, parameters are replaced with "..."):
```python
def get_max_cpus(project_code, gids):
    """..."""
    max_cpus = {}
    threads = []

    # inner function that fetches the max CPU for a single gid
    def get_max_cpu(project_setting, gid, token, headers):
        group_with_machines = utils.get_groups(...)
        hostnames = get_info_from_machines_info(...)
        res = fetchers.MonitorAPIFetcher.get(...)
        vals = [
            round(100 - val, 4)
            for ts, val in res['series'][0]['data']
            if not utils.is_nan(val)
        ]
        max_val = max(vals) if vals else float('nan')
        max_cpus[gid] = max_val

    # start one thread per gid to issue the requests in parallel
    for gid in gids:
        t = Thread(target=get_max_cpu, args=(...))
        threads.append(t)
        t.start()

    # wait for all threads to finish
    for t in threads:
        t.join()

    return max_cpus
```
As the code shows, to fetch the cpu_max data for all gids faster, a thread is spawned per gid to make the request, and the maximum value is collected at the end.
There are two problems here:
1. Creating and destroying threads inside a web API is costly: the endpoint is triggered frequently, so thread churn happens constantly. A thread pool should be used instead to reduce the system overhead.
2. The request loads the maximum CPU value of a machine under a gid over the past 7 days. A moment's thought shows this is a maximum, not a real-time or mean value, and in many scenarios it is less useful than it sounds.
Now that you know the problem, there is a targeted solution:
1. Adjust the interaction: instead of loading the maximum CPU by default, load it only when the user clicks (this reduces the chance of concurrency, and a slow value no longer blocks the whole page).
2. As a consequence of 1, remove the multithreaded implementation.
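For reference, the thread-pool suggestion from problem 1 could look roughly like this. This is a sketch, not the project's actual code: `fetch_max_cpu` is a hypothetical stand-in for the per-gid monitoring request.

```python
from concurrent.futures import ThreadPoolExecutor

# Created once at import time and reused across requests, so the web API
# no longer pays thread create/destroy costs on every call.
EXECUTOR = ThreadPoolExecutor(max_workers=8)

def get_max_cpus(gids, fetch_max_cpu):
    """Run fetch_max_cpu(gid) concurrently for each gid and collect results."""
    futures = {gid: EXECUTOR.submit(fetch_max_cpu, gid) for gid in gids}
    return {gid: future.result() for gid, future in futures.items()}
```

The pool's worker threads are reused between calls, which is exactly the cost the per-request `Thread` objects were paying over and over.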
Let's look at the flame graph after the first wave of optimization:
There is still plenty of room for improvement, but at least it looks reasonable now.
The second wave of optimization: MySQL operations
Zooming into the interface-logic part of the flame graph (the page tag):
A large share of the hotspots are database operations triggered by the function utils.py:get_group_profile_settings.
By the same token, code analysis is also required:
```python
def get_group_profile_settings(project_code, gids):
    # get the MySQL ORM handles
    ProfileSetting = unpurview(sandman.endpoint_class('profile_settings'))
    session = get_postman_session()

    profile_settings = {}
    for gid in gids:
        compound_name = project_code + ':' + gid
        result = session.query(ProfileSetting).filter(
            ProfileSetting.name == compound_name
        ).first()
        if result:
            result = result.as_dict()
            tag_indexes = result.get('tag_indexes')
            profile_settings[gid] = {
                'tag_indexes': tag_indexes,
                'interval': result['interval'],
                'status': result['status'],
                'profile_machines': result['profile_machines'],
                'thread_settings': result['thread_settings'],
            }
            ...  # (omitted)
    return profile_settings
```
Seeing MySQL, the first instinct is an index problem, so checking the database indexes takes priority; if an index exists, it should not be the bottleneck:
Strange: the index is already there, so why is it still this slow?
Just as I was running out of ideas, I remembered that in the first optimization wave, the more gids (groups) a project had, the worse the slowdown. Looking back at the code above, one line jumped out:
for gid in gids: ...
I seem to understand something.
Every gid triggers a separate database query, and projects often have 20-50+ groups, so of course it blew up.
In fact, MySQL supports matching multiple values of a single field in one query (the IN clause, or equivalently chained OR conditions), and each record holds little data. Batching the lookups avoids both the repeated network round trips and that wretched for loop.
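A minimal sketch of the batching idea, using sqlite3 in place of MySQL so it runs anywhere; the table and column names here are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profile_settings (name TEXT PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO profile_settings VALUES (?, ?)",
    [("proj:g1", "on"), ("proj:g2", "off"), ("proj:g3", "on")],
)

gids = ["g1", "g3"]
names = ["proj:" + gid for gid in gids]
placeholders = ",".join("?" for _ in names)

# One round trip fetches every group's settings at once,
# instead of len(gids) separate queries.
rows = conn.execute(
    f"SELECT name, status FROM profile_settings WHERE name IN ({placeholders})",
    names,
).fetchall()
```

With 20-50+ groups per project, collapsing N round trips into one is where most of the win comes from.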
Just as I was about to get to work, out of the corner of my eye I spotted another optimizable spot in the same code:
Readers familiar with Python will probably see what is going on here.
getattr is how Python resolves an object's methods and attributes. It cannot be avoided entirely, but very frequent use carries a real performance cost.
Combined with the code, let's see:
```python
def get_group_profile_settings(project_code, gids):
    # get the MySQL ORM handles
    ProfileSetting = unpurview(sandman.endpoint_class('profile_settings'))
    session = get_postman_session()

    profile_settings = {}
    for gid in gids:
        compound_name = project_code + ':' + gid
        result = session.query(ProfileSetting).filter(
            ProfileSetting.name == compound_name
        ).first()
        ...
```
Inside this heavily iterated for loop, session.query(ProfileSetting) is rebuilt on every pass, and the filter attribute is looked up and executed each time as well, so this can also be optimized.
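The effect is easy to demonstrate with plain Python. This is an illustrative sketch unrelated to the ORM itself: binding a frequently used attribute to a local name once, instead of looking it up on every iteration, produces the same result with far fewer attribute lookups.

```python
data = list(range(1000))

def with_repeated_lookup():
    out = []
    for x in data:
        out.append(x * 2)  # "append" is looked up on every iteration
    return out

def with_hoisted_lookup():
    out = []
    append = out.append    # one attribute lookup, reused 1000 times
    for x in data:
        append(x * 2)
    return out
```

The same hoisting principle applies to `session.query(...).filter` in the loop above: build the query object once, outside the loop.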
To sum up, the question is:
1. Database queries are not batched; every gid issues its own query.
2. ORM query objects are rebuilt repeatedly, wasting performance.
3. Attribute lookups are not reused, so getattr runs on every pass of a heavily iterated loop, magnifying the cost.
The targeted remedy:
```python
def get_group_profile_settings(project_code, gids):
    # get the MySQL ORM handles
    ProfileSetting = unpurview(sandman.endpoint_class('profile_settings'))
    session = get_postman_session()

    # batch query: build the query once and lift filter out of the loop
    query_instance = session.query(ProfileSetting)
    query_results = query_instance.filter(
        ProfileSetting.name.in_(
            [project_code + ':' + gid for gid in gids]
        )
    ).all()

    profile_settings = {}
    for result in query_results:
        if not result:
            continue
        result = result.as_dict()
        gid = result['name'].split(':')[1]
        tag_indexes = result.get('tag_indexes')
        profile_settings[gid] = {
            'tag_indexes': tag_indexes,
            'interval': result['interval'],
            'status': result['status'],
            'profile_machines': result['profile_machines'],
            'thread_settings': result['thread_settings'],
        }
        ...  # (omitted)
    return profile_settings
```
Optimized flame diagram:
Compare the flame diagram of the same position before optimization:
The improvement is obvious: the utils.py:get_group_profile_settings and database-related hotspots at the bottom of the pre-optimization graph have shrunk dramatically.
Optimization effect
The API response time for the same project improved from 37.6 s to 1.47 s, as the screenshot shows:
Optimization summary
As a famous saying goes:
If a data structure is good enough, it doesn't need a good algorithm.
When optimizing a feature, the fastest optimization is to remove the feature altogether!
The next fastest is to reduce its frequency or complexity!
Thinking top-down, from how users actually use the feature, often yields simpler and more effective results.
Of course, we are not always that lucky. When a feature cannot be removed or adjusted, it's time to exercise a programmer's real craft: profiling.
For Python, try: cProfile + gprof2dot
For Go, try: pprof + go-torch
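A minimal cProfile sketch using only the standard library; `work` is a toy function invented for illustration, and the gprof2dot step runs afterwards from the shell (roughly `gprof2dot -f pstats out.prof | dot -Tpng -o out.png`, assuming Graphviz is installed):

```python
import cProfile
import io
import pstats

def work(n=10000):
    # toy hotspot to profile
    return sum(i * i for i in range(n))

profiler = cProfile.Profile()
profiler.enable()
result = work()
profiler.disable()

# Print the top entries sorted by cumulative time;
# profiler.dump_stats("out.prof") would save them for gprof2dot.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
```

The cumulative column in the report is what the flame graphs in this article visualize: time spent in a function and everything it calls.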
Most of the time, the code problems you can see are not necessarily the real performance bottleneck. You need tools and objective analysis to hit the real pain point!
In fact, 1.47 s is not the best possible result; there is still room for improvement, for example:
Front-end rendering: the whole table currently waits for all data to be assembled before rendering. Slow cells could show a loading spinner by default and fill in when their data returns.
The flame graph still shows many details worth optimizing, such as replacing the external data-fetching interface and thoroughly cleaning up the getattr-related logic.
More radically, the Python service could be rewritten in Go.
But these optimizations are no longer urgent: the 1.47 s figure comes from a comparatively large project, and most projects now respond in under a second.
Further optimization would cost more and might only take us from 500 ms to 400 ms, which is not a great return.
So always be clear about the goal of the optimization and weigh the input-output ratio, delivering the most value in the time available (and if there is spare time, by all means push it to the limit).
At this point you should have a deeper understanding of Python web interface optimization. Go try it in practice!