In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-17 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >
Share
Shulou(Shulou.com)06/01 Report--
This article focuses on "how to deal with multiple Excel tables in Python seconds". Interested friends may wish to have a look. The method introduced in this paper is simple, fast and practical. Next let the editor to take you to learn "Python seconds to deal with multiple Excel tables" bar!
Background
Xiaoming works for an outdoor sports franchise company, which has many brands and involves many subdivided industries. Xiao Ming works as a data analyst in this company and usually does data analysis through Excel. Today, his boss gave him a task: to select the top five brands and sales of the group company in the last year before leaving work.
For the boss of Excel, isn't this a matter of minutes? Xiao Ming didn't pay attention to it, and it wasn't until his colleagues in the marketing department sent him the original data file that he realized that it wasn't that simple.
This is not an imaginary ranking of the top five. There are a total of 90 files. According to the conventional way of thinking, he either copies the contents of all the files into a table for classification and summary, or classifies and summarizes each table, and then classifies the final results.
Thinking about the workload, and then thinking about the deadline, Xiao Ming scratched his head and felt that he was going to be bald gradually.
Train of thought analysis
Writing a program to solve this kind of manual work is the easiest. At this time, Xiaoming thought of his good friend Xiao Duan, a programmer, so he threw the question to Xiao Duan.
He stranded his hair and said, "so easy, all you need is Master Pan."
Xiao Ming said, "can't you handle it?" We have to find someone else!
Xiao Duan said with a wry smile: no, Master Pan is a data processing library in Python, called Pandas, commonly known as Master Pan.
Xiao Ming said: I don't care what master or not, just say how long it will take to get it done.
Xiao Duan said: give me a few minutes to write the program, and then run for a few seconds.
Xiao Ming sent out the expression of admiration for the boss.
The little paragraph thought about it a little bit and sorted out the idea of the program:
Calculate the sales of each row in each table and use "the number of visitors to convert the number of visitors to the unit price".
Summarize the sales of each form according to the brand.
Summarize the results of all the tables into a general table
Summarize sales by brand and sort them in the general table
Coding
Step zero, read Excel:
Import pandas as pddf = pd.read_excel (". / tables/" + name)
The first step is to calculate the sales in each table:
Df ['sales'] = df ['number of visitors'] * df ['conversion'] * df ['guest unit price']
The second step is to summarize the sales volume of each table by brand:
Df_sum = df.groupby ('brand') ['sales'] .sum () .reset_index ()
The third step is to summarize the results of all the tables into a general table:
Result = pd.DataFrame () result = pd.concat ([result, df_sum])
The fourth step is to summarize and sort the sales by brand in the summary table:
Final = result.groupby ('brand') ['sales'] .sum (). Reset_index (). Sort_values ('sales', ascending=False)
Finally, let's take a look at the complete program:
Import pandas as pdimport osresult = pd.DataFrame () for name in os.listdir (". / tables"): try: df = pd.read_excel (". / tables/" + name) df ['sales'] = df ['number of visitors'] * df ['conversion'] * df ['guest unit price'] df_sum = df.groupby ('brand') ['sales'] .sum (). Reset_index () result = pd.concat ([result Df_sum]) except: print (name) passfinal = result.groupby ('brand') ['sales'] .sum (). Reset_index (). Sort_values ('sales', ascending=False) pd.set_option ('display.float_format', lambda x:% .2f'% x) print (final.head ())
The final result is as follows:
Brand sales 15 brands-5 1078060923.628 brands-17 1064495314.964 brands-13 1038560274.213 brands-12 1026115153.0013 brands-3 1006908609.07
You can see that the final top five have come out, and the whole program runs very fast.
At this point, I believe you have a deeper understanding of the "Python second processing method of multiple Excel tables". You might as well do it in practice. Here is the website, more related content can enter the relevant channels to inquire, follow us, continue to learn!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.