2025-01-17 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains how to use pandas to analyze Excel data in Python. The methods shown here are simple, fast, and practical.
1. Installation
Install pandas with pip:
pip3 install pandas
Import pandas:
import pandas as pd
Below, pd refers to the pandas module.
2. Read and write files
Read a file, such as an Excel or CSV file:
# df is of type pandas.core.frame.DataFrame
df = pd.read_excel('./data/2020-suv.xlsx')
# read_csv can specify the separator, encoding, etc.
df2 = pd.read_csv('./data/2020-suv.csv')
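As a minimal sketch of the read_csv options mentioned above, the example below reads semicolon-separated text from an in-memory buffer instead of a file on disk. The column names and values are hypothetical stand-ins, not the article's data. When reading from a real path, the encoding parameter can be passed in the same way as sep.

```python
import io

import pandas as pd

# Hypothetical semicolon-separated data standing in for a file on disk.
raw = "rank;series;sales\n1;Model A;300\n2;Model B;200\n"

# sep sets the field delimiter; dtype forces a column to a given type.
df_csv = pd.read_csv(io.StringIO(raw), sep=";", dtype={"series": str})

print(type(df_csv))      # the DataFrame class
print(df_csv.shape)      # (number of rows, number of columns)
print(df_csv.columns)    # column labels taken from the header line
```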
Write to a file:
df.to_excel('./data/2020-suv-new.xlsx')
df.to_csv('./data/2020-suv-new.csv')
3. Data operation
all_cols = df.columns
print(all_cols)
# Output (df.columns is an Index, not a list):
# Index(['sales ranking', 'car series', 'official price', 'subordinate brand', 'January-December sales'], dtype='object')
# df.columns is not a list, but it can be converted to one
cols = list(df.columns)
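The difference between the Index returned by df.columns and a plain list can be checked directly. This is a small sketch on a hypothetical one-row DataFrame; the column names are placeholders.

```python
import pandas as pd

# Hypothetical frame; only the column labels matter here.
df_demo = pd.DataFrame({"series": ["Model A"], "brand": ["Brand X"], "sales": [300]})

all_cols = df_demo.columns        # a pandas Index object
cols = all_cols.tolist()          # same result as list(df_demo.columns)

print(isinstance(all_cols, pd.Index))   # the Index is not a list
print(isinstance(cols, list))
print("sales" in all_cols)              # membership tests work on the Index too
```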
Get column data
col_data = df['car series']
mul_col_data = df[['car series', 'January-December sales']]
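One detail worth checking when selecting columns is the return type: a single label gives a Series, while a list of labels gives a DataFrame. The sketch below uses hypothetical data to show this.

```python
import pandas as pd

# Hypothetical data with three columns.
df_demo = pd.DataFrame({
    "series": ["Model A", "Model B"],
    "brand": ["Brand X", "Brand Y"],
    "sales": [300, 200],
})

one_col = df_demo["series"]              # single label -> Series
two_cols = df_demo[["series", "sales"]]  # list of labels -> DataFrame

print(type(one_col).__name__)
print(type(two_cols).__name__)
```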
Get row data
row_data = df.iloc[row_index]
Get all row data
all_data = df.values
Slice to obtain multiple rows of data
mul_row_data = df.iloc[2:4]
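The row lookups above can be tried on a small hypothetical frame. Note that iloc works by position, and that a slice such as 2:4 excludes the end position, as with Python lists.

```python
import pandas as pd

# Hypothetical five-row frame.
df_demo = pd.DataFrame({
    "series": ["A", "B", "C", "D", "E"],
    "sales": [500, 400, 300, 200, 100],
})

row = df_demo.iloc[1]         # one row as a Series, indexed by column name
rows = df_demo.iloc[2:4]      # positions 2 and 3 only; position 4 is excluded
all_values = df_demo.values   # all rows as a 2-D NumPy array

print(row["series"])
print(rows)
print(all_values.shape)
```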
Get cell data
cell_data = df.iloc[row_index, col_index]
4. Data screening
Filtering is one of the most practical Excel operations, and pandas can do it too. Once the filtering code is saved, you can reuse it directly next time.
A field contains the specified value
# contains a value; na sets the result for missing values, case sets case sensitivity; contains also supports regular expressions
sub_df = df[df[col_name].str.contains('key1', na=False, case=False)]
# contains several values at once (and): chain the calls
sub_df1 = df[df[col_name].str.contains('key1', na=False, case=False)]
sub_df2 = sub_df1[sub_df1[col_name].str.contains('key2', na=False, case=False)]
# contains any of several values (or)
sub_df = df[df[col_name].str.contains('key1|key2|key3', na=False, case=False)]
# does not contain, i.e. filter the matches out
sub_df = df[~df[col_name].str.contains('key1', na=False, case=False)]
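The filters above can be sketched on a few hypothetical values, including a missing one. As an alternative to chaining two filters for the "and" case, the example combines two boolean masks with &; this is a design choice for illustration, not the article's method.

```python
import pandas as pd

# Hypothetical column with one missing value.
df_demo = pd.DataFrame(
    {"series": ["Honda CR-V", "Honda XR-V", "Changan CS75 PLUS", None]}
)
s = df_demo["series"]

# "or": the pattern is a regular expression, so | separates alternatives.
mask_any = s.str.contains("honda|changan", na=False, case=False)

# "and": combine two masks instead of filtering twice.
mask_honda = s.str.contains("honda", na=False, case=False)
mask_xr = s.str.contains("xr", na=False, case=False)
both = df_demo[mask_honda & mask_xr]

# "not": invert the mask; the missing value counts as a non-match (na=False).
no_honda = df_demo[~mask_honda]

print(df_demo[mask_any])
print(both)
print(no_honda)
```

Passing regex=False to str.contains treats the pattern as literal text, which helps when the keyword itself contains characters such as | or +.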
All of the operations above assume that the field is a string type; otherwise an exception is raised. You can check whether a field is a string type as follows:
pd.api.types.is_string_dtype(df['car series'])
# other types have similar check functions; use dir to list them
print(dir(pd.api.types))
# you can also view the field types through dtypes
df.dtypes
df['January-December sales'].dtypes
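One way to use this check is to guard a text filter so it does not fail on a numeric column. The helper name contains_text below is hypothetical, and converting non-string columns with astype(str) is an assumption made for this sketch; other handling (such as skipping the column) may suit your data better.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype, is_string_dtype

# Hypothetical frame with one text column and one numeric column.
df_demo = pd.DataFrame({"series": ["Model A", "Model B"], "sales": [300, 200]})


def contains_text(frame, column, pattern):
    """Hypothetical helper: filter rows whose column matches pattern."""
    col = frame[column]
    if not is_string_dtype(col):
        # .str only works on string data, so convert other types first.
        col = col.astype(str)
    return frame[col.str.contains(pattern, na=False, case=False)]


print(is_string_dtype(df_demo["series"]))
print(is_numeric_dtype(df_demo["sales"]))
print(contains_text(df_demo, "sales", "30"))
```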
Conditional filtering
# greater than
df[df['January-December sales'] > 50000].values
# equal to
df[df['January-December sales'] == 50000].values
5. Data writing
Add a row of data:
# append at the end; row_datas is a list
df.loc[len(df.index)] = row_datas
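The append above relies on the index being the default 0..n-1 range; if a label equal to len(df.index) already exists, the assignment overwrites that row instead of adding one. The sketch below shows the same append on hypothetical data, followed by pd.concat with ignore_index=True as an alternative that does not depend on the index labels.

```python
import pandas as pd

# Hypothetical frame with the default RangeIndex (0, 1).
df_demo = pd.DataFrame({"series": ["Model A", "Model B"], "sales": [300, 200]})

# Append one row: the new label is the current row count.
row_datas = ["Model C", 100]
df_demo.loc[len(df_demo.index)] = row_datas

# Alternative: build a small frame and concatenate, renumbering the index.
more = pd.DataFrame([["Model D", 50]], columns=df_demo.columns)
df_demo = pd.concat([df_demo, more], ignore_index=True)

print(df_demo)
```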
Insert a column of data
# insert a column of data before the specified column position; the final True allows a duplicate column name
df.insert(col_index, col_name, col_datas, True)
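A short sketch of insert on hypothetical data: the call modifies the DataFrame in place, and without allow_duplicates=True it raises ValueError if the column name already exists.

```python
import pandas as pd

# Hypothetical frame with two columns.
df_demo = pd.DataFrame({"series": ["Model A", "Model B"], "sales": [300, 200]})

# Insert "brand" at position 1, i.e. before the current second column.
df_demo.insert(1, "brand", ["Brand X", "Brand Y"])
print(df_demo.columns.tolist())

# Inserting the same name again fails unless duplicates are allowed.
try:
    df_demo.insert(0, "brand", ["Brand X", "Brand Y"])
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
print(duplicate_rejected)
```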
Update a cell value
# use a single iloc[row, col] call; chained df.iloc[row][col] may assign to a temporary copy
df.iloc[row, col] = 'new-data'
6. Data deletion
Delete a column
df2 = df.drop('official price', axis=1, inplace=False)
print(df2)
# Output:
#      sales ranking               car series  subordinate brand  January-December sales
# 0                1                 Haval H6              Haval                  376864
# 1                2               Honda CR-V              Honda                  249983
# 2                3                    Boyue              Geely                  240811
# 3                4                 Tiguan L         Volkswagen                  178574
# 4                5        Changan CS75 PLUS            Changan                  266824
# ..             ...                      ...                ...                     ...
# 282            283       BAIC New Energy EX    BAIC New Energy                     879
# 283            284              Pentium X40            Pentium                     204
# 284            285  Peugeot 2008 New Energy            Peugeot                      37
# 285            286             Cheetah CS10        Cheetah car                      14
# 286            287                 Senya R7                FAW                       1
#
# [287 rows x 4 columns]
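The column drop can be sketched on hypothetical data as follows. With inplace=False (the default), drop returns a new DataFrame and leaves the original unchanged; the columns= keyword is an equivalent spelling of axis=1.

```python
import pandas as pd

# Hypothetical frame with a price column to remove.
df_demo = pd.DataFrame({
    "series": ["Model A", "Model B"],
    "price": ["9.80", "16.98"],
    "sales": [300, 200],
})

df2 = df_demo.drop("price", axis=1)        # same as drop(columns=["price"])
df3 = df_demo.drop(columns=["price"])

# errors="ignore" skips labels that do not exist instead of raising KeyError.
df4 = df_demo.drop(columns=["no such column"], errors="ignore")

print(df2.columns.tolist())
print(df_demo.columns.tolist())   # the original still has all three columns
```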
Delete a row
df3 = df.drop(2, axis=0, inplace=False)
print(df3)
# Output:
#      sales ranking               car series  official price  subordinate brand  January-December sales
# 0                1                 Haval H6     9.80-154900              Haval                  376864
# 1                2               Honda CR-V    16.98-276800              Honda                  249983
# 3                4                 Tiguan L    21.58-285800         Volkswagen                  178574
# 4                5        Changan CS75 PLUS    10.69-154900            Changan                  266824
# 5                6               Honda XR-V    12.79-175900              Honda                  168272
# ..             ...                      ...             ...                ...                     ...
# 282            283       BAIC New Energy EX    18.39-202900    BAIC New Energy                     879
# 283            284              Pentium X40    no quotation            Pentium                     204
# 284            285  Peugeot 2008 New Energy                            Peugeot                      37
# 285            286             Cheetah CS10     7.98-119800        Cheetah car                      14
# 286            287                 Senya R7     6.69-106900                FAW                       1
#
# [286 rows x 5 columns]
By now you should have a clearer understanding of how to use pandas to analyze Excel data in Python. The best way to consolidate it is to try these operations on your own data.