How to use pandas to analyze Excel data in Python

Updated 2025-07-01. From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains how to use pandas to analyze Excel data in Python. The methods covered are simple, fast, and practical.

1. Installation

Install pandas with pip. Reading and writing .xlsx files also requires an Excel engine such as openpyxl.

pip3 install pandas openpyxl

Import pandas:

import pandas as pd

In the examples below, pd refers to pandas.

2. Read and write files

Read a file, such as an Excel or CSV file:

# df is a pandas.core.frame.DataFrame
df = pd.read_excel('./data/2020-suv.xlsx')
# read_csv can specify the separator, the encoding, and so on
df2 = pd.read_csv('./data/2020-suv.csv')
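As a minimal sketch of the read_csv options mentioned above, the snippet below parses semicolon-separated, GBK-encoded bytes held in memory. The sample rows, the separator, and the encoding are assumptions made for illustration; they are not taken from the 2020-suv files.

```python
import io

import pandas as pd

# Hypothetical sample data: semicolon-separated text encoded as GBK.
raw = "rank;model;sales\n1;哈弗H6;300\n2;Model B;200\n".encode("gbk")

# sep sets the field separator; encoding tells pandas how to decode the bytes.
df_csv = pd.read_csv(io.BytesIO(raw), sep=";", encoding="gbk")

print(type(df_csv))
print(df_csv.shape)
```

If the encoding is wrong, non-ASCII text typically fails to decode or comes out garbled, so it is worth setting explicitly for files exported from Excel on Chinese-language systems.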

Write to a file:

df.to_excel('./data/2020-suv-new.xlsx')
df.to_csv('./data/2020-suv-new.csv')

3. Data operation

all_cols = df.columns
print(all_cols)
# Output (df.columns is an Index, not a list):
# Index(['sales ranking', 'car series', 'official price', 'subordinate brand', 'January-December sales'], dtype='object')
# df.columns can be converted to a list
cols = list(df.columns)
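The write, read, and column-listing steps can be tried end to end without the original spreadsheet. The sketch below uses a small made-up DataFrame and a temporary CSV file (both assumptions for illustration), and passes index=False, an option the article does not use, so the row index is not written out as an extra column.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the SUV sales sheet.
df_demo = pd.DataFrame({
    "car series": ["Model A", "Model B"],
    "January-December sales": [300, 200],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.csv")
    # index=False keeps the row index out of the file.
    df_demo.to_csv(path, index=False)
    df_back = pd.read_csv(path)

# df.columns is a pandas Index; list() turns it into a plain list.
cols_index = df_back.columns
cols_list = list(cols_index)
print(type(cols_index).__name__, cols_list)
```

Without index=False, reading the file back would show an additional unnamed column holding the old index values.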

Get column data

col_data = df['car series']
mul_col_data = df[['car series', 'January-December sales']]
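The two forms of column selection return different types, which the following sketch makes visible. The DataFrame contents are made up for illustration; only the column names follow the article.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the article.
df_demo = pd.DataFrame({
    "car series": ["Model A", "Model B", "Model C"],
    "subordinate brand": ["Brand X", "Brand Y", "Brand X"],
    "January-December sales": [300, 200, 100],
})

# A single label returns a Series.
one_col = df_demo["car series"]
# A list of labels returns a DataFrame, even if the list has one element.
two_cols = df_demo[["car series", "January-December sales"]]

print(type(one_col).__name__)
print(type(two_cols).__name__, two_cols.shape)
```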

Get row data

row_data = df.iloc[row_index]

Get all row data

all_data = df.values

Slice to obtain multiple rows of data

mul_row_data = df.iloc[2:4]
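The row operations above can be sketched on a small made-up DataFrame (the data is an assumption for illustration). The example uses to_numpy(), which the pandas documentation recommends over .values; both return the rows as a NumPy array.

```python
import pandas as pd

# Hypothetical sample data.
df_demo = pd.DataFrame({
    "car series": ["Model A", "Model B", "Model C", "Model D", "Model E"],
    "January-December sales": [500, 400, 300, 200, 100],
})

# One row by position: a Series indexed by column name.
row = df_demo.iloc[1]
# A slice of rows: positions 2 and 3, the end position is excluded.
rows = df_demo.iloc[2:4]
# All rows as a NumPy array (same data as df.values).
all_rows = df_demo.to_numpy()

print(row["car series"])
print(rows["car series"].tolist())
print(all_rows.shape)
```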

Get cell data

# index row and column in one call instead of chaining two lookups
cell_data = df.iloc[row_index, col_index]

4. Data filtering

Filtering is one of the more useful Excel operations, and pandas can do it too. Once the filtering code is saved, it can be reused directly next time.

A field contains the specified value

# Contains a value. na is the result used for missing values and case controls
# case sensitivity. contains also supports regular expressions.
sub_df = df[df[col_name].str.contains('key1', na=False, case=False)]
# Contains multiple values (and): call contains several times
sub_df1 = df[df[col_name].str.contains('key1', na=False, case=False)]
sub_df2 = sub_df1[sub_df1[col_name].str.contains('key2', na=False, case=False)]
# Contains any of several values (or)
sub_df = df[df[col_name].str.contains('key1|key2|key3', na=False, case=False)]
# Does not contain, i.e. filter the matches out
sub_df = df[~df[col_name].str.contains('key1', na=False, case=False)]
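A runnable sketch of the contains filters above, on made-up data that includes a missing value. As a variation on chaining two calls, the "and" case here combines two boolean masks with &; that is an alternative chosen for illustration, not the article's method.

```python
import pandas as pd

# Hypothetical sample data, including one missing value.
df_demo = pd.DataFrame(
    {"car series": ["Honda CR-V", "Honda XR-V", "Changan CS75 PLUS", None]}
)
names = df_demo["car series"]

# Contains a value, ignoring case; missing values count as "no match".
has_honda = df_demo[names.str.contains("honda", na=False, case=False)]

# Contains both values: combine two boolean masks with &.
has_both = df_demo[
    names.str.contains("honda", na=False, case=False)
    & names.str.contains("xr", na=False, case=False)
]

# Contains any of several values: regex alternation, no spaces around |.
has_either = df_demo[names.str.contains("cr-v|cs75", na=False, case=False)]

# Does not contain: negate the mask with ~.
not_honda = df_demo[~names.str.contains("honda", na=False, case=False)]

print(has_honda.index.tolist())
print(has_both.index.tolist())
print(has_either.index.tolist())
print(not_honda.index.tolist())
```

Because na=False maps missing values to False, the row with the missing name is excluded by the positive filters and kept by the negated one.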

The operations above all assume that the field holds strings; otherwise the .str accessor raises an exception. You can check whether a field is a string type as follows:

pd.api.types.is_string_dtype(df['car series'])
# Other types have similar functions. Use dir to see which checks are available
print(dir(pd.api.types))
# You can also view the field types through dtypes
df.dtypes
df['January-December sales'].dtype
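One way to combine the type check above with a contains filter is a small guard function. The helper name contains_safely and the fallback of converting non-string columns with astype(str) are assumptions for illustration, not part of pandas or of the article.

```python
import pandas as pd

# Hypothetical sample data.
df_demo = pd.DataFrame({
    "car series": ["Model A", "Model B"],
    "January-December sales": [300, 200],
})


def contains_safely(frame, col_name, key):
    """Hypothetical helper: contains-filter that also works on non-string columns."""
    col = frame[col_name]
    if not pd.api.types.is_string_dtype(col):
        # Assumption: comparing against the text form of the values is acceptable.
        col = col.astype(str)
    return frame[col.str.contains(key, na=False, case=False)]


by_name = contains_safely(df_demo, "car series", "model b")
by_digits = contains_safely(df_demo, "January-December sales", "30")

print(by_name.index.tolist())
print(by_digits.index.tolist())
```

Note that astype(str) also turns missing values into text such as "nan", so this fallback suits quick exploration more than careful cleaning.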

Conditional filtering

# greater than
df[df['January-December sales'] > 50000].values
# equal to
df[df['January-December sales'] == 50000].values

5. Data writing

Add a row of data:

# append at the end; row_datas is a list with one value per column
df.loc[len(df.index)] = row_datas

Insert a column of data

# insert a column of data before the specified column;
# the last argument is allow_duplicates
df.insert(col_index, col_name, col_datas, True)
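The row append and column insert above can be sketched as follows, on made-up data. The append relies on the DataFrame having the default 0..n-1 integer index, so that len(df.index) is the next unused label; with a different index, the same assignment could overwrite an existing row.

```python
import pandas as pd

# Hypothetical sample data with the default integer index.
df_demo = pd.DataFrame({
    "car series": ["Model A", "Model B"],
    "January-December sales": [300, 200],
})

# Append a row: the list needs one value per column, in column order.
df_demo.loc[len(df_demo.index)] = ["Model C", 100]

# Insert a new column at position 1; the values need one entry per row.
df_demo.insert(1, "subordinate brand", ["Brand X", "Brand Y", "Brand X"])

print(df_demo.columns.tolist())
print(df_demo.shape)
```

Unlike most DataFrame methods, insert modifies the DataFrame in place and returns None.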

Update a cell value

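# assign through a single iloc call; chained indexing such as
# df.iloc[row][col] = ... may modify a temporary copy instead of df
df.iloc[row, col] = 'new-data'

6. Data deletion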

Delete a column

df2 = df.drop('official price', axis=1, inplace=False)
print(df2)
# Output (remaining rows omitted):
#     sales ranking         car series subordinate brand  January-December sales
# 0               1           Haval H6             Haval                  376864
# 1               2         Honda CR-V             Honda                  249983
# 2               3              Boyue             Geely                  240811
# 3               4           Tiguan L        Volkswagen                  178574
# 4               5  Changan CS75 PLUS           Changan                  266824
# ..            ...                ...               ...                     ...
# [287 rows x 4 columns]
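A minimal sketch of the column deletion above, on made-up data. It also shows the columns= keyword, an equivalent spelling that the article does not use, and confirms that without inplace=True the original DataFrame is left unchanged.

```python
import pandas as pd

# Hypothetical sample data; only the column names follow the article.
df_demo = pd.DataFrame({
    "car series": ["Model A", "Model B"],
    "official price": ["10-15", "12-18"],
    "January-December sales": [300, 200],
})

# drop returns a new DataFrame; df_demo itself is not modified.
dropped = df_demo.drop("official price", axis=1)
# Equivalent keyword form, which avoids having to remember the axis number.
same = df_demo.drop(columns=["official price"])

print(dropped.columns.tolist())
print(df_demo.columns.tolist())
```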

Delete a row

df3 = df.drop(2, axis=0, inplace=False)
print(df3)
# Output (remaining rows omitted; official price is in units of 10,000 yuan):
#     sales ranking         car series official price subordinate brand  January-December sales
# 0               1           Haval H6    9.80-15.49万             Haval                  376864
# 1               2         Honda CR-V   16.98-27.68万             Honda                  249983
# 3               4           Tiguan L   21.58-28.58万        Volkswagen                  178574
# 4               5  Changan CS75 PLUS   10.69-15.49万           Changan                  266824
# 5               6         Honda XR-V   12.79-17.59万             Honda                  168272
# ..            ...                ...            ...               ...                     ...
# [286 rows x 5 columns]

That covers the basics of using pandas to analyze Excel data in Python. The best way to absorb it is to try the operations yourself.
