Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to split Excel and merge Excel in batch by Pandas

2025-03-26 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Share

Shulou(Shulou.com)06/01 Report--

This article mainly introduces the relevant knowledge of "how Pandas splits Excel and merges Excel in batches". The editor shows you the operation process through actual cases. The operation method is simple, fast and practical. I hope this article "how to split Excel and merge Excel in batches by Pandas" can help you solve the problem.

1. Demonstration of examples

1. Split a large Excel into multiple Excel

two。 Merge multiple small Excel into one large Excel and mark the source

Work_dir= ". / course_datas/c15_excel_split_merge" splits_dir=f "{work_dir} / splits" import osif not os.path.exists (splits_dir): os.mkdir (splits_dir) II. Read the source Excel to Pandasimport pandas as pddf_source = pd.read_excel (f "{work_dir} / crazyant_blog_articles_source.xlsx") df_source.head () id title tags

0 2585 Tensorflow how to receive variable length list feature python,tensorflow, feature engineering

1 2583 Pandas to achieve data merging concat pandas,python, data analysis

What is the purpose of the Index index of 2 2574 Pandas? Pandas,python, data analysis

3 2564 data sets commonly used in machine learning: python, machine learning

4 2561 A data scientist's training path data analysis df_source.indexRangeIndex (start=0, stop=258, step=1) df_source.shape

(258,3)

Total_row_count = df_source.shape [0] total_row_count

two hundred and fifty eight

Third, split a large Excel into multiple Excel

1. Using the df.iloc method, split a large dataframe into several small dataframe

two。 Each small Excel will be saved using dataframe.to_excel

1. Calculate the number of rows for each excel after the split

# this big excel will be split to these people user_names = ["xiao_shuai", "xiao_wang", "xiao_ming", "xiao_lei", "xiao_bo", "xiao_hong"] # the number of tasks per person split_size = total_row_count / / len (user_names) if total_row_count% len (user_names)! = 0: split_size + = 1split_size

forty-three

2. Split into multiple dataframe

Df_subs = [] for idx, user_name in enumerate (user_names): # iloc start index begin = idx*split_size # iloc end index end = begin+split_size # implementation df splits df_sub = df_ source.ilocs according to iloc [begin: end] # saves each child df in the list df_subs.append ((idx, user_name, df_sub))

3. Save each datafame into excel

For idx, user_name, df_sub in df_subs: file_name = f "{splits_dir} / crazyant_blog_articles_ {idx} _ {user_name} .xlsx" df_sub.to_excel (file_name, index=False) IV. Merge multiple small Excel into one large Excel

1. Traverse the folder to get a list of Excel files to merge

two。 Read to the dataframe separately and add a column to each df to mark the source

3. Batch merge of df using pd.concat

4. Output the merged dataframe to excel

1. Iterate through the folder to get a list of Excel names to merge

Import osexcel_names = [] for excel_name in os.listdir (splits_dir): excel_names.append (excel_name) excel_names

['crazyant_blog_articles_0_xiao_shuai.xlsx'

'crazyant_blog_articles_1_xiao_wang.xlsx'

'crazyant_blog_articles_2_xiao_ming.xlsx'

'crazyant_blog_articles_3_xiao_lei.xlsx'

'crazyant_blog_articles_4_xiao_bo.xlsx'

'crazyant_blog_articles_5_xiao_hong.xlsx']

two。 Read to dataframe respectively

Df_list = []

For excel_name in excel_names: # read each excel to df excel_path = f "{splits_dir} / {excel_name}" df_split = pd.read_excel (excel_path) # get username username = excel_name.replace ("crazyant_blog_articles_", ") .replace (" .xlsx ",") [2:] print (excel_name, username) # add one column to each df That is, user name df_split ["username"] = username df_list.append (df_split)

Crazyant_blog_articles_0_xiao_shuai.xlsx xiao_shuai

Crazyant_blog_articles_1_xiao_wang.xlsx xiao_wang

Crazyant_blog_articles_2_xiao_ming.xlsx xiao_ming

Crazyant_blog_articles_3_xiao_lei.xlsx xiao_lei

Crazyant_blog_articles_4_xiao_bo.xlsx xiao_bo

Crazyant_blog_articles_5_xiao_hong.xlsx xiao_hong

3. Merge using pd.concat

Df_merged = pd.concat (df_list) df_merged.shape

(258,4)

Df_merged.head ()

Id title tags username

0 2585 Tensorflow how to receive variable length list feature python,tensorflow, feature engineering xiao_shuai

1 2583 Pandas to achieve data merging concat pandas,python, data analysis xiao_shuai

What is the purpose of the Index index of 2 2574 Pandas? Pandas,python, data analysis xiao_shuai

3 2564 data sets commonly used in machine learning Daquan python, machine learning xiao_shuai

4 2561 A data scientist's training path data analysis xiao_shuai

Df_merged ["username"] .value_counts ()

Xiao_hong 43

Xiao_bo 43

Xiao_shuai 43

Xiao_lei 43

Xiao_wang 43

Xiao_ming 43

Name: username, dtype: int64

4. Output the merged dataframe to excel

That's all for df_merged.to_excel (f "{work_dir} / crazyant_blog_articles_merged.xlsx", index=False) about "how Pandas splits Excel and merges Excel in batches". Thank you for reading. If you want to know more about the industry, you can follow the industry information channel. The editor will update different knowledge points for you every day.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Development

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report