How to Split and Merge Excel Files in Batches in Pandas Data Analysis

Updated 2025-07-19. From: SLTechnology News&Howtos (Development)

Shulou(Shulou.com)06/03 Report--

This article explains how to use Pandas to split one large Excel file into several smaller ones and to merge several Excel files back into one, in batches. The approach is practical, so it is shared here as a reference.

I. Preparing the data

    import os
    import pandas as pd

    work_dir = "./datas"
    splits_dir = f"{work_dir}/splits"

    # Create the output folder for the split files if it does not exist yet
    if not os.path.exists(splits_dir):
        os.mkdir(splits_dir)

    # 0. Read the source Excel file into Pandas
    df_source = pd.read_excel(f"{work_dir}/1.xlsx")

    # In a notebook, each of the following expressions displays its value
    df_source.head()
    df_source.index
    df_source.shape

    total_row_count = df_source.shape[0]
    total_row_count
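The listing above assumes that ./datas/1.xlsx already exists; the article does not show how that file was produced. If no such file is at hand, a stand-in table could be generated along the following lines. This is only a sketch: the function name make_sample_frame and the columns id and title are invented for illustration and are not taken from the original data.

```python
import pandas as pd


def make_sample_frame(n_rows: int = 20) -> pd.DataFrame:
    """Build a small stand-in table (hypothetical columns, not the article's data)."""
    ids = list(range(1, n_rows + 1))
    return pd.DataFrame({
        "id": ids,
        "title": [f"article {i}" for i in ids],
    })


df_sample = make_sample_frame(20)
print(df_sample.shape)  # (20, 2)

# To use it as the source file for the rest of the article (this needs an
# Excel writer such as openpyxl installed, and the ./datas folder must exist):
# df_sample.to_excel("./datas/1.xlsx", index=False)
```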

II. Program demonstration

1. Split one large Excel file into multiple Excel files

Use df.iloc to split the large DataFrame into several smaller DataFrames.

Save each small DataFrame to its own Excel file with DataFrame.to_excel.
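The slicing arithmetic behind this split can be checked on a small in-memory DataFrame without touching any files. The sketch below uses a hypothetical helper, split_frame, that rounds the rows-per-part up and slices with iloc. It also shows one consequence of rounding up: when the row count is small relative to the number of parts, the last parts can come out short or even empty.

```python
import pandas as pd


def split_frame(df: pd.DataFrame, n_parts: int) -> list:
    """Split df into n_parts consecutive row slices using iloc (illustrative helper)."""
    # Ceiling division: rows per part, rounded up
    split_size = -(-len(df) // n_parts)
    # iloc tolerates slice bounds past the end, so trailing parts may be short or empty
    return [df.iloc[i * split_size:(i + 1) * split_size] for i in range(n_parts)]


df_demo = pd.DataFrame({"value": list(range(10))})

sizes_4 = [len(part) for part in split_frame(df_demo, 4)]
sizes_6 = [len(part) for part in split_frame(df_demo, 6)]
print(sizes_4)  # [3, 3, 3, 1]
print(sizes_6)  # [2, 2, 2, 2, 2, 0]
```

If empty parts are undesirable, one option is to skip any part whose length is zero before writing it out.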

    # 1. Calculate how many rows each Excel file gets after the split
    # The large Excel file will be split among these people
    user_names = ["xiao_shuai", "xiao_wang", "xiao_ming", "xiao_lei", "xiao_bo", "xiao_hong"]

    # Number of rows per person, rounded up
    split_size = total_row_count // len(user_names)
    if total_row_count % len(user_names) != 0:
        split_size += 1
    split_size

    # 2. Split into multiple DataFrames
    df_subs = []
    for idx, user_name in enumerate(user_names):
        # iloc start index
        begin = idx * split_size
        # iloc end index
        end = begin + split_size
        # Slice the source DataFrame with iloc
        df_sub = df_source.iloc[begin:end]
        # Store each sub-DataFrame in the list
        df_subs.append((idx, user_name, df_sub))

    # 3. Save each DataFrame to Excel
    for idx, user_name, df_sub in df_subs:
        file_name = f"{splits_dir}/articles_{idx}_{user_name}.xlsx"
        df_sub.to_excel(file_name, index=False)

2. Merge multiple small Excel files into one large Excel file

Traverse the folder to get the list of Excel files to merge.

Read each file into its own DataFrame and add a column to each one to mark its source.

Merge the DataFrames in one step with pd.concat.

Write the merged DataFrame to Excel.
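The tag-then-concatenate idea in these steps can be tried on in-memory DataFrames first. The sketch below uses made-up frames and a made-up title column; only the username column name and the use of pd.concat come from the article. It passes ignore_index=True, which is an optional addition here, so that the merged result gets a fresh 0..n-1 index instead of repeating each part's own index.

```python
import pandas as pd

# Stand-ins for the DataFrames read from the individual Excel files (invented data)
frames = {
    "xiao_shuai": pd.DataFrame({"title": ["a", "b"]}),
    "xiao_wang": pd.DataFrame({"title": ["c"]}),
}

tagged = []
for username, df_part in frames.items():
    df_part = df_part.copy()          # avoid modifying the original frame
    df_part["username"] = username    # mark where each row came from
    tagged.append(df_part)

# Stack all parts vertically; ignore_index renumbers the rows from 0
df_merged_demo = pd.concat(tagged, ignore_index=True)

print(df_merged_demo.shape)  # (3, 2)
print(df_merged_demo["username"].value_counts().to_dict())
```

When the result is written with index=False, as in the article, the index values never reach the output file, so ignore_index mainly matters for any further work done on the merged DataFrame in memory.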

    # 1. Traverse the folder to get the list of Excel file names to merge
    import os
    excel_names = []
    for excel_name in os.listdir(splits_dir):
        excel_names.append(excel_name)
    excel_names

    # 2. Read each file into a DataFrame
    df_list = []
    for excel_name in excel_names:
        # Read each Excel file into a DataFrame
        excel_path = f"{splits_dir}/{excel_name}"
        df_split = pd.read_excel(excel_path)
        # Get the user name: drop the "articles_" prefix and ".xlsx" suffix,
        # then skip the leading index and underscore (assumes a one-digit index)
        username = excel_name.replace("articles_", "").replace(".xlsx", "")[2:]
        print(excel_name, username)
        # Add one column to each DataFrame holding the user name
        df_split["username"] = username
        df_list.append(df_split)

    # 3. Merge with pd.concat
    df_merged = pd.concat(df_list)
    df_merged.shape
    df_merged.head()
    df_merged["username"].value_counts()

    # 4. Write the merged DataFrame to Excel
    df_merged.to_excel(f"{work_dir}/result_merged.xlsx", index=False)
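The slice [2:] in the listing above drops exactly two characters, so it relies on the index in the file name being a single digit; with a name such as articles_10_xiao_bo.xlsx it would leave "_xiao_bo". Separately, os.listdir makes no promise about the order of the names it returns. A possible alternative, sketched below with a hypothetical helper parse_split_name, splits the name on underscores instead and sorts the files by their numeric index.

```python
import os


def parse_split_name(file_name: str):
    """Return (index, user_name) for names shaped like articles_{idx}_{user_name}.xlsx."""
    stem, ext = os.path.splitext(file_name)
    # Split at most twice so that underscores inside the user name are kept
    prefix, idx, user_name = stem.split("_", 2)
    if prefix != "articles" or ext != ".xlsx":
        raise ValueError(f"unexpected file name: {file_name}")
    return int(idx), user_name


# Example names in arbitrary order (invented for illustration)
names = ["articles_10_xiao_bo.xlsx", "articles_2_xiao_ming.xlsx", "articles_0_xiao_shuai.xlsx"]

# Sort by the numeric index rather than by string order
ordered = sorted(names, key=lambda name: parse_split_name(name)[0])

print(parse_split_name("articles_10_xiao_bo.xlsx"))  # (10, 'xiao_bo')
print(ordered)
```

Names that do not follow the expected pattern raise ValueError, which also makes stray files in the folder easy to spot.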

Thank you for reading! That concludes this article on how to split and merge Excel files in batches with Pandas. I hope it is of some help; if you found it useful, feel free to share it so that more people can see it.

