How to merge Excel files with Python

2025-07-12 Update. From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains in detail how to merge Excel files with Python. The steps are laid out clearly and the details are handled with care, so I hope it answers your questions. Follow along and work through the steps one by one.

I. Merging data in a single directory

Merge all CSV files under the 2020 directory into one file:

    import glob

    # Collect every CSV file in the 2020 directory
    csv_list = glob.glob(r'D:\Python\03DataAcquisition\COVID-19\2020\*.csv')
    print("%s" % len(csv_list))

    # Append mode: delete any earlier output file before re-running
    with open('../output/covid19temp0314.csv', 'ab') as f:
        for n, i in enumerate(csv_list):
            with open(i, "rb") as src:
                if n > 0:
                    src.readline()  # except for the first data file, skip the header line
                f.write(src.read())
    print('data synthesis complete!')
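Appending files byte by byte only produces a valid CSV when every file has the same header row. The following is a minimal sketch of a check that could run before the merge, using only the standard library. `headers_match` is a hypothetical helper name that does not appear in the original script, and the sample files exist only for the demonstration.

```python
import csv
import os
import tempfile


def headers_match(paths, encoding="utf-8-sig"):
    """Return True if every CSV in `paths` has the same header row."""
    first = None
    for p in paths:
        with open(p, newline="", encoding=encoding) as f:
            header = next(csv.reader(f), None)  # None for an empty file
        if first is None:
            first = header
        elif header != first:
            return False
    return True


# Demonstration on throwaway files (hypothetical names and columns).
tmp = tempfile.mkdtemp()
a = os.path.join(tmp, "a.csv")
b = os.path.join(tmp, "b.csv")
c = os.path.join(tmp, "c.csv")
for path, text in [(a, "x,y\n1,2\n"), (b, "x,y\n3,4\n"), (c, "x,z\n5,6\n")]:
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

same = headers_match([a, b])       # identical headers
different = headers_match([a, c])  # second column name differs
print(same, different)
```

If the check fails, a column-aware merge (reading each file into a DataFrame) is likely the safer route than concatenating bytes.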

Merged data:

II. Using a function to merge data

    # 02 use functions for data merging
    import os
    import pandas as pd

    # Recursive function: collect the non-empty CSV files two levels below `parent`
    def mergeFile(parent, path="", pathdeep=0, filelist=None):
        if filelist is None:  # avoid sharing one default list between calls
            filelist = []
        fileAbsPath = os.path.join(parent, path)
        if os.path.isdir(fileAbsPath):
            # pathdeep == 0 is the starting directory itself, so it is not printed
            if pathdeep != 0 and '.ipynb_checkpoints' not in str(fileAbsPath):
                print('--' + path)
            for filename2 in os.listdir(fileAbsPath):
                mergeFile(fileAbsPath, filename2, pathdeep + 1, filelist)
        else:
            if pathdeep == 2 and path.endswith(".csv") and os.path.getsize(fileAbsPath) > 0:
                filelist.append(fileAbsPath)
        return filelist

    # D:\Python\03DataAcquisition\COVID-19
    path = input("Please enter the directory where the data file is located:")
    filelist = mergeFile(path)
    print(filelist)

    # Read every file, then combine once (DataFrame.append was removed in pandas 2.0)
    frames = [pd.read_csv(m, encoding='utf-8-sig') for m in filelist]
    csvdatadf = pd.concat(frames) if frames else pd.DataFrame()
    # The data for 2023 is not yet available, so it is not merged.
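The recursive function above keeps the non-empty CSV files that sit exactly one directory below the starting directory (for example `2020\*.csv`). As a sketch, the same selection could be written with `pathlib`. `find_csv_files` is a hypothetical name, and the directory layout in the demonstration is made up.

```python
import tempfile
from pathlib import Path


def find_csv_files(root):
    """Return non-empty CSV files located exactly one directory below `root`."""
    root = Path(root)
    return sorted(
        p for p in root.glob("*/*.csv")
        if p.is_file() and p.stat().st_size > 0
    )


# Demonstration on a throwaway directory tree (hypothetical layout).
base = Path(tempfile.mkdtemp())
(base / "2020").mkdir()
(base / "2021").mkdir()
(base / "2020" / "01.csv").write_text("a,b\n1,2\n", encoding="utf-8")
(base / "2021" / "01.csv").write_text("a,b\n3,4\n", encoding="utf-8")
(base / "2021" / "empty.csv").write_text("", encoding="utf-8")  # skipped: size 0
(base / "top.csv").write_text("a,b\n", encoding="utf-8")         # skipped: wrong depth

found = [p.relative_to(base).as_posix() for p in find_csv_files(base)]
print(found)
```

The glob pattern replaces the explicit depth counter, which leaves less state to track than the recursive version.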

Note: this step takes a while to run, because there are more than 190 million rows of data.
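With this many rows, holding every file in memory at once may become the bottleneck. One possible approach, sketched here on the assumption that all files share the same columns, is to stream each file in chunks with the `chunksize` parameter of `pandas.read_csv` and append to the output as you go. `merge_in_chunks` and the chunk size are illustrative choices that do not come from the original script.

```python
import os
import tempfile

import pandas as pd


def merge_in_chunks(paths, out_path, chunksize=100_000):
    """Append the rows of each CSV to `out_path`, one chunk at a time."""
    rows = 0
    first = True
    for p in paths:
        for chunk in pd.read_csv(p, encoding="utf-8-sig", chunksize=chunksize):
            chunk.to_csv(
                out_path,
                mode="w" if first else "a",   # start fresh, then append
                header=first,                 # write the header only once
                index=False,
                encoding="utf-8-sig" if first else "utf-8",  # BOM only at the start
            )
            first = False
            rows += len(chunk)
    return rows


# Demonstration on throwaway files (invented data).
tmp = tempfile.mkdtemp()
p1 = os.path.join(tmp, "a.csv")
p2 = os.path.join(tmp, "b.csv")
pd.DataFrame({"x": [1, 2, 3]}).to_csv(p1, index=False, encoding="utf-8-sig")
pd.DataFrame({"x": [4, 5]}).to_csv(p2, index=False, encoding="utf-8-sig")

out = os.path.join(tmp, "merged.csv")
total = merge_in_chunks([p1, p2], out, chunksize=2)
merged = pd.read_csv(out, encoding="utf-8-sig")
print(total, merged["x"].tolist())
```

Only one chunk is in memory at a time, at the cost of more write calls.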

Save the merged data:

    csvdatadf.to_csv("covid190314.csv", index=False, encoding='utf-8-sig')
    # Read the file back and inspect its columns and row count
    csvdatadf = pd.read_csv("covid190314.csv", encoding='utf-8-sig')
    csvdatadf.info()
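The `utf-8-sig` encoding used above writes a byte order mark (BOM) at the start of the file, which helps Excel recognize the file as UTF-8 when it contains non-ASCII text. Here is a small sketch that shows the effect. The file names and sample data are made up for the demonstration.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"Province_State": ["Hubei"], "Confirmed": [1]})
tmp = tempfile.mkdtemp()
with_bom = os.path.join(tmp, "with_bom.csv")
without_bom = os.path.join(tmp, "without_bom.csv")

df.to_csv(with_bom, index=False, encoding="utf-8-sig")
df.to_csv(without_bom, index=False, encoding="utf-8")

BOM = b"\xef\xbb\xbf"
with open(with_bom, "rb") as f:
    starts_with_bom = f.read(3) == BOM
with open(without_bom, "rb") as f:
    plain_has_bom = f.read(3) == BOM

# Reading with utf-8-sig strips the BOM, so the first column name stays clean.
reloaded = pd.read_csv(with_bom, encoding="utf-8-sig")
print(starts_with_bom, plain_has_bom, list(reloaded.columns))
```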

Read the earlier COVID-19 epidemic data (the file before20201111.csv):

    beforedf = pd.read_csv(r'D:\Python\03DataAcquisition\COVID-19\before20201111.csv', encoding='utf-8-sig')
    beforedf.info()

Merge two sets of data:

    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
    tempalldf = pd.concat([beforedf, csvdatadf])
    tempalldf.head()
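The code later in this article addresses the older data through `Country/Region` and `Province/State` and the newer data through `Country_Region` and `Province_State`, which suggests the two sets use different column names. If the frames are combined as they are, pandas keeps both spellings as separate columns and fills the gaps with NaN. The following is a hedged sketch of renaming the old columns first. The mapping covers only the two columns named in this article, and the sample rows are invented.

```python
import pandas as pd

# Invented sample rows standing in for beforedf and csvdatadf.
before = pd.DataFrame(
    {"Province/State": ["Hubei"], "Country/Region": ["China"], "Confirmed": [444]}
)
after = pd.DataFrame(
    {"Province_State": ["Hubei"], "Country_Region": ["China"], "Confirmed": [67800]}
)

# Combining as-is keeps both spellings: five columns, half of them NaN.
naive = pd.concat([before, after], ignore_index=True)

# Assumed mapping from the old column names to the new ones.
rename_map = {"Province/State": "Province_State", "Country/Region": "Country_Region"}
combined = pd.concat([before.rename(columns=rename_map), after], ignore_index=True)

print(naive.shape, combined.shape)
print(list(combined.columns))
```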

III. Processing the data for Hong Kong, Macao and Taiwan

Where Country_Region is Hong Kong, change it to China and record Hong Kong in Province_State instead. The same applies to Macao and Taiwan:

Find data about Taiwan:

    # Look at the rows first: exact match, then substring match
    beforedf.loc[beforedf['Country/Region'] == 'Taiwan']
    beforedf.loc[beforedf['Country/Region'].str.contains('Taiwan', na=False)]
    # Move the region name into Province/State, then set the country to China
    beforedf.loc[beforedf['Country/Region'].str.contains('Taiwan', na=False), 'Province/State'] = 'Taiwan'
    beforedf.loc[beforedf['Province/State'] == 'Taiwan', 'Country/Region'] = 'China'
    beforedf.loc[beforedf['Province/State'] == 'Taiwan']

Data processing in Hong Kong:

    beforedf.loc[beforedf['Country/Region'].str.contains('Hong Kong', na=False), 'Province/State'] = 'Hong Kong'
    beforedf.loc[beforedf['Province/State'] == 'Hong Kong', 'Country/Region'] = 'China'
    # afterdf is not defined in the steps above; it is assumed to be the newer
    # data merged in section II (for example, afterdf = csvdatadf)
    afterdf.loc[afterdf['Country_Region'].str.contains('Hong Kong', na=False), 'Province_State'] = 'Hong Kong'
    afterdf.loc[afterdf['Province_State'] == 'Hong Kong', 'Country_Region'] = 'China'

Data processing in Macau:

    beforedf.loc[beforedf['Country/Region'].str.contains('Macau', na=False), 'Province/State'] = 'Macau'
    beforedf.loc[beforedf['Province/State'] == 'Macau', 'Country/Region'] = 'China'
    afterdf.loc[afterdf['Country_Region'].str.contains('Macau', na=False), 'Province_State'] = 'Macau'
    afterdf.loc[afterdf['Province_State'] == 'Macau', 'Country_Region'] = 'China'
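The three blocks above repeat the same two assignments for each region and each naming scheme. As a sketch, they could be folded into one helper. `move_region_under_china` is a hypothetical name, and the sample frame is invented for the demonstration.

```python
import pandas as pd


def move_region_under_china(df, region, country_col, province_col):
    """Record `region` as a province and set its country to 'China' (in place)."""
    # Step 1: rows whose country mentions the region get it as their province.
    mask = df[country_col].str.contains(region, na=False, regex=False)
    df.loc[mask, province_col] = region
    # Step 2: every row with that province is assigned to China.
    df.loc[df[province_col] == region, country_col] = "China"
    return df


# Invented sample rows using the older column names.
sample = pd.DataFrame({
    "Province/State": [None, None, "Hubei", None],
    "Country/Region": ["Taiwan*", "Hong Kong SAR", "China", "Japan"],
})
for name in ["Taiwan", "Hong Kong", "Macau"]:
    move_region_under_china(sample, name, "Country/Region", "Province/State")
print(sample)
```

The same call would work for the newer data by passing `"Country_Region"` and `"Province_State"` as the column names.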

Finally, save the cleaned data:

    beforedf.to_csv("beforedf0314.csv", index=False, encoding='utf-8-sig')
    afterdf.to_csv("afterdf0314.csv", index=False, encoding='utf-8-sig')

That concludes this introduction to merging Excel files with Python. To really master the material, you need to practice it and apply it yourself.
