How to use Python to read Word table, calculate summary and write to Excel 04/11 Update SLTechnology News&Howtos

How to use Python to read Word table, calculate summary and write to Excel

2025-04-11 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Shulou(Shulou.com)06/01 Report--

This article will explain in detail how to use Python to read the Word table calculation summary and write it into Excel, the content of the article is of high quality, so the editor shares it for you to do a reference. I hope you will have a certain understanding of the relevant knowledge after reading this article.

Preface

It's almost Chinese New year, and it's time for the company to rate at the end of the year. For this year's rating and usual, everyone has to fill in the company's democratic evaluation form, evaluate each colleague, and then the department collects it and makes statistics according to the evaluation form collected. Think of collecting dozens of people's evaluation form, and according to each person's evaluation form to fill in the Excel to calculate statistics to give each person's evaluation, the head is big. Although it is not a difficult task, it is a mindless and meticulous work. It takes more than half a day to get dozens of people's comments, and it's easy to make mistakes if you make too many comments. If so, I think of simply using Python to write a Mini Program to automatically deal with these dirty work, rating to be reviewed every year, can be used every year.

What needs to be done is to read the evaluation form in the word document placed in a folder, process it according to the contents of the table, then summarize all the table data, calculate everyone's score according to the calculation rules, and calculate each person's evaluation according to the score. After summary, it is written into Excel.

There is no denying that it is too convenient to use Python to achieve such things. Life is too short. I use Python.

I use python's docx package to process word, use pandas to process data and write to excel

First, import the package pip install docxpip install pandas

Pandas writes excel depends on the openpyxl package, so it also imports

Pip install openpyxl II. Read the directory file where the evaluation form is located.

Through python's os package, list the files in the folder and identify the .docx files.

Files=os.listdir (filepah) for file in files: if file.find ('.docx') > 0: docfilepah=filepah+file III. Read the word file, process the table data in word data= [] # read the docx comment table file of word And read the tabular data in word def procdoc (docfilepath): document=Document (docfilepath) tables=document.tables table=tables [0] for i in range (1MagneLen (table.rows)): id=int (table.cell (iMagin0) .text) name=table.cell (iMagin1). Text excellent=0 if table.cell (iMagin2). Textbook recording 'and table.cell (I) 2). Text is not None: excellent=1 competent=0 if table.cell (I, 3). Text is not None: competent=1 basicacompetent=0 if table.cell (I, 4). Text is not None: basicacompetent=1 notcompetent = 0 if table.cell (I, 4). 5). And table.cell (I, 5). Text is not None: notcompetent=1 dontunderstand= 0 if table.cell (I, 6). Text is not None: dontunderstand=1 appraisedata= [id,name,excellent,competent,basicacompetent,notcompetent,dontunderstand] data.append (appraisedata).

The statistical calculation of the data is carried out directly through pandas to avoid the traditional cyclic calculation.

Df= pd.DataFrame (data,columns= ['serial number', 'name', 'excellent', 'competent', 'basic competent', 'incompetent','do not know']) df=df.groupby (['serial number', 'name']). Sum () # summarizes each person's score df ['votes'] = df.apply (lambda x: x.sum ()) Axis=1) # count votes df ['score'] = (df ['excellent'] * 95+df ['competent'] * 85+df ['basically competent'] * 75+df ['incompetent'] * 65+df ['do not know'] * 0) / len (df) # score according to rules df ['evaluate'] = df ['score'] .map (getscore) # evaluate ratings according to rules

Scoring method: Democratic evaluation score = Σ votes of each grade * grade score / total number of votes, including "excellent" 95 points, "competent" 85 points, "basic competent" 75 points, "incompetent" 65 points, "do not understand" does not score.

# calculate the rating def getscore (x): if x > = 95: score=' excellent 'elif x > = 80 and x rating 75 and x aggregate calculation data-> write Excel.

The complete code is as follows:

Import osimport pandas as pdfrom docx import Documentdata= [] # read the docx comment table file of word And read the tabular data in word def procdoc (docfilepath): document=Document (docfilepath) tables=document.tables table=tables [0] for i in range (1MagneLen (table.rows)): id=int (table.cell (iMagin0) .text) name=table.cell (iMagin1). Text excellent=0 if table.cell (iMagin2). Textbook recording 'and table.cell (I) 2). Text is not None: excellent=1 competent=0 if table.cell (I, 3). Text is not None: competent=1 basicacompetent=0 if table.cell (I, 4). Text is not None: basicacompetent=1 notcompetent = 0 if table.cell (I, 4). 5). And table.cell (I, 5). Text is not None: notcompetent=1 dontunderstand= 0 if table.cell (I, 6). And table.cell (I, 6). Text is not None: dontunderstand=1 appraisedata= [id,name,excellent,competent,basicacompetent,notcompetent,dontunderstand] data.append (appraisedata) # the directory where the comment table is read And process the docx file in the directory, calculate the score according to the evaluation table, and write it into the summary table. Def readfile (filepah): files=os.listdir (filepah) for file in files: if file.find ('.docx') > 0: docfilepah=filepah+file procdoc (docfilepah) df= pd.DataFrame (data,columns= ['serial number', 'name', 'excellent', 'competent', 'basic competent', 'incompetent', 'unknown']) print (df) df=df.groupby (['serial number') 'name']. Sum () df ['votes'] = df.apply (lambda x: x.sum () Axis=1) df ['scoring'] = (df ['excellent'] * 95+df ['competent'] * 85+df ['basic competence'] * 75+df ['incompetent'] * 65+df ['do not know'] * 0) / len (df) df ['Evaluation'] = df ['scoring'] .map (getscore) print (df) write2excle ('Democratic Review\\ Democratic Evaluation Table Summary .xlsx' Df) # calculate the rating def getscore (x): if x > = 95: score=' excellent 'elif x > = 80 and x rating 75 and x according to the scoring rules

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.