How to use Python to realize Office Automation

2025-07-06 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains how to use Python to automate office work. The method introduced here is simple, fast, and practical. Let's walk through it step by step.

Suppose we have a Word document like this.

It contains nearly 2,600 table entries in a similar format, each holding the following information:

Date

Sending unit

Document number

Title

Sign-off bar

Three of these items (the date, the document number, and the title) need to be extracted and stored in an Excel sheet with the following layout:

That is, the time of receipt, the file title, and the document number must be filled into the specified columns, and the time must also be converted to a standard format. If everything is copied and edited by hand at about 10 seconds per entry, only 6 entries can be completed per minute, so nearly 2,600 entries would take several hours at the very least.

A well-formatted file like this is a good fit for Python, so let's bring Python in. The necessary information is given as comments in the code.
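The manual estimate can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `manual_effort` is made up for illustration, and the figures (2,600 entries, 10 seconds each) are the approximate ones given above.

```python
def manual_effort(entries, seconds_per_entry=10):
    """Return (minutes, hours) needed to process all entries by hand."""
    total_seconds = entries * seconds_per_entry
    minutes = total_seconds / 60
    return minutes, minutes / 60


minutes, hours = manual_effort(2600)
print(f"{minutes:.0f} minutes, about {hours:.1f} hours")
```

At 6 entries per minute, the whole document would take roughly a working day of uninterrupted copying.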

First, read the Word file with Python:

    # Import the required library, docx
    from docx import Document

    # Specify the path where the file is stored
    path = r'C:\Users\word.docx'

    # Read the file
    document = Document(path)

    # Read all the tables in the Word document
    tables = document.tables

Then break the problem down step by step. First, try to get the three required pieces of information for the first file entry in the first table.

    # Get the first table
    table0 = tables[0]

If you look closely, you can find that a file entry occupies three rows, so you can set the step size to 3 when iterating through all the rows of the table.

Look at the table layout and pick out the required content by row and cell:

    # Keep a global counter for the sequence number
    n = 0
    for i in range(0, len(table0.rows), 3):
        # date
        date = table0.cell(i, 1).text
        # title
        title = table0.cell(i + 1, 1).text.strip()
        # document number
        dfn = table0.cell(i, 3).text.strip()
        n += 1
        print(n, date, title, dfn)
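The three-rows-per-entry stepping can be tried out without a Word file. The sketch below is an illustration under stated assumptions: a plain list of lists stands in for the table, the helper name `iter_entries` and the sample cell contents are hypothetical, and the cell positions (date at row i, column 1; title at row i + 1, column 1; document number at row i, column 3) follow the layout described above.

```python
def iter_entries(rows, rows_per_entry=3):
    """Yield (date, title, number) for each complete group of rows.

    `rows` is a plain list of rows, each row a list of cell strings,
    standing in for the cells of a Word table (an assumption of this sketch).
    """
    # Stop before any incomplete trailing group so every row we index exists.
    last_start = len(rows) - rows_per_entry + 1
    for i in range(0, last_start, rows_per_entry):
        date = rows[i][1]
        title = rows[i + 1][1].strip()
        number = rows[i][3].strip()
        yield date, title, number


# Hypothetical sample data: two complete entries and one cut-off entry.
sample_rows = [
    ["Date", "2/1", "Document number", " No. 1 "],
    ["Title", " First notice ", "", ""],
    ["Sign-off", "", "", ""],
    ["Date", "15/3", "Document number", "No. 2"],
    ["Title", "Second notice", "", ""],
    ["Sign-off", "", "", ""],
    ["Date", "16/3", "Document number", "No. 3"],  # incomplete trailing entry
]

for n, entry in enumerate(iter_entries(sample_rows), start=1):
    print(n, *entry)
```

Bounding the range by the number of complete groups means a table whose row count is not a multiple of three is skipped at the tail rather than raising an `IndexError`.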

The next problem is that the dates we get are in day/month form, such as 2/1. We need to convert them to YYYY-MM-DD format, which is done with the strptime and strftime functions of the datetime package:

strptime: parses the time contained in a string

strftime: formats a time as a string in the required format

    import datetime

    n = 0
    for i in range(0, len(table0.rows), 3):
        # date
        date = table0.cell(i, 1).text
        # Some entries are empty; we do not check them too strictly here
        if '/' in date:
            date = datetime.datetime.strptime(date, '%d/%m').strftime('2020-%m-%d')
        else:
            date = '-'
        # title
        title = table0.cell(i + 1, 1).text.strip()
        # document number
        dfn = table0.cell(i, 3).text.strip()
        n += 1
        print(n, date, title, dfn)
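The date conversion can be pulled out into a small function and tested on its own. This is a minimal sketch, not the article's code: the name `normalize_date` is hypothetical, and two choices are assumptions of this sketch, namely appending the year before parsing (so that 29/2 is accepted for the leap year 2020) and returning '-' for values that cannot be parsed.

```python
from datetime import datetime


def normalize_date(text, year=2020):
    """Convert a 'day/month' string to 'YYYY-MM-DD'; return '-' otherwise."""
    text = text.strip()
    if "/" not in text:
        # Empty or non-date cell: use the same placeholder as the loop above.
        return "-"
    try:
        # Parse with the year attached so leap days are validated correctly.
        parsed = datetime.strptime(f"{text}/{year}", "%d/%m/%Y")
    except ValueError:
        # Assumption of this sketch: unparseable dates also become '-'.
        return "-"
    return parsed.strftime("%Y-%m-%d")


for raw in ["2/1", " 15/03 ", "", "29/2", "31/2"]:
    print(repr(raw), "->", normalize_date(raw))
```

Calling `normalize_date(table0.cell(i, 1).text)` inside the loop would then replace the inline `if`/`else` block.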

That completes the parsing of one table. Note that only tables[0], the first table, is used here; to traverse all the tables, add a nested loop. You can also catch exceptions to make the program more robust.

    n = 0
    for j in range(len(tables)):
        for i in range(0, len(tables[j].rows), 3):
            try:
                # date
                date = tables[j].cell(i, 1).text
                if '/' in date:
                    date = datetime.datetime.strptime(date, '%d/%m').strftime('2020-%m-%d')
                else:
                    date = '-'
                # title
                title = tables[j].cell(i + 1, 1).text.strip()
                # document number
                dfn = tables[j].cell(i, 3).text.strip()
                n += 1
                print(n, date, title, dfn)
            except Exception as error:
                # Catch the exception; you could also write it to a log for easier review
                print(error)
                continue
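The comment in the exception handler suggests writing failures to a log instead of printing them. One possible shape for that, using the standard `logging` module, is sketched below. The function `parse_entries`, the logger name, and the sample tuples are hypothetical stand-ins for the table-reading loop, not part of the original code.

```python
import logging

# To write to a file instead of the console, one could configure, for example:
# logging.basicConfig(filename="parse.log", level=logging.WARNING)
logger = logging.getLogger("word_to_excel")


def parse_entries(raw_entries):
    """Clean (date, title, number) tuples, logging and skipping bad ones."""
    parsed = []
    failures = 0
    for index, raw in enumerate(raw_entries):
        try:
            date, title, number = raw
            parsed.append((date.strip(), title.strip(), number.strip()))
        except (ValueError, AttributeError) as error:
            # Record which entry failed and why, then carry on with the rest.
            failures += 1
            logger.warning("skipping entry %d: %s", index, error)
    return parsed, failures


entries, failed = parse_entries([
    (" 2/1", "First notice ", "No. 1"),
    ("only one field",),          # wrong number of fields
    (None, "Untitled", "No. 3"),  # missing date cell
])
print(entries, failed)
```

Catching specific exception types rather than a bare `Exception` keeps genuine programming errors visible, while the log records the index of each skipped entry for later review.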

After parsing and obtaining the information, you can export it. The package used is openpyxl.

    from openpyxl import Workbook

    # Instantiate a workbook
    wb = Workbook()
    # Get the current sheet
    sheet = wb.active
    # Set the header
    header = ['serial number', 'time of receipt', 'document number', 'document title', 'document number', 'remarks']
    sheet.append(header)

Add the following code to the end of the innermost parsing loop:

    row = [n, date, '', title, dfn, '']
    sheet.append(row)

Finally, remember to save the workbook:

    wb.save(r'C:\Users\20200420.xlsx')

The run takes about 10 minutes, so you can step away for a while and the program will be finished when you return.

Finally, the complete code is attached. The code itself is very simple; what matters most is getting the approach clear.

    from docx import Document
    import datetime
    from openpyxl import Workbook

    wb = Workbook()
    sheet = wb.active
    header = ['serial number', 'time of receipt', 'document number', 'document title', 'document number', 'remarks']
    sheet.append(header)

    path = r'C:\Users\word.docx'
    document = Document(path)
    tables = document.tables

    n = 0
    for j in range(len(tables)):
        for i in range(0, len(tables[j].rows), 3):
            try:
                # date
                date = tables[j].cell(i, 1).text
                if '/' in date:
                    date = datetime.datetime.strptime(date, '%d/%m').strftime('2020-%m-%d')
                else:
                    date = '-'
                # title
                title = tables[j].cell(i + 1, 1).text.strip()
                # document number
                dfn = tables[j].cell(i, 3).text.strip()
                n += 1
                print(n, date, title, dfn)
                row = [n, date, '', title, dfn, '']
                sheet.append(row)
            except Exception as error:
                # Catch the exception; you could also write it to a log for easier review
                print(error)
                continue

    wb.save(r'C:\Users\20200420.xlsx')

By now you should have a clearer understanding of how to use Python for office automation. The best way to reinforce it is to try it out in practice.
