Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to operate Excel, Word and CSV with Python

2025-04-02 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Share

Shulou(Shulou.com)06/02 Report--

This article mainly explains "how to use Python to operate Excel, Word, CSV". The content in the article is simple and clear, and it is easy to learn and understand. Please follow the editor's train of thought to study and learn "how to use Python to operate Excel, Word, CSV".

Python operation Excel

Common tools

Data processing is a major application scenario of Python, and Excel is the most popular data processing software at present. Therefore, when using Python for data processing, it is easy to deal with Excel. Thanks to the hard work of our predecessors, Python has many ready-made wheels for dealing with Excel, such as xlrd & xlwt & xlutils, XlsxWriter, OpenPyXL, and the open interface of Microsoft Excel can be called directly on the Windows platform. These are common tools, and some other excellent tools will not be introduced here. Next, we will show the characteristics of each tool through a table:

Type xlrd&xlwt&xlutilsXlsxWriterOpenPyXLExcel open interface read support write support modify whether support xls support do not support high version of xlsx support large files do not support efficiency fast fast super slow function is weak and generally super powerful

The above can choose the appropriate tools according to different needs. Now we mainly introduce the use of the most commonly used xlrd & xlwt & xlutils series tools.

Introduction to xlrd & xlwt & xlutils

Xlrd&xlwt&xlutils consists of the following three libraries:

Xlrd: for reading Excel files

Xlwt: for writing to Excel files

Xlutils: utilities for manipulating Excel files, such as copying, splitting, filtering, etc.

Installation library

The installation is relatively simple, you can install three libraries directly with the pip tool. The installation commands are as follows:

$pip install xlrd xlwt xlutils

Write to Excel

Next, let's start by writing Excel, and let's just look at the code as follows:

# Import xlwt library import xlwt # create xls file object wb = xlwt.Workbook () # add two new form pages sh2 = wb.add_sheet ('grade') sh3 = wb.add_sheet ('summary') # then add data by location, the first parameter is line The second parameter is column # to write the first sheet sh2.write (0,0, 'name') sh2.write (0,1, 'professional') sh2.write (0,2, 'subjects') sh2.write (0,3, 'grades') sh2.write (1,0, 'Zhang San') sh2.write (1,1, 'Information and Communications Engineering') sh2.write (1,2 Sh2.write (1,3,88) sh2.write (2,0, Li Si) sh2.write (2,1, 'Internet of things Engineering') sh2.write (2,2, 'Digital signal processing Analysis') sh2.write (2,3,95) sh2.write (3,0, 'Wang Hua') sh2.write (3,1, 'Electronics and Communications Engineering') sh2.write (3,2 'fuzzy mathematics') sh2.write (3,3,90) # write the second sheet sh3.write (0,0, 'total score') sh3.write (1,0273) # the last save file can be wb.save ('test.xls')

Run the code, and you will see that an Excel file named test.xls is generated. Open the file and see as shown in the following figure:

The above is the code to write Excel, is it very simple, let's take a look at how to read Excel.

Read Excel

It is not difficult to read Excel. Please see the following code:

# Import xlrd library import xlrd # Open the test_w.xls file wb = xlrd.open_workbook ("test_w.xls") we just wrote # get and print the number of sheet print ("sheet quantity:", wb.nsheets) # get and print the sheet name print ("sheet name:" Wb.sheet_names () # get the content according to the sheet index sh2 = wb.sheet_by_index (0) # or # you can also get the content according to the sheet name # sh = wb.sheet_by_name ('grades') # get and print the sheet rows and columns print (u "sheet% s% d rows% d columns"% (sh2.name, sh2.nrows) Sh2.ncols) # get and print the value of a cell print ("the value of the first row and the second column is:", sh2.cell_value (0,1)) # get the value of the whole row or column rows = sh2.row_values (0) # get the first row content cols = sh2.col_values (1) # get the second column content # print the row column value print ("the value of the first row is:" Rows) print ("the value of the second column is:", cols) # gets the data type of the cell content print ("the value type of the second row and the first column is:", sh2.cell (1,0) .ctype) # iterates through all the form contents for sh in wb.sheets (): for rin range (sh.nrows): # outputs the specified line print (sh.row (r))

Output the following results:

A careful friend may notice that here we can get the type of the cell. When we read the type above, we get the number 1. What type does that 1 represent and what type does it have? Don't worry, let's show it through a table:

Numeric type description 0empty empty 1string string 2number numeric 3date date 4boolean Boolean value 5error error

From the table above, we can see that the number 1 returned by the cell type just got corresponds to the string type.

Modify excel

The above said to write and read Excel content, and then we will talk about how to update and modify the Excel, which requires the methods in the xlutils. Go straight to the code and take a look at the simplest modification:

# Import the corresponding module import xlrd from xlutils.copy import copy # Open the excel file readbook = xlrd.open_workbook ("test_w.xls") # copy a copy of wb = copy (readbook) # Select the first form sh2 = wb.get_sheet (0) # add write data sh2.write (4,0, 'Wang Huan') sh2.write (4, 1, 'Communication Engineering') sh2.write (4, 2) in the fifth line Machine Learning) sh2.write (4,3,89) # Select the second form sh2 = wb.get_sheet (1) # replace the total score data sh2.write (1,0,362) # Save wb.save ('test.xls')

As can be seen from the above code, the modified Excel here is to make a copy of the original Excel through the copy method of the xlutils library, and then modify it, and then save it. Take a look at the modification results as follows:

Format conversion operation

In normal times, when we use Excel, we will format or style the data. Here, simply modify the code described above to change the format of the output slightly. The code is as follows:

# Import xlwt library import xlwt # set write format red bold styleBR = xlwt.easyxf ('font: name Times New Roman, color-index red, bold on') # set numeric format to decimal point and retain two digits styleNum = xlwt.easyxf (num_format_str='# # # 0.00') # set the date format to display as YYYY-MM-DD styleDate = xlwt.easyxf (num_format_str='YYYY-MM-DD') # create xls file object wb = xlwt.Workbook () # add two new form pages sh2 = wb.add_sheet ('grade') sh3 = wb.add_sheet ('summary') # then add the data by location, the first parameter is the line The second parameter is the column sh2.write (0,0, 'name', styleBR) # sets the header font to red bold sh2.write (0,1, 'date', styleBR) # sets the header font to red bold sh2.write (0,2, 'grades', styleBR) # sets the header font to red and bold # inserts data sh2.write (1, 0, 'Zhang San') ) sh2.write (1,1, '2020-07-01, styleDate) sh2.write (1,2,90, styleNum) sh2.write (2,0,' Li Si') sh2.write (2,1, '2020-08-02') sh2.write (2,2,95) StyleNum) # set the format for centered cell contents alignment = xlwt.Alignment () alignment.horz = xlwt.Alignment.HORZ_CENTER style = xlwt.XFStyle () style.alignment = alignment # merge A4Magic B4 cells And set the content to center sh2.write_merge (3,3,0,1, 'total score', style) # through the formula, calculate the sum of C2+C3 cells and sh2.write (3,2, xlwt.Formula ("C2+C3")) # write data sh3.write (0,0, 'total score', styleBR) sh3.write (1,0185) # and finally save the file to wb.save ('test.xls').

Output result:

As you can see, using the code, we can set the font, color, alignment, merge and other normal Excel operations, and we can also format date and numeric data. Of course, only part of the functions are introduced here, but this is enough for our daily use. If you want to know more about the function operation, please refer to the official website.

Python-excel official website: www.python-excel.org/

Python operation Word

Install python-docx

You need to use the python-docx library to process Word. The current version is 0.8.10. Execute the following installation command:

$pip install python-docx # run result # C:\ Users\ Y > pip install python-docx Looking in indexes: https://pypi.doubanio.com/simple Collecting python-docx Downloading https://pypi.doubanio.com/packages/e4/83/c66a1934ed5ed8ab1dbb9931f1779079f8bca0f6bbc5793c06c4b5e7d671/python-docx-0.8.10.tar.gz (5.5MB) | ██ ██ | 5.5MB 3.2MB/s Requirement already satisfied: lxml > = 2.3.2 in c:\ users\ y\ appdata\ local\ programs\ python\ python37\ lib\ site-packages (from python-docx). Building wheels for collected packages: python-docx Building wheel for python-docx (setup.py). Done Created wheel for python-docx: filename=python_docx-0.8.10-cp37-none-any.whl size=184496 sha256=7ac76d3eec848a255b4f197d07e7b78ab33598c814d536d9b3c90b5a3e2a57fb Stored in directory: C:\ Users\ Y\ AppData\ Local\ pip\ Cache\ wheels\ 05\ 7d\ 71\ bb534b75918095724d0342119154c3d0fc035cedfe2f6c9a6c Successfully built python-docx Installing collected packages: python-docx Successfully installed python-docx-0.8.10 copy code

OK, if you are prompted for the above information, the installation is successful.

Write to Word

Usually when we operate Word to write documents, we are generally divided into several parts: headings, chapters, paragraphs, pictures, tables, references and bullet numbers. Let's follow these sections to introduce how to use Python operation.

Title

Creating a document title is relatively simple. Create a blank document through Document (), and you can create a title simply by calling the add_heading method.

# Import library from docx import Document from docx.shared import Pt from docx.shared import Inches from docx.oxml.ns import qn # New blank document doc1 = Document () # add document title doc1.add_heading ('how to create and manipulate Word',0 using Python) # Save file doc1.save (' word1.docx')

This completes the creation of the document and article title. Run the program below, and a document named word1.docx will be generated. The article will be opened and displayed as shown in the following figure:

Chapters and paragraphs

With the title of the article, let's take a look at how chapters and paragraphs work. The code to add chapters and paragraphs to the above code is as follows:

# Import library from docx import Document from docx.shared import Pt from docx.shared import Inches from docx.oxml.ns import qn # New blank document doc1 = Document () # add document title doc1.add_heading ('how to create and manipulate Word',0 using Python) # create paragraph description doc1.add_paragraph (' Word documents are often used in our life and work nowadays We usually use wps or office to deal with Word. We may not have thought that it can be generated by Python. Here's how to do it. # create a first-level title doc1.add_heading ('install the python-docx library', 1) # create a paragraph description doc1.add_paragraph ('now let's show you how to install the python-docx library You need to do the following two steps:') # create a secondary title doc1.add_heading ('step 1: install Python',2) # create a paragraph description doc1.add_paragraph (' download the python installation package on the python official website for installation.') # create a three-level heading doc1.add_heading ('step 2: install the python-docx library', 3) # create a paragraph description doc1.add_paragraph ('win+R type CMD under window to open the command line, type pip install python-docx to download.') # Save the file doc1.save ('word2.docx')

Above we said that the add_heading method is used to increase the title of the article, but from the above code we can know that the second parameter of this method is the number, in fact, this is used to indicate the level of title, in our usual use to mark the chapter. The add_paragraph method is used to add paragraphs to the article. Run the program to see the effect:

Fonts and references

Earlier we added three paragraphs through the add_paragraph method. Now let's take a look at how to manipulate the fonts in the paragraph and how to reference the paragraph. Continue to modify the above code to add font size, bold, tilt and other operations for the article, as follows:

# Import library from docx import Document from docx.shared import Pt from docx.shared import Inches from docx.oxml.ns import qn from docx.shared import RGBColor # New blank document doc1 = Document () # add document title doc1.add_heading ('how to create and manipulate Word',0 using Python) # create paragraph description doc1.add_paragraph (' Word documents are often used in our life and work nowadays We usually use wps or office to deal with Word. We may not have thought that it can be generated by Python. Here's how to do it. # create a first-level title doc1.add_heading ('install the python-docx library', 1) # create a paragraph description doc1.add_paragraph ('now let's show you how to install the python-docx library You need to do the following two steps:') # create a secondary title doc1.add_heading ('step 1: install Python',2) # create a paragraph description doc1.add_paragraph (' download the python installation package on the python official website for installation.') # create a three-level heading doc1.add_heading ('step 2: install the python-docx library', 3) # create a paragraph description doc1.add_paragraph ('win+R type CMD under window to open the command line, type pip install python-docx to download.') # create a paragraph and add the document content paragraph = doc1.add_paragraph ('this is the installation description for step 2!') # add text to the paragraph And set the font size run = paragraph.add_run ('(note: the font size is set to 20)') run.font.size = Pt (20) # set the English font run = doc1.add_paragraph ('set the English font here:). Add_run (' This Font is Times New Roman') run.font.name = 'Times New Roman' # set the Chinese font run = doc1.add_paragraph (' this Set the Chinese font in:'). Add_run ('current font is boldface') run.font.name=' boldface'r = run._element r.rPr.rFonts.set (qn ('wrun._element r.rPr.rFonts.set eastAsia') 'boldface') # set italics run = doc1.add_paragraph ('this setting:'). Add_run ('text is italic') run.italic = True # set bold run = doc1.add_paragraph ('this paragraph is set again:'). Add_run ('bold here'). Bold = True # set font with underscore run = doc1.add_paragraph ('this paragraph is underlined Line:'). Add_run ('underlined here'). Underline = True # sets the font color run = doc1.add_paragraph ('this font is red:'). Add_run ('here set the font to red') run.font.color.rgb = RGBColor (0xFF 0x00, 0x00) # add a quote to doc1.add_paragraph ('here is a quote from us: change your life with Python Change the world, FIGHTING. , style='Intense Quote') # Save file doc1.save ('word2.docx')

The above code is mainly for a variety of paragraph font settings, each code is marked with comments should be easy to understand, run the program to see the effect:

Project list

When we use Word, in order to show it more clearly, bullets and numbers are used to display the content in the form of a list. Let's create a new file word1.py and write the following code:

# Import library from docx import Document from docx.shared import Pt from docx.shared import Inches from docx.oxml.ns import qn # New document doc2 = Document () doc2.add_paragraph ('which is not an animal:') # add unordered list doc2.add_paragraph ('Apple', style='List Bullet') doc2.add_paragraph ('jubilant', style='List Bullet') doc2.add_paragraph ('lazy' Style='List Bullet') doc2.add_paragraph ('boiling', style='List Bullet') doc2.add_paragraph ('Grey Wolf', style='List Bullet') doc2.add_paragraph ('2020Plan:') # add ordered list doc2.add_paragraph ('CSDN reaches blogger', style='List Number') doc2.add_paragraph ('fitness three days a week' Style='List Number') doc2.add_paragraph ('learn a new knowledge point every day', style='List Number') doc2.add_paragraph ('learn 50 books', style='List Number') doc2.add_paragraph ('reduce overtime', style='List Number') # Save file doc2.save ('word1.docx')

Pictures and tables

We usually edit articles, insert pictures and tables are also often used, then how to use Python to insert pictures and tables? First of all, we randomly find a picture, I use the logo logo of Python, the file name is python-logo.png, use add_picture to add pictures; use add_table to add tables, and then add the following code to the word1.py file:

# Import library from docx import Document from docx.shared import Pt from docx.shared import Inches from docx.oxml.ns import qn # New document doc2 = Document () doc2.add_paragraph ('which is not an animal:') # add unordered list doc2.add_paragraph ('Apple', style='List Bullet') doc2.add_paragraph ('jubilant', style='List Bullet') doc2.add_paragraph ('lazy' Style='List Bullet') doc2.add_paragraph ('boiling', style='List Bullet') doc2.add_paragraph ('Grey Wolf', style='List Bullet') doc2.add_paragraph ('2020Plan:') # add ordered list doc2.add_paragraph ('CSDN reaches blogger', style='List Number') doc2.add_paragraph ('fitness three days a week' Style='List Number') doc2.add_paragraph ('learn a new knowledge point every day', style='List Number') doc2.add_paragraph ('learn 50 books', style='List Number') doc2.add_paragraph ('reduce overtime', style='List Number') doc2.add_heading ('pictures' 2) # add the image doc2.add_picture ('CGROGRAMUSUserGUSGUSERGUSERGUSERVER', width=Inches Pictures)) doc2.add_heading ('form', 2) # add the table This is the table header table = doc2.add_table (rows=1, cols=4) hdr_cells = table.rows [0] .cells hdr_cells [0] .text = 'number' hdr_cells [1] .text = 'name' hdr_cells [2] .text = 'occupation' # this is the table data records = ((1, 'Zhang San', 'electrician'), (2, 'Zhang Wu', 'boss'), (3) 'Ma Liu', 'IT'), (4,' Li Si', 'engineer') # iterate through the data and show for id, name Work in records: row_cells = table.add_row (). Cells row_cells [0] .text = str (id) row_cells [1] .text = name row_cells [2] .text = work # manually add paging doc2.add_page_break () # Save file doc2.save ('word1.docx')

Read Word file

It is written that many blank Word files are created with Python to format fonts and saved to the file. Next, let's briefly introduce how to read existing Word files. Please see the following code:

# Import library from docx import Document # Open document 1 doc1 = Document ('word1.docx') # read each segment pl = [paragraph.text for paragraph in doc1.paragraphs] print (' # output the content of word1 article #') # output the content read for i in pl: print (I) # Open document 2 doc2 = Document ('word2.docx' ) print ('\ n # output word2 article content #') pl2 = [paragraph.text for paragraph in doc2.paragraphs] # output read content for j in pl2: print (j) # read form material And output the result tables = [table for table in doc2.tables] for table in tables: for row in table.rows: for cell in row.cells: print (cell.text,end='') print () print ('\ n')

The above code reads out the contents of the two documents we output before. of course, it is just printed to the console and does not do any other processing. Now let's take a look at the results:

Python operation CSV

Brief introduction

CSV

CSV full name Comma-Separated Values, Chinese called comma-separated value or character-separated value, it stores table data (numbers and text) in plain text form, its essence is a character sequence, which can be composed of any number of records, records are separated by some kind of newline character, each record is composed of fields, usually all records have exactly the same field sequence, and commas or tabs are often used to separate fields. CSV file format is simple, universal, and has a wide range of applications in reality, in which the most commonly used is to transfer table data between programs.

CSV and Excel

Because CSV files and Excel files are opened with the Excel tool by default, what's the difference between them? Let's take a brief look at the table below.

CSVExcel file suffix is .csv file suffix is .xls or .xlsx plain text file binary file storage data does not contain formats, formulas, not only can store data, but also can operate on the data can be opened through the Excel tool Can also be opened through a text editor can only be opened through the Excel tool can only write column headings once each column in each row has a start tag and an end tag to import data when less memory is consumed, more memory is consumed

Basic use

Python uses the csv module to read and write data in CSV format files. This module provides the function of outputting and reading data files compatible with Excel, so that we do not need to know the details of the CSV format adopted by Excel. Similarly, it can also define CSV formats that are available or specific requirements of other applications.

The csv module uses the reader class and writer class to read and write serialized data, and the DictReader class and DictWriter class to read and write data in the form of a dictionary. Let's take a closer look at the corresponding functions. First, let's take a look at the constant information of the csv module, as follows:

Property description QUOTE_ALL instructs the writer object to quote all fields QUOTE_MINIMAL indicates that the writer object is only for fields that contain special characters (such as delimiters, quotation mark characters, line Terminator, etc.) QUOTE_NONNUMERIC indicates that the writer object quotes all non-numeric fields QUOTE_NONE indicates that the writer object does not use quotation marks to enclose fields

Writer (csvfile, dialect='excel', * * fmtparams)

Returns a writer object responsible for converting the user's data to a delimited string on a given file class object.

Csvfile can be any object with the write () method, and if csvfile is a file object, open it using newline=''

The optional parameter dialect is a specific set of parameters for different CSV variants

The optional keyword parameter fmtparams overrides a single formatting in the current variant format.

Look at the following example:

Import csv with open ('test.csv', 'walled, newline='') as csvfile: writer = csv.writer (csvfile) writer.writerow ([' id', 'name',' age']) # write multiple lines data = [('1001', 'Zhang San','21'), ('1002','Li Si','31')] writer.writerows (data)

Reader (csvfile, dialect='excel', * * fmtparams)

Returns a reader object that will traverse the csvfile,csvfile line by line. It can be a file object and a list object, and if it is a file object, it will be opened using newline=''. Look at the following example:

Import csv with open ('test.csv', newline='') as csvfile: reader = csv.reader (csvfile, delimiter='') for row in reader: print (', '.join (row))

Sniffer class

Used to infer the format of CSV files, this class provides the following two methods:

Sniff (sample, delimiters=None)

Parsing a given sample, if an optional delimiters parameter is given, is interpreted as a string that contains possible valid delimiters.

Has_header (sample)

Parses the sample text (assumed to be in CSV format) and returns True if the first row is likely to be a series of column headings.

The use of this class and methods is less, you can understand it. Let's take a brief look at it through an example.

Import csv with open ('test.csv', newline='') as csvfile: dialect = csv.Sniffer () .sniff (csvfile.read (1024)) csvfile.seek (0) reader = csv.reader (csvfile, dialect) for row in reader: print (row)

Reader object

The Reader object refers to the object returned by the DictReader instance and the reader () function. Let's take a look at its exposed properties and methods.

Next ()

Returns the next row of the iterable object of reader, and the return value may be a list or dictionary.

Dialect

Dialect description, read-only, for use by parsers.

Line_num

The number of rows that the source iterator has read.

Fieldnames

The name of the field, which is a DictReader object property.

Writer object

The Writer object refers to the object returned by the DictWriter instance and the writer () function. Let's take a look at its exposed properties and methods.

Writerow (row)

Writes the parameter row to the file object of writer.

Writerows (rows)

Writes all the elements in rows_ (that is, an iterator that can iterate over multiple of the above _ row objects) to the file object of writer.

Writeheader ()

In the file object of writer, write a line of field names, which is the DictWriter object method.

Dialect

Dialect description, read-only, for writer use.

Write-read append status

'Renewable: reading: writing: writing: append 'writeable' = = rroomw (readable and writable, error reported if the file does not exist (IOError)) 'Wordr (readable and writable, create if the file does not exist)' awritable'= = awrr (append writable, create if the file does not exist), if it is a binary file Just add a b: 'rb'' wb' 'ab'' rb+' 'wb+'' ab+ thank you for reading, the above is "how to use Python to operate Excel, Word, CSV" content, after the study of this article, I believe you on how to use Python to operate Excel, Word, CSV this problem has a deeper understanding, the specific use of the situation also needs to be verified in practice. Here is, the editor will push for you more related knowledge points of the article, welcome to follow!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Development

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report