In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-28 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >
Share
Shulou(Shulou.com)06/01 Report--
This article introduces the relevant knowledge of "how to operate Excel,Word,CSV with Python". In the operation of actual cases, many people will encounter such a dilemma, so let the editor lead you to learn how to deal with these situations. I hope you can read it carefully and be able to achieve something!
Common tools for operating Excel with Python
Data processing is a major application scenario of Python, and Excel is the most popular data processing software at present. Therefore, when using Python for data processing, it is easy to deal with Excel. Thanks to the hard work of our predecessors, Python has many ready-made wheels for dealing with Excel, such as xlrd & xlwt & xlutils, XlsxWriter, OpenPyXL, and the open interface of Microsoft Excel can be called directly on the Windows platform. These are common tools, and some other excellent tools will not be introduced here. Next, we will show the characteristics of each tool through a table:
Type xlrd&xlwt&xlutilsXlsxWriterOpenPyXLExcel open interface read support write support modify whether support xls support do not support high version of xlsx support large files do not support efficiency fast fast super slow function is weak and generally super powerful
The above can choose the appropriate tools according to different needs. Now we mainly introduce the use of the most commonly used xlrd & xlwt & xlutils series tools.
Introduction to xlrd & xlwt & xlutils
Xlrd&xlwt&xlutils consists of the following three libraries:
Xlrd: for reading Excel files
Xlwt: for writing to Excel files
Xlutils: utilities for manipulating Excel files, such as copying, splitting, filtering, etc.
Installation library
The installation is relatively simple, you can install three libraries directly with the pip tool. The installation commands are as follows:
$pip install xlrd xlwt xlutils
Write to Excel
Next, let's start by writing Excel, and let's just look at the code as follows:
# Import xlwt library import xlwt# to create xls file object wb = xlwt.Workbook () # add two new form pages sh2 = wb.add_sheet ('grade') sh3 = wb.add_sheet ('summary') # then add data by location. The first parameter is line The second parameter is column # to write the first sheetsh2.write (0,0, 'name') sh2.write (0,1, 'professional') sh2.write (0,2, 'subjects') sh2.write (0,3, 'grades') sh2.write (1,0, 'Zhang San') sh2.write (1,1, 'information and communication engineering') sh2.write (1,2, 'numerical analysis') sh2.write (1,3 88) sh2.write (2,0,'Li Si') sh2.write (2,1, 'Internet of things Engineering') sh2.write (2,2, 'Digital signal processing Analysis') sh2.write (2,3,95) sh2.write (3,0, 'Wang Hua') sh2.write (3,1, 'Electronics and Communication Engineering') sh2.write (3,2, 'Fuzzy Mathematics') sh2.write (3,3 90) # write the second sheetsh3.write (0,0, 'total score') sh3.write (1,0273) # finally save the file to wb.save ('test.xls')
Run the code, and you will see that an Excel file named test.xls is generated. Open the file and see as shown in the following figure:
The above is the code to write Excel, is it very simple, let's take a look at how to read Excel.
Read Excel
It is not difficult to read Excel. Please see the following code:
# Import xlrd library import xlrd#, open the test_w.xls file wb = xlrd.open_workbook ("test_w.xls") we just wrote # get and print the number of sheet print ("sheet quantity:", wb.nsheets) # get and print the sheet name print ("sheet name:" Wb.sheet_names () # get the content according to the sheet index sh2 = wb.sheet_by_index (0) # or # you can also get the content according to the sheet name # sh = wb.sheet_by_name ('grades') # get and print the sheet rows and columns print (u "sheet% s% d rows% d columns"% (sh2.name, sh2.nrows) Sh2.ncols) # get and print the value of a cell print ("the value of the first row and the second column is:", sh2.cell_value (0,1)) # get the value of the whole row or column rows = sh2.row_values (0) # get the first row content cols = sh2.col_values (1) # get the second column content # print the row column value print ("the value of the first row is:" Rows) print ("the value of the second column is:", cols) # gets the data type of the cell content print ("the value type of the second row and the first column is:", sh2.cell (1,0) .ctype) # iterates through all the form contents for sh in wb.sheets (): for rin range (sh.nrows): # outputs the specified line print (sh.row (r))
Output the following results:
A careful friend may notice that here we can get the type of the cell. When we read the type above, we get the number 1. What type does that 1 represent and what type does it have? Don't worry, let's show it through a table:
Numeric type description 0empty empty 1string string 2number numeric 3date date 4boolean Boolean value 5error error
From the table above, we can see that the number 1 returned by the cell type just got corresponds to the string type.
Modify excel
The above said to write and read Excel content, and then we will talk about how to update and modify the Excel, which requires the methods in the xlutils. Go straight to the code and take a look at the simplest modification:
# Import the corresponding module import xlrdfrom xlutils.copy import copy# and open the excel file readbook = xlrd.open_workbook ("test_w.xls") # copy a copy of wb = copy (readbook) # Select the first form sh2 = wb.get_sheet (0) # add write data sh2.write (4,0, 'Wang Huan') sh2.write (4, 1, 'Communication Engineering') sh2.write (4, 2) in the fifth line Machine Learning) sh2.write (4,3,89) # Select the second form sh2 = wb.get_sheet (1) # replace the total score data sh2.write (1,0,362) # Save wb.save ('test.xls')
As can be seen from the above code, the modified Excel here is to make a copy of the original Excel through the copy method of the xlutils library, and then modify it, and then save it. Take a look at the modification results as follows:
Format conversion operation
In normal times, when we use Excel, we will format or style the data. Here, simply modify the code described above to change the format of the output slightly. The code is as follows:
# Import xlwt library import xlwt# set write format red bold styleBR = xlwt.easyxf ('font: name Times New Roman, color-index red, bold on') # set numeric format to decimal point and retain two digits styleNum = xlwt.easyxf (num_format_str='# # # 0.00') # set date format display as YYYY-MM-DDstyleDate = xlwt.easyxf (num_format_str='YYYY-MM-DD') # create xls file object wb = xlwt.Workbook () # add two new form pages sh2 = wb.add_sheet ('grade') sh3 = wb.add_sheet ('summary') # then add data by location, the first parameter is the line The second parameter is the column sh2.write (0,0, 'name', styleBR) # set the header font to red bold sh2.write (0,1, 'date', styleBR) # set the header font to red bold sh2.write (0,2, 'grade', styleBR) # set the header font to red and bold # insert data sh2.write (1, 0, 'Zhang San',) sh2.write (1, 1 '2020-07-01, styleDate) sh2.write (1, 2, 90, styleNum) sh2.write (2, 0,' Li Si') sh2.write (2, 1, '2020-08-02') sh2.write (2, 2, 95, styleNum) # set the format for centered cell content alignment = xlwt.Alignment () alignment.horz = xlwt.XFStyle () style.alignment = alignment# merge A4PowerB4 cell And set the content to center sh2.write_merge (3,3,0,1, 'total score', style) # through the formula, calculate the sum of C2+C3 cells and sh2.write (3,2, xlwt.Formula ("C2+C3")) # write data sh3.write (0,0, 'total score', styleBR) sh3.write (1,0185) # and finally save the file to wb.save ('test.xls').
Output result:
As you can see, using the code, we can set the font, color, alignment, merge and other normal Excel operations, and we can also format date and numeric data. Of course, only part of the functions are introduced here, but this is enough for our daily use. If you want to know more about the function operation, please refer to the official website.
Python-excel official website: www.python-excel.org/
Python operation Word installation python-docx
You need to use the python-docx library to process Word. The current version is 0.8.10. Execute the following installation command:
$pip install python-docx# run result # C:\ Users\ Y > pip install python-docxLooking in indexes: https://pypi.doubanio.com/simpleCollecting python-docx Downloading https://pypi.doubanio.com/packages/e4/83/c66a1934ed5ed8ab1dbb9931f1779079f8bca0f6bbc5793c06c4b5e7d671/python-docx-0.8.10.tar.gz (5.5MB) | █ ████ | 5.5MB 3.2MB/sRequirement already satisfied: lxml > = 2.3.2 in c:\ users\ y\ appdata\ local\ programs\ python\ python37\ lib\ site-packages (from python-docx). Building wheels for collected packages: python-docx Building wheel for python-docx (setup.py). Done Created wheel for python-docx: filename=python_docx-0.8.10-cp37-none-any.whl size=184496 sha256=7ac76d3eec848a255b4f197d07e7b78ab33598c814d536d9b3c90b5a3e2a57fb Stored in directory: C:\ Users\ Y\ AppData\ Local\ pip\ Cache\ wheels\ 05\ 7d\ 71\ bb534b75918095724d0342119154c3d0fc035cedfe2f6c9a6cSuccessfully built python-docxInstalling collected packages: python-docxSuccessfully installed python-docx-0.8.10
OK, if you are prompted for the above information, the installation is successful.
Write to Word
Usually when we operate Word to write documents, we are generally divided into several parts: headings, chapters, paragraphs, pictures, tables, references and bullet numbers. Let's follow these sections to introduce how to use Python operation.
Title
Creating a document title is relatively simple. Create a blank document through Document (), and you can create a title simply by calling the add_heading method.
# Import library from docx import Documentfrom docx.shared import Ptfrom docx.shared import Inchesfrom docx.oxml.ns import qn# New Blank document doc1 = Document () # add document title doc1.add_heading ('how to create and operate Word',0 using Python) # Save file doc1.save (' word1.docx')
This completes the creation of the document and article title. Run the program below, and a document named word1.docx will be generated. The article will be opened and displayed as shown in the following figure:
Chapters and paragraphs
With the title of the article, let's take a look at how chapters and paragraphs work. The code to add chapters and paragraphs to the above code is as follows:
# Import library from docx import Documentfrom docx.shared import Ptfrom docx.shared import Inchesfrom docx.oxml.ns import qn# new blank document doc1 = Document () # add document title doc1.add_heading ('how to create and manipulate Word',0 using Python) # create paragraph description doc1.add_paragraph (' Word documents are often used in our life and work now, we usually use wps or office to deal with Word It may not have been thought that it can be generated with Python, so let's describe how to do it.') # create a first-level heading doc1.add_heading ('install the python-docx library', 1) # create a paragraph description doc1.add_paragraph ('now let's introduce how to install the python-docx library, which requires the following two steps:') # create a secondary title doc1.add_heading ('step 1: install Python',2) # create a paragraph description doc1.add_paragraph (' download the python installation package on the python official website for installation.') # create a three-level heading doc1.add_heading ('step 2: install the python-docx library', 3) # create a paragraph description doc1.add_paragraph ('win+R type CMD under window to open the command line, type pip install python-docx to download.') # Save the file doc1.save ('word2.docx')
Above we said that the add_heading method is used to increase the title of the article, but from the above code we can know that the second parameter of this method is the number, in fact, this is used to indicate the level of title, in our usual use to mark the chapter. The add_paragraph method is used to add paragraphs to the article. Run the program to see the effect:
Fonts and references
Earlier we added three paragraphs through the add_paragraph method. Now let's take a look at how to manipulate the fonts in the paragraph and how to reference the paragraph. Continue to modify the above code to add font size, bold, tilt and other operations for the article, as follows:
# Import library from docx import Documentfrom docx.shared import Ptfrom docx.shared import Inchesfrom docx.oxml.ns import qnfrom docx.shared import RGBColor# new blank document doc1 = Document () # add document title doc1.add_heading ('how to create and manipulate Word',0 using Python) # create paragraph description doc1.add_paragraph (' Word documents are often used in our life and work now, we usually use wps or office to deal with Word It may not have been thought that it can be generated with Python, so let's describe how to do it.') # create a first-level heading doc1.add_heading ('install the python-docx library', 1) # create a paragraph description doc1.add_paragraph ('now let's introduce how to install the python-docx library, which requires the following two steps:') # create a secondary title doc1.add_heading ('step 1: install Python',2) # create a paragraph description doc1.add_paragraph (' download the python installation package on the python official website for installation.') # create a three-level heading doc1.add_heading ('step 2: install the python-docx library', 3) # create a paragraph description doc1.add_paragraph ('win+R type CMD under window to open the command line, type pip install python-docx to download.') # create a paragraph and add the document content paragraph = doc1.add_paragraph ('this is the installation description for step 2!') # add text to the paragraph And set font size run = paragraph.add_run ('(note: font size is 20)') run.font.size = Pt (20) # set English font run = doc1.add_paragraph ('set English font:'). Add_run ('This Font is Times New Roman') run.font.name =' Times New Roman'# set Chinese font run = doc1.add_paragraph ('set Chinese font here :'). Add_run ('the current font is boldface') run.font.name=' boldface'r = run._elementr.rPr.rFonts.set (qn ('wrun._elementr.rPr.rFonts.set') 'boldface') # set italic run = doc1.add_paragraph ('this setting:'). Add_run ('text is italic') run.italic = True# set bold run = doc1.add_paragraph ('this section is set again:'). Add_run ('bold here'). Bold = True# sets the font with an underscore run = doc1.add_paragraph ('this paragraph is underlined:'). Add_ Run ('underlined here'). Underline = True# set font color run = doc1.add_paragraph ('this font is red:'). Add_run ('set red here') run.font.color.rgb = RGBColor (0xFF 0x00, 0x00) # add a quote to doc1.add_paragraph ('here is a quote from us: change your life with Python Change the world, FIGHTING. , style='Intense Quote') # Save file doc1.save ('word2.docx')
The above code is mainly for a variety of paragraph font settings, each code is marked with comments should be easy to understand, run the program to see the effect:
Project list
When we use Word, in order to show it more clearly, bullets and numbers are used to display the content in the form of a list. Let's create a new file word1.py and write the following code:
# Import library from docx import Documentfrom docx.shared import Ptfrom docx.shared import Inchesfrom docx.oxml.ns import qn# new document doc2 = Document () doc2.add_paragraph ('which is not an animal:') # add unordered list doc2.add_paragraph ('Apple', style='List Bullet') doc2.add_paragraph ('jubilant', style='List Bullet') doc2.add_paragraph ('lazy' Style='List Bullet') doc2.add_paragraph ('boiling', style='List Bullet') doc2.add_paragraph ('Grey Wolf', style='List Bullet') doc2.add_paragraph ('2020Plan:') # add ordered list doc2.add_paragraph ('CSDN reaches blogger', style='List Number') doc2.add_paragraph ('fitness three days a week' Style='List Number') doc2.add_paragraph ('learn a new knowledge point every day', style='List Number') doc2.add_paragraph ('learn 50 books', style='List Number') doc2.add_paragraph ('reduce overtime', style='List Number') # Save file doc2.save ('word1.docx')
Pictures and tables
We usually edit articles, insert pictures and tables are also often used, then how to use Python to insert pictures and tables? First of all, we randomly find a picture, I use the logo logo of Python, the file name is python-logo.png, use add_picture to add pictures; use add_table to add tables, and then add the following code to the word1.py file:
# Import library from docx import Documentfrom docx.shared import Ptfrom docx.shared import Inchesfrom docx.oxml.ns import qn# new document doc2 = Document () doc2.add_paragraph ('which is not an animal:') # add unordered list doc2.add_paragraph ('Apple', style='List Bullet') doc2.add_paragraph ('jubilant', style='List Bullet') doc2.add_paragraph ('lazy' Style='List Bullet') doc2.add_paragraph ('boiling', style='List Bullet') doc2.add_paragraph ('Grey Wolf', style='List Bullet') doc2.add_paragraph ('2020Plan:') # add ordered list doc2.add_paragraph ('CSDN reaches blogger', style='List Number') doc2.add_paragraph ('fitness three days a week' Style='List Number') doc2.add_paragraph ('learn a new knowledge point every day', style='List Number') doc2.add_paragraph ('learn 50 books', style='List Number') doc2.add_paragraph ('reduce overtime', style='List Number') doc2.add_heading ('Pictures', 2) # add image doc2.add_picture ('CreguGar UsersGUserGo. Png' Width=Inches (5.5) doc2.add_heading ('form', 2) # add form This is the header table = doc2.add_table (rows=1, cols=4) hdr_cells = table.rows [0] .cellshdr _ cells [0] .text = 'number' hdr_cells [1] .text = 'name' hdr_cells [2] .text = 'occupation' this is tabular data records = ((1, 'Zhang San', 'electrician'), (2, 'Zhang Wu', 'boss'), (3,'Ma Liu', 'IT') (4,'Li Si', 'engineer') # iterate through the data and show for id, name, work in records: row_cells = table.add_row (). Cells row_cells [0] .text = str (id) row_cells [1] .text = name row_cells [2] .text = work# manually add paging doc2.add_page_break () # Save file doc2.save ('word1.docx')
Read Word file
It is written that many blank Word files are created with Python to format fonts and saved to the file. Next, let's briefly introduce how to read the existing Word files. Please see the following code:
# Import the library from docx import Document# to open the document 1doc1 = Document ('word1.docx') # read each segment pl = [paragraph.text for paragraph in doc1.paragraphs] print (' # output the content of the word1 article #') # output the content read for i in pl: print (I) # Open the document 2doc2 = Document ('word2.docx') print ('\ n # # output word2 article content #') pl2 = [paragraph.text for paragraph in doc2.paragraphs] # output read content for j in pl2: print (j) # read form material And output the result tables = [table for table in doc2.tables] for table in tables: for row in table.rows: for cell in row.cells: print (cell.text,end='') print () print ('\ n')
The above code reads out the contents of the two documents we output before. of course, it is just printed to the console and does not do any other processing. Now let's take a look at the results:
Introduction to Python Operation CSV
CSV
CSV full name Comma-Separated Values, Chinese called comma-separated value or character-separated value, it stores table data (numbers and text) in plain text form, its essence is a character sequence, which can be composed of any number of records, records are separated by some kind of newline character, each record is composed of fields, usually all records have exactly the same field sequence, and commas or tabs are often used to separate fields. CSV file format is simple, universal, and has a wide range of applications in reality, in which the most commonly used is to transfer table data between programs.
CSV and Excel
Because CSV files and Excel files are opened with the Excel tool by default, what's the difference between them? Let's take a brief look at the table below.
Basic use
Python uses the csv module to read and write data in CSV format files. This module provides the function of outputting and reading data files compatible with Excel, so that we do not need to know the details of the CSV format adopted by Excel. Similarly, it can also define CSV formats that are available or specific requirements of other applications.
The csv module uses the reader class and writer class to read and write serialized data, and the DictReader class and DictWriter class to read and write data in the form of a dictionary. Let's take a closer look at the corresponding functions. First, let's take a look at the constant information of the csv module, as follows:
Writer (csvfile, dialect='excel', * * fmtparams)
Returns a writer object responsible for converting the user's data to a delimited string on a given file class object.
Csvfile can be any object with the write () method, and if csvfile is a file object, open it using newline=''
The optional parameter dialect is a specific set of parameters for different CSV variants
The optional keyword parameter fmtparams overrides a single formatting in the current variant format.
Look at the following example:
Import csvwith open ('test.csv', 'walled, newline='') as csvfile: writer = csv.writer (csvfile) writer.writerow ([' id', 'name',' age']) # write multiple lines data = [('1001', 'Zhang San','21'), ('1002','Li Si','31')] writer.writerows (data)
Reader (csvfile, dialect='excel', * * fmtparams)
Returns a reader object that will traverse the csvfile,csvfile line by line. It can be a file object and a list object, and if it is a file object, it will be opened using newline=''. Look at the following example:
Import csvwith open ('test.csv', newline='') as csvfile: reader = csv.reader (csvfile, delimiter='') for row in reader: print (', '.join (row))
Sniffer class
Used to infer the format of CSV files, this class provides the following two methods:
Sniff (sample, delimiters=None)
Parsing a given sample, if an optional delimiters parameter is given, is interpreted as a string that contains possible valid delimiters.
Has_header (sample)
Parses the sample text (assumed to be in CSV format) and returns True if the first row is likely to be a series of column headings.
The use of this class and methods is less, you can understand it. Let's take a brief look at it through an example.
Import csvwith open ('test.csv', newline='') as csvfile: dialect = csv.Sniffer () .sniff (csvfile.read (1024)) csvfile.seek (0) reader = csv.reader (csvfile, dialect) for row in reader: print (row)
Reader object
The Reader object refers to the object returned by the DictReader instance and the reader () function. Let's take a look at its exposed properties and methods.
Next ()
Returns the next row of the iterable object of reader, and the return value may be a list or dictionary.
Dialect
Dialect description, read-only, for use by parsers.
Line_num
The number of rows that the source iterator has read.
Fieldnames
The name of the field, which is a DictReader object property.
Writer object
The Writer object refers to the object returned by the DictWriter instance and the writer () function. Let's take a look at its exposed properties and methods.
Writerow (row)
Writes the parameter row to the file object of writer.
Writerows (rows)
Write all the elements in rows* (that is, iterators that can iterate over multiple of the above * row objects) to the file object of writer.
Writeheader ()
In the file object of writer, write a line of field names, which is the DictWriter object method.
Dialect
Dialect description, read-only, for writer use.
Write-read append status
'ringing: read
'wicked: write
'asides: append
'rust'= = rroomw (readable and writable, IOError if the file does not exist)
'Wendy'= = Whiter (readable and writable, create if the file does not exist)
'await'= awrr (appendable and writable, create a file if it doesn't exist)
Correspondingly, if it is a binary file, just add a b:
'rb' 'wb'' ab' 'rb+'' wb+' 'ab+'
This is the end of the content of "how to operate Excel,Word,CSV with Python". Thank you for reading. If you want to know more about the industry, you can follow the website, the editor will output more high-quality practical articles for you!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.