
How to use Java to read text and pictures in a Word table

2025-01-18 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 05/31

This article explains how to use Java to read the text and pictures in a Word table. It walks through a working example step by step; the method is simple, fast, and practical.

1. Program environment preparation

IDE: IntelliJ IDEA

JDK version: 1.8.0

Test document: Word .docx (2013)

Jar package: spire.doc.jar 3.9.0 (Free Spire.Doc for Java)

The Word document used for testing is as follows:

Jar import methods:

Method 1: manual import.

Open the Project Structure dialog (Shift+Ctrl+Alt+S), select [Modules]-[Dependencies], click "+", choose [JARs or directories...], select the jar package from its local path, add it, tick its checkbox, and click "OK" or "Apply" to import the jar.
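After a manual import, one way to confirm that the jar is actually visible to the module is to try loading a class from it. The following is a minimal sketch, not part of the original article: `ClasspathCheck` and `isClassAvailable` are hypothetical names, and `com.spire.doc.Document` is used only because the article's code imports it.

```java
public class ClasspathCheck {

    // Returns true if the named class can be located by this class's class loader.
    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class imported by the article's code; the check itself works for any class name
        String spireClass = "com.spire.doc.Document";
        if (isClassAvailable(spireClass)) {
            System.out.println(spireClass + " is on the classpath.");
        } else {
            System.out.println(spireClass + " was not found; check the module dependencies.");
        }
    }
}
```

Running this class from the same module shows whether the import took effect before any document code is written.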

Method 2: Maven repository import.

Configure the e-iceblue Maven repository in the pom.xml file and declare the dependency on Free Spire.Doc for Java 3.9.0; Maven then downloads and imports the jar. The configuration is as follows:

    <repositories>
        <repository>
            <id>com.e-iceblue</id>
            <url>http://repo.e-iceblue.cn/repository/maven-public/</url>
        </repository>
    </repositories>
    <dependencies>
        <dependency>
            <groupId>e-iceblue</groupId>
            <artifactId>spire.doc.free</artifactId>
            <version>3.9.0</version>
        </dependency>
    </dependencies>

2. Java Code

    import com.spire.doc.*;
    import com.spire.doc.documents.Paragraph;
    import com.spire.doc.fields.DocPicture;
    import com.spire.doc.interfaces.ITable;

    import javax.imageio.ImageIO;
    import java.awt.image.RenderedImage;
    import java.io.BufferedWriter;
    import java.io.File;
    import java.io.FileWriter;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    public class GetTable {
        public static void main(String[] args) throws IOException {
            // Load the Word test document
            Document doc = new Document();
            doc.loadFromFile("inputfile.docx");

            // Get the first section
            Section section = doc.getSections().get(0);

            // Get the first table
            ITable table = section.getTables().get(0);

            // Create a txt file (for writing the text extracted from the table)
            String output = "ReadTextFromTable.txt";
            File textfile = new File(output);
            if (textfile.exists()) {
                textfile.delete();
            }
            textfile.createNewFile();
            FileWriter fw = new FileWriter(textfile, true);
            BufferedWriter bw = new BufferedWriter(fw);

            // Create a List to hold the extracted images
            List images = new ArrayList();

            // Traverse the rows in the table
            for (int i = 0; i < table.getRows().getCount(); i++) {
                TableRow row = table.getRows().get(i);
                // Traverse the cells in each row
                for (int j = 0; j < row.getCells().getCount(); j++) {
                    TableCell cell = row.getCells().get(j);
                    // Traverse the paragraphs in the cell
                    for (int k = 0; k < cell.getParagraphs().getCount(); k++) {
                        Paragraph paragraph = cell.getParagraphs().get(k);
                        // Write the paragraph text, followed by a tab
                        bw.write(paragraph.getText() + "\t");

                        // Traverse all child objects in the paragraph
                        for (int x = 0; x < paragraph.getChildObjects().getCount(); x++) {
                            Object object = paragraph.getChildObjects().get(x);
                            // Determine whether the object is a picture
                            if (object instanceof DocPicture) {
                                // Get the image
                                DocPicture picture = (DocPicture) object;
                                images.add(picture.getImage());
                            }
                        }
                    }
                }
                // End the line for this row in the txt file
                bw.write("\r\n");
            }
            bw.flush();
            bw.close();
            fw.close();

            // Save the pictures in PNG format
            for (int z = 0; z < images.size(); z++) {
                File imagefile = new File(String.format("extracted form picture-%d.png", z));
                ImageIO.write((RenderedImage) images.get(z), "PNG", imagefile);
            }
        }
    }

3. Reading results for text and pictures

After editing the code, run the program to read the text and pictures in the table. The file paths in the code are relative to the IDEA project folder, for example:

C:\Users\Administrator\IdeaProjects\Table_Doc\ReadTextFromTable.txt

C:\Users\Administrator\IdeaProjects\Table_Doc\extracted form picture-0.png

C:\Users\Administrator\IdeaProjects\Table_Doc\inputfile.docx

The file paths in the code can be changed to any other location.
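As an illustration of writing to a custom location, here is a minimal sketch that uses only the standard library. It is not the article's code: the class `OutputPaths`, its methods, the `output` directory, and the sample rows are all hypothetical, and a plain list of strings stands in for the text that would come from the table cells. It also uses try-with-resources, so the writer is closed even if a write fails.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

public class OutputPaths {

    // Resolve a file name against a custom output directory, creating the directory if needed.
    static Path resolveOutput(Path outputDir, String fileName) throws IOException {
        Files.createDirectories(outputDir);
        return outputDir.resolve(fileName);
    }

    // Write one line per row: each cell text followed by a tab, each row ended with CRLF.
    static void writeRows(Path target, List<List<String>> rows) throws IOException {
        // newBufferedWriter creates the file or truncates an existing one;
        // try-with-resources closes the writer automatically.
        try (BufferedWriter bw = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (List<String> row : rows) {
                for (String cellText : row) {
                    bw.write(cellText + "\t");
                }
                bw.write("\r\n");
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path outputDir = Paths.get("output"); // hypothetical directory, relative to the working directory
        Path textFile = resolveOutput(outputDir, "ReadTextFromTable.txt");

        // Sample data standing in for text extracted from table cells
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("Name", "Photo"),
                Arrays.asList("Sample", ""));

        writeRows(textFile, rows);
        System.out.println("Wrote " + textFile.toAbsolutePath());
    }
}
```

Writing with an explicit UTF-8 charset avoids depending on the platform default encoding, which can matter when the table contains non-ASCII text.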

The result of reading text data:

The result of reading the picture:
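To confirm that a saved picture is a valid PNG, the file can be read back and its dimensions checked. The sketch below is an illustration under stated assumptions rather than part of the original program: `PngRoundTrip`, `savePng`, and `readSize` are hypothetical names, and a blank `BufferedImage` stands in for an image extracted from the table.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

public class PngRoundTrip {

    // Save an image as PNG. ImageIO.write returns false if no suitable writer is found.
    static boolean savePng(BufferedImage image, File target) throws IOException {
        return ImageIO.write(image, "PNG", target);
    }

    // Read the file back and return {width, height}, or null if it cannot be decoded as an image.
    static int[] readSize(File file) throws IOException {
        BufferedImage img = ImageIO.read(file);
        return img == null ? null : new int[] {img.getWidth(), img.getHeight()};
    }

    public static void main(String[] args) throws IOException {
        // Blank image standing in for a picture extracted from the Word table
        BufferedImage sample = new BufferedImage(40, 30, BufferedImage.TYPE_INT_RGB);

        File out = File.createTempFile("extracted-picture-", ".png");
        out.deleteOnExit();

        if (!savePng(sample, out)) {
            System.out.println("No PNG writer available.");
            return;
        }
        int[] size = readSize(out);
        if (size == null) {
            System.out.println("Saved file could not be decoded as an image.");
        } else {
            System.out.println("Saved PNG is " + size[0] + " x " + size[1] + " pixels.");
        }
    }
}
```

Checking the return value of `ImageIO.write` is worthwhile because the method signals a missing writer by returning false rather than throwing an exception.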

This concludes the walkthrough of how to use Java to read text and pictures in Word tables.
