How to use Java to read a Word document containing tables

2025-01-18 Update From: SLTechnology News&Howtos

This article shows how to use Java to read a Word document that contains tables. The walkthrough is short and easy to follow, and I hope you take something useful away from it.

Business requirements

Our requirement is to extract the content of a Word document, assemble it into a specific JSON format, and send it to a third-party engine interface. The input protocol is as follows:

{
  "tables": [
    {
      "cells": [
        {"col": 1, "row_span": 1, "row": 1, "col_span": 1, "content": "vehicle name"}
      ],
      "id": 0,
      "row_num": 2
    }
  ],
  "paragraps": [
    {"para_id": 1, "content": "Hello, Java Daily Records"}
  ]
}

This input format means we have to read the Word content in two separate groups: paragraphs and tables. With the requirements settled, let's start writing code.
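As a rough illustration of the target format, the sketch below assembles the payload from already-extracted cells and paragraphs using only the JDK. The class and method names (`Cell`, `PayloadBuilder`, `tableJson`, `paragraphJson`, `payload`) are hypothetical and not from the original article. The key `paragraps` is spelled exactly as it appears in the protocol above. A real project would more likely use a JSON library, and the hand-written escaping here only covers the common cases.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder mirroring the fields of one "cells" entry in the protocol.
class Cell {
    final int row, col, rowSpan, colSpan;
    final String content;
    Cell(int row, int col, int rowSpan, int colSpan, String content) {
        this.row = row; this.col = col;
        this.rowSpan = rowSpan; this.colSpan = colSpan;
        this.content = content;
    }
}

class PayloadBuilder {
    // Escape characters that would otherwise break a JSON string literal.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    // Other control characters become \\uXXXX escapes.
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    static String cellJson(Cell c) {
        return "{\"col\": " + c.col + ", \"row_span\": " + c.rowSpan
             + ", \"row\": " + c.row + ", \"col_span\": " + c.colSpan
             + ", \"content\": " + quote(c.content) + "}";
    }

    static String tableJson(int id, int rowNum, List<Cell> cells) {
        List<String> parts = new ArrayList<>();
        for (Cell c : cells) parts.add(cellJson(c));
        return "{\"cells\": [" + String.join(", ", parts) + "], \"id\": " + id
             + ", \"row_num\": " + rowNum + "}";
    }

    static String paragraphJson(int paraId, String content) {
        return "{\"para_id\": " + paraId + ", \"content\": " + quote(content) + "}";
    }

    // Key names, including "paragraps", follow the protocol as given.
    static String payload(List<String> tables, List<String> paragraphs) {
        return "{\"tables\": [" + String.join(", ", tables)
             + "], \"paragraps\": [" + String.join(", ", paragraphs) + "]}";
    }

    public static void main(String[] args) {
        String table = tableJson(0, 2, List.of(new Cell(1, 1, 1, 1, "vehicle name")));
        String para = paragraphJson(1, "Hello, Java Daily Records");
        System.out.println(payload(List.of(table), List.of(para)));
    }
}
```

Running `main` prints the sample payload from the protocol on a single line. Keeping serialization separate from extraction means the same builder can be fed by either the docx or the doc reading path.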

Implementation based on POI

Search Baidu for "how to read Word in Java" and nearly every answer says to use POI. POI can certainly extract content by paragraph and by table and assemble it into the format above, but two problems come up in practice:

We need to handle both the docx and doc formats. POI uses different APIs to read docx and doc, so the reading logic has to be written twice.

POI also returns table contents when reading paragraphs from a doc file. POI has a separate method for reading all the tables in a document, but iterating over the paragraphs of a doc-format file still returns the text inside tables, so we need to exclude the table paragraphs as follows:

// read doc
HWPFDocument doc = new HWPFDocument(stream);
Range range = doc.getRange();
// read paragraphs
int num = range.numParagraphs();
Paragraph para;
for (int i = 0; i ...
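The exclusion step described above can be sketched in isolation, without the POI dependency. In POI's HWPF API, `Range.getParagraph(i)` returns a `Paragraph`, and `Paragraph.isInTable()` reports whether that paragraph belongs to a table. The sketch below models this with a hypothetical `Para` stand-in class so that only the filtering logic is shown. It is not the article's original listing, and trimming and skipping blank paragraphs is an added assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for org.apache.poi.hwpf.usermodel.Paragraph:
// text() and isInTable() mirror the POI methods of the same names.
class Para {
    private final String text;
    private final boolean inTable;
    Para(String text, boolean inTable) { this.text = text; this.inTable = inTable; }
    String text() { return text; }
    boolean isInTable() { return inTable; }
}

class ParagraphFilter {
    // Keep only body paragraphs: skip anything inside a table,
    // and (as an assumption) drop paragraphs that are blank after trimming.
    static List<String> bodyText(List<Para> paragraphs) {
        List<String> out = new ArrayList<>();
        for (Para p : paragraphs) {
            if (p.isInTable()) continue;      // table text is collected separately
            String t = p.text().trim();       // trim() also strips trailing control chars
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample strings are invented for illustration.
        List<Para> doc = List.of(
            new Para("Hello, Java Daily Records\r", false),
            new Para("vehicle name\u0007", true),
            new Para("\r", false));
        System.out.println(bodyText(doc)); // prints [Hello, Java Daily Records]
    }
}
```

With real POI objects the loop body would stay the same, iterating `range.getParagraph(i)` for `i` from 0 up to `range.numParagraphs()`.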
