
Optical Character Recognition in Java

2026-02-13 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article introduces optical character recognition (OCR) in Java. Text often reaches us as an image rather than as characters, so let's walk through how to handle that case with an existing OCR engine.

1.1 Introduction

Developing symbols that carry meaning is a uniquely human trait. Recognizing these symbols and understanding the words in a picture comes naturally to us. Unlike computers, which have to extract the characters, we read them purely on visual instinct.

Computers, on the other hand, need concrete and organized input. They need a digital representation, not a graphical one.

Sometimes a digital version simply isn't available, and sometimes we want to automate the task of retyping text from an image by hand.

For these tasks, optical character recognition (OCR) was designed to let computers "read" graphical content as text, similar to the way humans do. Although these systems are relatively accurate, their output can still deviate considerably from the original. Even so, fixing the system's mistakes is much easier and faster than retyping everything from scratch.

Like similar systems, OCR software is trained on prepared data sets that provide enough examples to learn the differences between characters. How these programs learn matters a great deal if we want more accurate results, but that is a topic for another article.

Instead of reinventing the wheel or coming up with a very complex (but useful) solution, let's sit down and look at an existing one.

1.2 Tesseract

Tesseract is an open-source OCR engine with a history of several decades. It was originally developed at Hewlett-Packard, and Google later sponsored its development. Wrappers exist for many languages, but we will focus on its Java API, Tess4J.

It's easy to get simple functionality working with Tesseract. It is mainly suited to reading computer-generated text on black-and-white images, where its accuracy is good. That, however, is not real-world text.

For real-world text, we are better off with more advanced OCR software such as Google Vision, which will be discussed in another article.

1.2.1 Maven dependency

We simply need to add a dependency to introduce the engine into our project:

    <dependency>
        <groupId>net.sourceforge.tess4j</groupId>
        <artifactId>tess4j</artifactId>
        <version>3.2.1</version>
    </dependency>

1.2.2 Optical character recognition

Using Tesseract is effortless:

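    // Requires java.io.File and net.sourceforge.tess4j.Tesseract;
    // doOCR() can throw net.sourceforge.tess4j.TesseractException.
    Tesseract tesseract = new Tesseract();
    tesseract.setDatapath("E://DataScience//tessdata");
    System.out.println(tesseract.doOCR(new File("...")));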

We first create a Tesseract instance, and then set the data path to the folder that holds the trained LSTM (long short-term memory) models.

The data can be downloaded from the official GitHub account.
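A wrong data path is an easy mistake to make, so it can help to check the folder before starting OCR. The sketch below is only an illustration: `TessdataCheck` and `hasTrainedData` are hypothetical names, not part of Tess4J. It assumes the trained data files follow the `<language>.traineddata` naming used in the Tesseract data repositories.

```java
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not part of Tess4J): checks a tessdata folder before OCR starts. */
class TessdataCheck {

    /**
     * Returns true when dataPath is a directory that contains
     * "<language>.traineddata", for example "eng.traineddata".
     */
    static boolean hasTrainedData(Path dataPath, String language) {
        if (dataPath == null || language == null || language.isEmpty()) {
            return false;
        }
        return Files.isDirectory(dataPath)
                && Files.isRegularFile(dataPath.resolve(language + ".traineddata"));
    }
}
```

Calling something like `TessdataCheck.hasTrainedData(Paths.get("E://DataScience//tessdata"), "eng")` before `setDatapath()` would let the application report a clear message instead of failing inside the engine.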

Then we call the doOCR() method, which takes a File argument and returns a String: the extracted content.
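Since the call takes a file and returns a string, it is straightforward to wrap it so that unreadable files and OCR failures are handled in one place. The following is a minimal sketch under stated assumptions: `SafeOcr`, `OcrEngine` and `extractText` are hypothetical names introduced here for illustration, and the engine is abstracted behind a small interface so that the example runs without Tess4J on the classpath.

```java
import java.io.File;
import java.util.Optional;

/** Hypothetical wrapper (not part of Tess4J) around a "file in, text out" OCR call. */
class SafeOcr {

    /** Assumed shape of an OCR call: takes an image file, returns the extracted text. */
    interface OcrEngine {
        String recognize(File image) throws Exception;
    }

    /** Runs OCR only on readable files and returns an empty Optional on any failure. */
    static Optional<String> extractText(OcrEngine engine, File image) {
        if (image == null || !image.isFile() || !image.canRead()) {
            return Optional.empty();
        }
        try {
            // Trim the trailing whitespace that OCR output often carries.
            return Optional.ofNullable(engine.recognize(image)).map(String::trim);
        } catch (Exception e) {
            System.err.println("OCR failed for " + image + ": " + e.getMessage());
            return Optional.empty();
        }
    }
}
```

With Tess4J available, the method reference `tesseract::doOCR` should fit the `OcrEngine` interface, so a call could look like `SafeOcr.extractText(tesseract::doOCR, new File("..."))`.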

Let's give it a white background picture with large and clear black characters:

Providing such a picture will get perfect results:

Optical Character Recognition in Java is made easy with the help of Tesseract'

But this picture is very easy to scan. It has been normalized, and it has high resolution and consistent fonts.
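To give an idea of what normalizing an image can mean, here is a minimal sketch that turns a picture into pure black and white with a fixed brightness threshold. This step is not part of the original example: `Binarizer` is a hypothetical helper, and both the threshold value and the luminance weights are assumptions chosen for illustration.

```java
import java.awt.image.BufferedImage;

/** Hypothetical preprocessing helper; not part of Tess4J or the original example. */
class Binarizer {

    /** Converts an image to pure black and white using a luminance threshold (0-255). */
    static BufferedImage binarize(BufferedImage source, int threshold) {
        int width = source.getWidth();
        int height = source.getHeight();
        BufferedImage result = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int rgb = source.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer form of the common 0.299 / 0.587 / 0.114 luminance weights.
                int luminance = (299 * r + 587 * g + 114 * b) / 1000;
                // Bright pixels become white, dark pixels become black.
                result.setRGB(x, y, luminance >= threshold ? 0xFFFFFFFF : 0xFF000000);
            }
        }
        return result;
    }
}
```

An image loaded with `ImageIO.read(...)` could be passed through `Binarizer.binarize(image, 128)` and then handed to the overload of doOCR() that accepts a BufferedImage. A fixed threshold is the simplest possible choice and will not suit unevenly lit photos.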

Let's try to handwrite some characters on paper and provide the picture to the application. What will happen?

We can see the change in the result immediately:

A411 ", written texz: is different {mm compatar generated but

Some words are recognized accurately, and you can easily make out "written text is different from computer generated", but the first and last words come out quite different from what was written.
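If we want to put a number on how far a result is from the text we expected, one common choice is the edit distance between the two strings. The sketch below is an illustration rather than anything Tesseract provides: `OcrAccuracy` is a hypothetical class that computes the Levenshtein distance and a character error rate derived from it.

```java
/** Hypothetical helper for comparing OCR output against a known reference text. */
class OcrAccuracy {

    /** Levenshtein distance: fewest single-character insertions, deletions and substitutions. */
    static int editDistance(String a, String b) {
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j; // turning an empty prefix into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            current[0] = i; // turning a[0..i) into an empty string takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                current[j] = Math.min(
                        Math.min(current[j - 1] + 1, previous[j] + 1),
                        previous[j - 1] + cost);
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    /** Character error rate: edit distance divided by the length of the reference text. */
    static double characterErrorRate(String reference, String recognized) {
        if (reference.isEmpty()) {
            return recognized.isEmpty() ? 0.0 : 1.0;
        }
        return (double) editDistance(reference, recognized) / reference.length();
    }
}
```

For instance, `OcrAccuracy.editDistance("text", "texz:")` counts one substitution and one insertion. A rate of 0.0 means the recognized text matches the reference exactly.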

Now, to make the program easier to use, let's convert it into a very simple Spring Boot application, showing the results with a more comfortable graphical interface.

1.3 Implementation

1.3.1 Spring Boot application

First, let's create our project with Spring Initializr and include the spring-boot-starter-web and spring-boot-starter-thymeleaf dependencies. Then we manually add the same Tess4J dependency as in section 1.2.1.

1.3.2 Controller

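The application needs only one controller, which serves the two display pages and handles image upload and optical character recognition:

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    import net.sourceforge.tess4j.Tesseract;
    import net.sourceforge.tess4j.TesseractException;

    import org.springframework.stereotype.Controller;
    import org.springframework.ui.Model;
    import org.springframework.web.bind.annotation.RequestMapping;
    import org.springframework.web.bind.annotation.RequestMethod;
    import org.springframework.web.bind.annotation.RequestParam;
    import org.springframework.web.multipart.MultipartFile;
    import org.springframework.web.servlet.mvc.support.RedirectAttributes;
    import org.springframework.web.servlet.view.RedirectView;

    @Controller
    public class FileUploadController {

        // Shows the upload form.
        @RequestMapping("/")
        public String index() {
            return "upload";
        }

        // Stores the uploaded image, runs OCR on it and redirects to the result page.
        @RequestMapping(value = "/upload", method = RequestMethod.POST)
        public RedirectView singleFileUpload(@RequestParam("file") MultipartFile file,
                                             RedirectAttributes redirectAttributes,
                                             Model model) throws IOException, TesseractException {
            // Save a copy under the static resources so the result page can display it.
            byte[] bytes = file.getBytes();
            Path path = Paths.get("E://simpleocr//src//main//resources//static//" + file.getOriginalFilename());
            Files.write(path, bytes);

            // Tesseract needs a regular File, not a MultipartFile.
            File convFile = convert(file);

            Tesseract tesseract = new Tesseract();
            tesseract.setDatapath("E://DataScience//tessdata");
            String text = tesseract.doOCR(convFile);

            // Flash attributes survive the redirect to the result page.
            redirectAttributes.addFlashAttribute("file", file);
            redirectAttributes.addFlashAttribute("text", text);
            return new RedirectView("result");
        }

        // Shows the extracted text and the uploaded image.
        @RequestMapping("/result")
        public String result() {
            return "result";
        }

        // Copies the uploaded bytes into a regular File in the working directory.
        public static File convert(MultipartFile file) throws IOException {
            File convFile = new File(file.getOriginalFilename());
            convFile.createNewFile();
            try (FileOutputStream fos = new FileOutputStream(convFile)) {
                fos.write(file.getBytes());
            }
            return convFile;
        }
    }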

Tesseract works with Java's File class, but it does not accept the MultipartFile objects produced by a form upload. To bridge the gap, we added a simple convert() method that turns a MultipartFile into a regular File.

Once Tesseract has extracted the text, we add it, along with the uploaded image, as flash attributes on the redirect, so that both are available to the result page.
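The convert() method writes into the working directory under the name supplied by the client. As an alternative, here is a sketch that writes the uploaded bytes to a temporary file and keeps only a short alphanumeric extension from the original name. `UploadFiles` and `toTempFile` are hypothetical names, and the method takes a plain byte array (for example the result of `file.getBytes()`) so that the example runs without Spring.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical alternative to convert(): stores an upload in a temporary file. */
class UploadFiles {

    /**
     * Writes the uploaded bytes to a fresh temp file. Only a short alphanumeric
     * extension is taken from the client-supplied name; the rest of it is ignored.
     */
    static File toTempFile(byte[] content, String originalName) throws IOException {
        String suffix = ".tmp";
        if (originalName != null) {
            int dot = originalName.lastIndexOf('.');
            if (dot >= 0 && dot < originalName.length() - 1) {
                String extension = originalName.substring(dot + 1);
                if (extension.length() <= 5
                        && extension.chars().allMatch(Character::isLetterOrDigit)) {
                    suffix = "." + extension;
                }
            }
        }
        Path temp = Files.createTempFile("ocr-upload-", suffix);
        Files.write(temp, content);
        return temp.toFile();
    }
}
```

The extension is kept because image loaders may rely on it to pick a decoder. The caller remains responsible for deleting the temp file once OCR has finished.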

1.3.3 Display pages

Now, let's define a display page that contains a simple file upload form:

Upload a file for OCR:

And a results page:

Extracted Content:

From the image:

Running the application greets us with a simple interactive page:

Add an image and submit it, and the result page will show the extracted text together with the uploaded image:

Success!

1.4 Conclusion

Using the Tesseract engine, we built a very simple application that accepts an image submitted through a form, extracts the text from it, and returns the result to us along with the image.

Since we use only a small part of Tesseract's capabilities, this is not a particularly useful application, and it is too simple for anything beyond demonstration purposes. It is, however, an interesting tool to implement and test.

