How to Use Python to Build an OCR Tool for Image Text Recognition

2025-07-13 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article introduces how to build an OCR tool for image text recognition with Python. Many people run into questions when they try this in practice, so the editor went through various materials and put together a simple, easy-to-use approach. Hopefully it answers those questions. Let's walk through it.

Introduction

Recently, the technical discussion group talked about the need for image text recognition, which comes up often in work and daily life: extracting text from bills, comics, scanned documents, and photos.

The blogger wrote a desktop OCR tool based on PyQt + PaddleOCR that automatically detects and recognizes text in images.

The recognition effect is shown in the following figure:

All boxed regions are detected automatically by the OCR algorithm, and the list on the right shows the text content corresponding to each box.

Select a text record in the recognition results on the right, then click Copy to Clipboard to copy its text.

Feature list

Text region detection + text recognition

Text area visualization

Text content list

Image and folder loading

Zoom the image view with the scroll wheel

Draw and edit regions

Copy text recognition results

OCR part

Image text detection and character recognition are implemented mainly with PaddleOCR.

Create or select a virtual environment and install the third-party libraries you need.

    conda create -n ocr
    conda activate ocr

① Install the framework

If you do not have an NVIDIA GPU, or your GPU does not support CUDA, install the CPU version:

    # CPU version
    pip install paddlepaddle==2.1.0 -i https://mirror.baidu.com/pypi/simple

If your machine has CUDA 9 or CUDA 10 with cuDNN 7.6 installed, you can choose the GPU version instead:

    # GPU version
    python3 -m pip install paddlepaddle-gpu==2.1.0 -i https://mirror.baidu.com/pypi/simple

② Install PaddleOCR

Install paddleocr:

    pip install "paddleocr>=2.0.1"  # version 2.0.1+ is recommended

For layout analysis, you also need to install Layout-Parser:

    pip3 install -U https://paddleocr.bj.bcebos.com/whl/layoutparser-0.0.0-py3-none-any.whl

③ Test whether the installation succeeded

After installation, test an image by passing --image_dir ./imgs/11.jpg. This runs the full Chinese and English pipeline: detection + direction classifier + recognition:

    paddleocr --image_dir ./imgs/11.jpg --use_angle_cls true --use_gpu false

The output is a list:

④ Call it from Python

    from paddleocr import PaddleOCR, draw_ocr

    # The languages PaddleOCR currently supports can be switched by changing the lang parameter,
    # for example `ch`, `en`, `fr`, `german`, `korean`, `japan`
    ocr = PaddleOCR(use_angle_cls=True, lang="ch")  # need to run only once to download and load model into memory
    img_path = './imgs/11.jpg'
    result = ocr.ocr(img_path, cls=True)
    for line in result:
        print(line)

The output is a list, and each item contains a text box (four corner points), the recognized text, and the recognition confidence:

    [[[24.0, 36.0], [304.0, …], [304.0, …], [24.0, 74.0]], ['Pure nutritional conditioner', 0.964739]]

    [[[24.0, 80.0], [172.0, …], [172.0, 104.0], [24.0, 104.0]], ['Product Information / parameters', 0.98069626]]

    [[[…, 109.0], [333.0, 109.0], [333.0, 136.0], [24.0, 136.0]], ['(45 yuan / kg, from 100kg)', 0.9676722]]

...
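To make the result structure concrete, here is a minimal sketch of post-processing it in plain Python. It assumes the flat layout shown above, one `[box, [text, confidence]]` entry per detected line (some later PaddleOCR releases may wrap these entries in one more list per image). The sample values, the `parse_result` and `join_text` helpers, and the 0.5 threshold are all illustrative assumptions, not part of the original tool.

```python
# Sample data in the shape shown above: [box, [text, confidence]].
# The values are illustrative placeholders, not real OCR output.
sample_result = [
    [[[10.0, 10.0], [110.0, 10.0], [110.0, 40.0], [10.0, 40.0]], ["Hello", 0.98]],
    [[[10.0, 50.0], [90.0, 52.0], [90.0, 80.0], [10.0, 78.0]], ["wor1d", 0.41]],
    [[[10.0, 90.0], [150.0, 90.0], [150.0, 120.0], [10.0, 120.0]], ["OCR demo", 0.87]],
]


def parse_result(result, min_confidence=0.5):
    """Turn raw result entries into dicts and drop low-confidence ones."""
    items = []
    for box, (text, confidence) in result:
        if confidence < min_confidence:
            continue
        xs = [point[0] for point in box]
        ys = [point[1] for point in box]
        items.append({
            "text": text,
            "confidence": confidence,
            # Axis-aligned rectangle (left, top, right, bottom) enclosing the four corners.
            "rect": (min(xs), min(ys), max(xs), max(ys)),
        })
    return items


def join_text(items):
    """Concatenate the recognized text, one region per line."""
    return "\n".join(item["text"] for item in items)


items = parse_result(sample_result)
print(join_text(items))
```

A string built this way is the kind of value a "copy all" or "save to file" action could work with; the enclosing rectangle is a convenient simplification when a region only needs to be highlighted rather than drawn as an exact quadrilateral.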

Interface part

The interface is implemented with PyQt5.

For an introduction to PyQt GUI development and environment configuration, see this blog for details.

Main steps:

1. Interface layout design

Drag and drop controls in Qt Designer to lay out the program interface, and save the result as a *.ui file.

2. Use pyuic to generate the interface code automatically

Find the *.ui file in the PyCharm project tree, right-click it, and choose External Tools > pyuic. The Python code for the interface is generated automatically in the same directory as the .ui file.

3. Write interface business classes

The business class MainWindow implements the program logic and algorithm functions. It is decoupled from the UI code generated in step 2, so that regenerating that code after each change to the .ui file does not affect the business code. Controls on the interface are accessed through self._ui.xxxObjectName.

    from PyQt5.QtWidgets import QMainWindow, QButtonGroup

    # Ui_MainWindow (generated by pyuic), get_config and __appname__ come from the project's own modules

    class MainWindow(QMainWindow):
        FIT_WINDOW, FIT_WIDTH, MANUAL_ZOOM = 0, 1, 2

        def __init__(self):
            super().__init__()  # call parent constructor, create QWidget form
            self._ui = Ui_MainWindow()  # create ui object
            self._ui.setupUi(self)  # construct ui
            self.setWindowTitle(__appname__)

            # load default configuration
            config = get_config()
            self._config = config

            # Radio button group
            self.checkBtnGroup = QButtonGroup(self)
            self.checkBtnGroup.addButton(self._ui.checkBox_ocr)
            self.checkBtnGroup.addButton(self._ui.checkBox_det)
            self.checkBtnGroup.addButton(self._ui.checkBox_recog)
            self.checkBtnGroup.addButton(self._ui.checkBox_layoutparser)
            self.checkBtnGroup.setExclusive(True)
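The class declares three zoom modes, FIT_WINDOW, FIT_WIDTH and MANUAL_ZOOM, but the article does not show how they are used. The following is only a guess at how such modes could map to a display scale, written as plain Python so it runs without Qt. The function names `compute_scale` and `wheel_zoom`, the 1.1 zoom factor, and the scale limits are assumptions made for illustration.

```python
FIT_WINDOW, FIT_WIDTH, MANUAL_ZOOM = 0, 1, 2


def compute_scale(mode, image_size, view_size, manual_scale=1.0):
    """Return the display scale for an image under the given zoom mode."""
    image_w, image_h = image_size
    view_w, view_h = view_size
    if mode == FIT_WIDTH:
        # Image width fills the view; height may overflow and scroll.
        return view_w / image_w
    if mode == FIT_WINDOW:
        # Whole image visible: the tighter of the two ratios wins.
        return min(view_w / image_w, view_h / image_h)
    # MANUAL_ZOOM: the user-controlled scale is used as-is.
    return manual_scale


def wheel_zoom(current_scale, wheel_steps, factor=1.1, lo=0.1, hi=10.0):
    """Multiply the scale by `factor` per wheel step and clamp it to [lo, hi]."""
    new_scale = current_scale * (factor ** wheel_steps)
    return max(lo, min(hi, new_scale))


print(compute_scale(FIT_WINDOW, (800, 400), (400, 100)))
print(compute_scale(FIT_WIDTH, (800, 400), (400, 100)))
```

In a design like this, a wheel event would typically switch the mode to MANUAL_ZOOM and store the value returned by `wheel_zoom`, while resizing the window would recompute the scale only in the two fit modes.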

4. Implement interface business logic

Connect signals to slots for the buttons, lists, and drawing controls on the main interface. Custom slot functions need no special declaration. Custom signals must be declared as class attributes, yourSignal = pyqtSignal(args), before the class's __init__().

Here the button and list response functions serve as examples. The signal emitted when a button is clicked is clicked, and the signal emitted when the selection in a listWidget changes is itemSelectionChanged.

    # Button response functions
    self._ui.btnOpenImg.clicked.connect(self.openFile)
    self._ui.btnOpenDir.clicked.connect(self.openDirDialog)
    self._ui.btnNext.clicked.connect(self.openNextImg)
    self._ui.btnPrev.clicked.connect(self.openPrevImg)
    self._ui.btnStartProcess.clicked.connect(self.startProcess)
    self._ui.btnCopyAll.clicked.connect(self.copyToClipboard)
    self._ui.btnSaveAll.clicked.connect(self.saveToFile)

    # List response function
    self._ui.listWidgetResults.itemSelectionChanged.connect(self.onItemResultClicked)
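The bodies of the connected handlers, such as openDirDialog, openNextImg and openPrevImg, are not shown in the article. As a rough sketch of the logic they might delegate to, the folder-loading and next/previous navigation can be modeled without any Qt code. The names `list_images` and `ImageNavigator` and the set of file suffixes are hypothetical, chosen only for this example.

```python
from pathlib import Path

# Assumed set of image extensions; the real tool may accept others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def list_images(folder):
    """Return the image files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


class ImageNavigator:
    """Keep track of the current image in a list of paths."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.index = 0 if self.paths else -1

    def current(self):
        return self.paths[self.index] if self.paths else None

    def next(self):
        # Stay on the last image instead of wrapping around.
        if self.paths and self.index < len(self.paths) - 1:
            self.index += 1
        return self.current()

    def prev(self):
        # Stay on the first image instead of wrapping around.
        if self.paths and self.index > 0:
            self.index -= 1
        return self.current()
```

With this split, a slot like openNextImg would only need to call `next()` and hand the returned path to whatever code displays the image, which keeps the navigation logic testable without starting the GUI.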

5. Run it to see the effect.

Run python main.py to start the GUI program.

Open an image → select the language model ch (Chinese) → select text detection + recognition → click Start. The detected text regions are boxed automatically and listed on the text tab on the right.

All regions where text was detected are listed on the "recognition result - area" tab:

That concludes this walkthrough of how to build an OCR tool for image text recognition in Python. Combining theory with practice is the best way to learn, so go and try it yourself.
